for-6.5/block-2023-06-23

-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmSV8dwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpilGD/9Yys1oxIXJpRf00fzrylAlBthRxMjFQVWw
 zAut106hAQiBHvU8IkmGA3MvEFVHxtzwYhHI7IR8K3aZBIqscweCqmVI9JyogJw9
 U9Twnzel47VmuKdM94FeoN+hbj1fP8EWTjzmy67/zEEfFCdmHvNlMi3lSrGYIpFy
 39LxTB99Y4UarM5PtWbes37GYYljzMSWKuo4AfBkvq1eQa+sZ0Vq2xAABKq3UM7f
 apqhgHtkJooRePDP0eQp+kAyyVMgW2jIK+oIdJDxNF3CKTu2w40RzaYz6fp+jVSU
 H4R/xS59GW4/xql+VBJDh/qJg9K62DPPYjlW8BmSR8+IjvfFpsyH3/MacE50CD3P
 20fs/Mnj49H79fDrQEHJI53cOOb2EmUitbwLbvOcColNTPpt8loBtdQxjF2RMU8R
 Nyort9DJPFclYCxky1LYg1CNEC2Ln4Zy/jD47wPvqRmOQphOoVlV/hPnOEqvjaZC
 49Vn70W2DeE9cXvYI7ha+XIg6/oj+Gs3iusEbV08Ci7EAtXgI+ZUUsQ97K8UNiUh
 h2lqSJtuI7lBpYP9sf+BeCch5UCC+xGYyTdoM5f58lehWBBPtbs0g7S9RyRyOYxe
 n+yxEUo3dAGzJ/xsKAjinbZfeWIpr0b1TkAh4w3Cq/BKzRr9Bp8lBAxYuancbQ+Y
 1ADPteUOTA==
 =zP4Y
 -----END PGP SIGNATURE-----

Merge tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux

Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
      - Various cleanups all around (Irvin, Chaitanya, Christophe)
      - Better struct packing (Christophe JAILLET)
      - Reduce controller error logs for optional commands (Keith)
      - Support for >=64KiB block sizes (Daniel Gomez)
      - Fabrics fixes and code organization (Max, Chaitanya, Daniel
        Wagner)

 - bcache updates via Coly:
      - Fix a race at init time (Mingzhe Zou)
      - Misc fixes and cleanups (Andrea, Thomas, Zheng, Ye)

 - use page pinning in the block layer for dio (David)

 - convert old block dio code to page pinning (David, Christoph)

 - cleanups for pktcdvd (Andy)

 - cleanups for rnbd (Guoqing)

 - use the unchecked __bio_add_page() for the initial single page
   additions (Johannes)

 - fix overflows in the Amiga partition handling code (Michael)

 - improve mq-deadline zoned device support (Bart)

 - keep passthrough requests out of the IO schedulers (Christoph, Ming)

 - improve support for flush requests, making them less special to deal
   with (Christoph)

 - add bdev holder ops and shutdown methods (Christoph)

 - fix the name_to_dev_t() situation and use cases (Christoph)

 - decouple the block open flags from fmode_t (Christoph)

 - ublk updates and cleanups, including adding user copy support (Ming)

 - BFQ sanity checking (Bart)

 - convert brd from radix to xarray (Pankaj)

 - constify various structures (Thomas, Ivan)

 - more fine grained persistent reservation ioctl capability checks
   (Jingbo)

 - misc fixes and cleanups (Arnd, Azeem, Demi, Ed, Hengqi, Hou, Jan,
   Jordy, Li, Min, Yu, Zhong, Waiman)

* tag 'for-6.5/block-2023-06-23' of git://git.kernel.dk/linux: (266 commits)
  scsi/sg: don't grab scsi host module reference
  ext4: Fix warning in blkdev_put()
  block: don't return -EINVAL for not found names in devt_from_devname
  cdrom: Fix spectre-v1 gadget
  block: Improve kernel-doc headers
  blk-mq: don't insert passthrough request into sw queue
  bsg: make bsg_class a static const structure
  ublk: make ublk_chr_class a static const structure
  aoe: make aoe_class a static const structure
  block/rnbd: make all 'class' structures const
  block: fix the exclusive open mask in disk_scan_partitions
  block: add overflow checks for Amiga partition support
  block: change all __u32 annotations to __be32 in affs_hardblocks.h
  block: fix signed int overflow in Amiga partition support
  block: add capacity validation in bdev_add_partition()
  block: fine-granular CAP_SYS_ADMIN for Persistent Reservation
  block: disallow Persistent Reservation on partitions
  reiserfs: fix blkdev_put() warning from release_journal_dev()
  block: fix wrong mode for blkdev_get_by_dev() from disk_scan_partitions()
  block: document the holder argument to blkdev_get_by_path
  ...
Linus Torvalds 2023-06-26 12:47:20 -07:00
commit a0433f8cae
218 changed files with 4750 additions and 3887 deletions


@ -508,9 +508,6 @@ cache_miss_collisions
cache miss, but raced with a write and data was already present (usually 0
since the synchronization for cache misses was rewritten)
cache_readaheads
Count of times readahead occurred.
Sysfs - cache set
~~~~~~~~~~~~~~~~~


@ -2022,31 +2022,33 @@ that attribute:
no-change
Do not modify the I/O priority class.
none-to-rt
For requests that do not have an I/O priority class (NONE),
change the I/O priority class into RT. Do not modify
the I/O priority class of other requests.
promote-to-rt
For requests that have a non-RT I/O priority class, change it into RT.
Also change the priority level of these requests to 4. Do not modify
the I/O priority of requests that have priority class RT.
restrict-to-be
For requests that do not have an I/O priority class or that have I/O
priority class RT, change it into BE. Do not modify the I/O priority
class of requests that have priority class IDLE.
priority class RT, change it into BE. Also change the priority level
of these requests to 0. Do not modify the I/O priority class of
requests that have priority class IDLE.
idle
Change the I/O priority class of all requests into IDLE, the lowest
I/O priority class.
none-to-rt
Deprecated. Just an alias for promote-to-rt.
The following numerical values are associated with the I/O priority policies:
+-------------+---+
| no-change | 0 |
+-------------+---+
| none-to-rt | 1 |
+-------------+---+
| rt-to-be | 2 |
+-------------+---+
| all-to-idle | 3 |
+-------------+---+
+----------------+---+
| no-change | 0 |
+----------------+---+
| rt-to-be | 2 |
+----------------+---+
| all-to-idle | 3 |
+----------------+---+
The numerical value that corresponds to each I/O priority class is as follows:
@ -2062,9 +2064,13 @@ The numerical value that corresponds to each I/O priority class is as follows:
The algorithm to set the I/O priority class for a request is as follows:
- Translate the I/O priority class policy into a number.
- Change the request I/O priority class into the maximum of the I/O priority
class policy number and the numerical I/O priority class.
- If I/O priority class policy is promote-to-rt, change the request I/O
priority class to IOPRIO_CLASS_RT and change the request I/O priority
level to 4.
- If I/O priority class policy is not promote-to-rt, translate the I/O priority
class policy into a number, then change the request I/O priority class
into the maximum of the I/O priority class policy number and the numerical
I/O priority class.
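For illustration, here is a minimal C sketch of that algorithm; the helper name apply_prio_policy is invented for this example, IOPRIO_PRIO_CLASS(), IOPRIO_PRIO_VALUE() and max_t() are existing kernel helpers, and the POLICY_* values mirror the prio_policy enum touched later in this series. It is an illustrative paraphrase, not the exact blk-ioprio.c code:

/* Illustrative paraphrase of the documented policy algorithm. */
static u16 apply_prio_policy(enum prio_policy policy, u16 ioprio)
{
        if (policy == POLICY_PROMOTE_TO_RT) {
                /* Promote anything that is not already RT to RT, level 4. */
                if (IOPRIO_PRIO_CLASS(ioprio) != IOPRIO_CLASS_RT)
                        return IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 4);
                return ioprio;
        }
        /*
         * For the other policies, higher class numbers mean lower priority,
         * so taking the maximum of the current value and the policy's
         * (class, level 0) can only restrict the priority, never raise it.
         */
        return max_t(u16, ioprio, IOPRIO_PRIO_VALUE(policy, 0));
}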
PID
---


@ -5452,7 +5452,12 @@
port and the regular usb controller gets disabled.
root= [KNL] Root filesystem
See name_to_dev_t comment in init/do_mounts.c.
Usually this is a block device specifier of some kind,
see the early_lookup_bdev comment in
block/early-lookup.c for details.
Alternatively this can be "ram" for the legacy initial
ramdisk, "nfs" and "cifs" for root on a network file
system, or "mtd" and "ubi" for mounting from raw flash.
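For example, "root=/dev/sda2" and "root=PARTUUID=<uuid>" are both block device specifiers of this kind.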
rootdelay= [KNL] Delay (in seconds) to pause before attempting to
mount the root filesystem


@ -112,6 +112,12 @@ pages:
This also leads to limitations: there are only 31-10==21 bits available for a
counter that increments 10 bits at a time.
* Because of that limitation, special handling is applied to the zero pages
when using FOLL_PIN. We only pretend to pin a zero page - we don't alter its
refcount or pincount at all (it is permanent, so there's no need). The
unpinning functions also don't do anything to a zero page. This is
transparent to the caller.
* Callers must specifically request "dma-pinned tracking of pages". In other
words, just calling get_user_pages() will not suffice; a new set of functions,
pin_user_page() and related, must be used.
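As a rough illustration of that calling convention (the helper below is made up, and the flags and error handling are only an example), a pin/unpin cycle looks like:

/* Hypothetical example: pin a user buffer, use it, then unpin it. */
static int example_pin_user_buffer(unsigned long uaddr, int nr_pages,
                                   struct page **pages)
{
        int pinned;

        /* Ask for pinned (FOLL_PIN) pages instead of get_user_pages(). */
        pinned = pin_user_pages_fast(uaddr, nr_pages,
                                     FOLL_WRITE | FOLL_LONGTERM, pages);
        if (pinned <= 0)
                return pinned ? pinned : -EFAULT;

        /* ... DMA to/from the pinned pages happens here ... */

        /* Every successful pin must eventually be paired with an unpin. */
        unpin_user_pages(pages, pinned);
        return 0;
}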


@ -658,7 +658,7 @@ setup_arch(char **cmdline_p)
#endif
/* Default root filesystem to sda2. */
ROOT_DEV = Root_SDA2;
ROOT_DEV = MKDEV(SCSI_DISK0_MAJOR, 2);
#ifdef CONFIG_EISA
/* FIXME: only set this when we actually have EISA in this box? */


@ -627,7 +627,7 @@ setup_arch (char **cmdline_p)
* is physical disk 1 partition 1 and the Linux root disk is
* physical disk 1 partition 2.
*/
ROOT_DEV = Root_SDA2; /* default to second partition on first drive */
ROOT_DEV = MKDEV(SCSI_DISK0_MAJOR, 2);
if (is_uv_system())
uv_setup(cmdline_p);


@ -76,7 +76,8 @@ int pmac_newworld;
static int current_root_goodness = -1;
#define DEFAULT_ROOT_DEVICE Root_SDA1 /* sda1 - slightly silly choice */
/* sda1 - slightly silly choice */
#define DEFAULT_ROOT_DEVICE MKDEV(SCSI_DISK0_MAJOR, 1)
sys_ctrler_t sys_ctrler = SYS_CTRLER_UNKNOWN;
EXPORT_SYMBOL(sys_ctrler);


@ -108,9 +108,9 @@ static inline void ubd_set_bit(__u64 bit, unsigned char *data)
static DEFINE_MUTEX(ubd_lock);
static DEFINE_MUTEX(ubd_mutex); /* replaces BKL, might not be needed */
static int ubd_open(struct block_device *bdev, fmode_t mode);
static void ubd_release(struct gendisk *disk, fmode_t mode);
static int ubd_ioctl(struct block_device *bdev, fmode_t mode,
static int ubd_open(struct gendisk *disk, blk_mode_t mode);
static void ubd_release(struct gendisk *disk);
static int ubd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg);
static int ubd_getgeo(struct block_device *bdev, struct hd_geometry *geo);
@ -1154,9 +1154,8 @@ static int __init ubd_driver_init(void){
device_initcall(ubd_driver_init);
static int ubd_open(struct block_device *bdev, fmode_t mode)
static int ubd_open(struct gendisk *disk, blk_mode_t mode)
{
struct gendisk *disk = bdev->bd_disk;
struct ubd *ubd_dev = disk->private_data;
int err = 0;
@ -1171,19 +1170,12 @@ static int ubd_open(struct block_device *bdev, fmode_t mode)
}
ubd_dev->count++;
set_disk_ro(disk, !ubd_dev->openflags.w);
/* This should no more be needed. And it didn't work anyway to exclude
* read-write remounting of filesystems.*/
/*if((mode & FMODE_WRITE) && !ubd_dev->openflags.w){
if(--ubd_dev->count == 0) ubd_close_dev(ubd_dev);
err = -EROFS;
}*/
out:
mutex_unlock(&ubd_mutex);
return err;
}
static void ubd_release(struct gendisk *disk, fmode_t mode)
static void ubd_release(struct gendisk *disk)
{
struct ubd *ubd_dev = disk->private_data;
@ -1397,7 +1389,7 @@ static int ubd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
return 0;
}
static int ubd_ioctl(struct block_device *bdev, fmode_t mode,
static int ubd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct ubd *ubd_dev = bdev->bd_disk->private_data;


@ -120,9 +120,9 @@ static void simdisk_submit_bio(struct bio *bio)
bio_endio(bio);
}
static int simdisk_open(struct block_device *bdev, fmode_t mode)
static int simdisk_open(struct gendisk *disk, blk_mode_t mode)
{
struct simdisk *dev = bdev->bd_disk->private_data;
struct simdisk *dev = disk->private_data;
spin_lock(&dev->lock);
++dev->users;
@ -130,7 +130,7 @@ static int simdisk_open(struct block_device *bdev, fmode_t mode)
return 0;
}
static void simdisk_release(struct gendisk *disk, fmode_t mode)
static void simdisk_release(struct gendisk *disk)
{
struct simdisk *dev = disk->private_data;
spin_lock(&dev->lock);


@ -9,7 +9,7 @@ obj-y := bdev.o fops.o bio.o elevator.o blk-core.o blk-sysfs.o \
blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \
blk-mq-sysfs.o blk-mq-cpumap.o blk-mq-sched.o ioctl.o \
genhd.o ioprio.o badblocks.o partitions/ blk-rq-qos.o \
disk-events.o blk-ia-ranges.o
disk-events.o blk-ia-ranges.o early-lookup.o
obj-$(CONFIG_BOUNCE) += bounce.o
obj-$(CONFIG_BLK_DEV_BSG_COMMON) += bsg.o


@ -93,7 +93,7 @@ EXPORT_SYMBOL(invalidate_bdev);
* Drop all buffers & page cache for given bdev range. This function bails
* with error if bdev has other exclusive owner (such as filesystem).
*/
int truncate_bdev_range(struct block_device *bdev, fmode_t mode,
int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
loff_t lstart, loff_t lend)
{
/*
@ -101,14 +101,14 @@ int truncate_bdev_range(struct block_device *bdev, fmode_t mode,
* while we discard the buffer cache to avoid discarding buffers
* under live filesystem.
*/
if (!(mode & FMODE_EXCL)) {
int err = bd_prepare_to_claim(bdev, truncate_bdev_range);
if (!(mode & BLK_OPEN_EXCL)) {
int err = bd_prepare_to_claim(bdev, truncate_bdev_range, NULL);
if (err)
goto invalidate;
}
truncate_inode_pages_range(bdev->bd_inode->i_mapping, lstart, lend);
if (!(mode & FMODE_EXCL))
if (!(mode & BLK_OPEN_EXCL))
bd_abort_claiming(bdev, truncate_bdev_range);
return 0;
@ -308,7 +308,7 @@ EXPORT_SYMBOL(thaw_bdev);
* pseudo-fs
*/
static __cacheline_aligned_in_smp DEFINE_SPINLOCK(bdev_lock);
static __cacheline_aligned_in_smp DEFINE_MUTEX(bdev_lock);
static struct kmem_cache * bdev_cachep __read_mostly;
static struct inode *bdev_alloc_inode(struct super_block *sb)
@ -415,6 +415,7 @@ struct block_device *bdev_alloc(struct gendisk *disk, u8 partno)
bdev = I_BDEV(inode);
mutex_init(&bdev->bd_fsfreeze_mutex);
spin_lock_init(&bdev->bd_size_lock);
mutex_init(&bdev->bd_holder_lock);
bdev->bd_partno = partno;
bdev->bd_inode = inode;
bdev->bd_queue = disk->queue;
@ -463,39 +464,48 @@ long nr_blockdev_pages(void)
/**
* bd_may_claim - test whether a block device can be claimed
* @bdev: block device of interest
* @whole: whole block device containing @bdev, may equal @bdev
* @holder: holder trying to claim @bdev
* @hops: holder ops
*
* Test whether @bdev can be claimed by @holder.
*
* CONTEXT:
* spin_lock(&bdev_lock).
*
* RETURNS:
* %true if @bdev can be claimed, %false otherwise.
*/
static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
void *holder)
static bool bd_may_claim(struct block_device *bdev, void *holder,
const struct blk_holder_ops *hops)
{
if (bdev->bd_holder == holder)
return true; /* already a holder */
else if (bdev->bd_holder != NULL)
return false; /* held by someone else */
else if (whole == bdev)
return true; /* is a whole device which isn't held */
struct block_device *whole = bdev_whole(bdev);
else if (whole->bd_holder == bd_may_claim)
return true; /* is a partition of a device that is being partitioned */
else if (whole->bd_holder != NULL)
return false; /* is a partition of a held device */
else
return true; /* is a partition of an un-held device */
lockdep_assert_held(&bdev_lock);
if (bdev->bd_holder) {
/*
* The same holder can always re-claim.
*/
if (bdev->bd_holder == holder) {
if (WARN_ON_ONCE(bdev->bd_holder_ops != hops))
return false;
return true;
}
return false;
}
/*
* If the whole devices holder is set to bd_may_claim, a partition on
* the device is claimed, but not the whole device.
*/
if (whole != bdev &&
whole->bd_holder && whole->bd_holder != bd_may_claim)
return false;
return true;
}
/**
* bd_prepare_to_claim - claim a block device
* @bdev: block device of interest
* @holder: holder trying to claim @bdev
* @hops: holder ops.
*
* Claim @bdev. This function fails if @bdev is already claimed by another
* holder and waits if another claiming is in progress. return, the caller
@ -504,17 +514,18 @@ static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
* RETURNS:
* 0 if @bdev can be claimed, -EBUSY otherwise.
*/
int bd_prepare_to_claim(struct block_device *bdev, void *holder)
int bd_prepare_to_claim(struct block_device *bdev, void *holder,
const struct blk_holder_ops *hops)
{
struct block_device *whole = bdev_whole(bdev);
if (WARN_ON_ONCE(!holder))
return -EINVAL;
retry:
spin_lock(&bdev_lock);
mutex_lock(&bdev_lock);
/* if someone else claimed, fail */
if (!bd_may_claim(bdev, whole, holder)) {
spin_unlock(&bdev_lock);
if (!bd_may_claim(bdev, holder, hops)) {
mutex_unlock(&bdev_lock);
return -EBUSY;
}
@ -524,7 +535,7 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder)
DEFINE_WAIT(wait);
prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock(&bdev_lock);
mutex_unlock(&bdev_lock);
schedule();
finish_wait(wq, &wait);
goto retry;
@ -532,7 +543,7 @@ int bd_prepare_to_claim(struct block_device *bdev, void *holder)
/* yay, all mine */
whole->bd_claiming = holder;
spin_unlock(&bdev_lock);
mutex_unlock(&bdev_lock);
return 0;
}
EXPORT_SYMBOL_GPL(bd_prepare_to_claim); /* only for the loop driver */
@ -550,16 +561,18 @@ static void bd_clear_claiming(struct block_device *whole, void *holder)
* bd_finish_claiming - finish claiming of a block device
* @bdev: block device of interest
* @holder: holder that has claimed @bdev
* @hops: block device holder operations
*
* Finish exclusive open of a block device. Mark the device as exclusively
* open by the holder and wake up all waiters for exclusive open to finish.
*/
static void bd_finish_claiming(struct block_device *bdev, void *holder)
static void bd_finish_claiming(struct block_device *bdev, void *holder,
const struct blk_holder_ops *hops)
{
struct block_device *whole = bdev_whole(bdev);
spin_lock(&bdev_lock);
BUG_ON(!bd_may_claim(bdev, whole, holder));
mutex_lock(&bdev_lock);
BUG_ON(!bd_may_claim(bdev, holder, hops));
/*
* Note that for a whole device bd_holders will be incremented twice,
* and bd_holder will be set to bd_may_claim before being set to holder
@ -567,9 +580,12 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder)
whole->bd_holders++;
whole->bd_holder = bd_may_claim;
bdev->bd_holders++;
mutex_lock(&bdev->bd_holder_lock);
bdev->bd_holder = holder;
bdev->bd_holder_ops = hops;
mutex_unlock(&bdev->bd_holder_lock);
bd_clear_claiming(whole, holder);
spin_unlock(&bdev_lock);
mutex_unlock(&bdev_lock);
}
/**
@ -583,12 +599,47 @@ static void bd_finish_claiming(struct block_device *bdev, void *holder)
*/
void bd_abort_claiming(struct block_device *bdev, void *holder)
{
spin_lock(&bdev_lock);
mutex_lock(&bdev_lock);
bd_clear_claiming(bdev_whole(bdev), holder);
spin_unlock(&bdev_lock);
mutex_unlock(&bdev_lock);
}
EXPORT_SYMBOL(bd_abort_claiming);
static void bd_end_claim(struct block_device *bdev, void *holder)
{
struct block_device *whole = bdev_whole(bdev);
bool unblock = false;
/*
* Release a claim on the device. The holder fields are protected with
* bdev_lock. open_mutex is used to synchronize disk_holder unlinking.
*/
mutex_lock(&bdev_lock);
WARN_ON_ONCE(bdev->bd_holder != holder);
WARN_ON_ONCE(--bdev->bd_holders < 0);
WARN_ON_ONCE(--whole->bd_holders < 0);
if (!bdev->bd_holders) {
mutex_lock(&bdev->bd_holder_lock);
bdev->bd_holder = NULL;
bdev->bd_holder_ops = NULL;
mutex_unlock(&bdev->bd_holder_lock);
if (bdev->bd_write_holder)
unblock = true;
}
if (!whole->bd_holders)
whole->bd_holder = NULL;
mutex_unlock(&bdev_lock);
/*
* If this was the last claim, remove holder link and unblock evpoll if
* it was a write holder.
*/
if (unblock) {
disk_unblock_events(bdev->bd_disk);
bdev->bd_write_holder = false;
}
}
static void blkdev_flush_mapping(struct block_device *bdev)
{
WARN_ON_ONCE(bdev->bd_holders);
@ -597,13 +648,13 @@ static void blkdev_flush_mapping(struct block_device *bdev)
bdev_write_inode(bdev);
}
static int blkdev_get_whole(struct block_device *bdev, fmode_t mode)
static int blkdev_get_whole(struct block_device *bdev, blk_mode_t mode)
{
struct gendisk *disk = bdev->bd_disk;
int ret;
if (disk->fops->open) {
ret = disk->fops->open(bdev, mode);
ret = disk->fops->open(disk, mode);
if (ret) {
/* avoid ghost partitions on a removed medium */
if (ret == -ENOMEDIUM &&
@ -621,22 +672,19 @@ static int blkdev_get_whole(struct block_device *bdev, fmode_t mode)
return 0;
}
static void blkdev_put_whole(struct block_device *bdev, fmode_t mode)
static void blkdev_put_whole(struct block_device *bdev)
{
if (atomic_dec_and_test(&bdev->bd_openers))
blkdev_flush_mapping(bdev);
if (bdev->bd_disk->fops->release)
bdev->bd_disk->fops->release(bdev->bd_disk, mode);
bdev->bd_disk->fops->release(bdev->bd_disk);
}
static int blkdev_get_part(struct block_device *part, fmode_t mode)
static int blkdev_get_part(struct block_device *part, blk_mode_t mode)
{
struct gendisk *disk = part->bd_disk;
int ret;
if (atomic_read(&part->bd_openers))
goto done;
ret = blkdev_get_whole(bdev_whole(part), mode);
if (ret)
return ret;
@ -645,26 +693,27 @@ static int blkdev_get_part(struct block_device *part, fmode_t mode)
if (!bdev_nr_sectors(part))
goto out_blkdev_put;
disk->open_partitions++;
set_init_blocksize(part);
done:
if (!atomic_read(&part->bd_openers)) {
disk->open_partitions++;
set_init_blocksize(part);
}
atomic_inc(&part->bd_openers);
return 0;
out_blkdev_put:
blkdev_put_whole(bdev_whole(part), mode);
blkdev_put_whole(bdev_whole(part));
return ret;
}
static void blkdev_put_part(struct block_device *part, fmode_t mode)
static void blkdev_put_part(struct block_device *part)
{
struct block_device *whole = bdev_whole(part);
if (!atomic_dec_and_test(&part->bd_openers))
return;
blkdev_flush_mapping(part);
whole->bd_disk->open_partitions--;
blkdev_put_whole(whole, mode);
if (atomic_dec_and_test(&part->bd_openers)) {
blkdev_flush_mapping(part);
whole->bd_disk->open_partitions--;
}
blkdev_put_whole(whole);
}
struct block_device *blkdev_get_no_open(dev_t dev)
@ -695,17 +744,17 @@ void blkdev_put_no_open(struct block_device *bdev)
{
put_device(&bdev->bd_device);
}
/**
* blkdev_get_by_dev - open a block device by device number
* @dev: device number of block device to open
* @mode: FMODE_* mask
* @mode: open mode (BLK_OPEN_*)
* @holder: exclusive holder identifier
* @hops: holder operations
*
* Open the block device described by device number @dev. If @mode includes
* %FMODE_EXCL, the block device is opened with exclusive access. Specifying
* %FMODE_EXCL with a %NULL @holder is invalid. Exclusive opens may nest for
* the same @holder.
* Open the block device described by device number @dev. If @holder is not
* %NULL, the block device is opened with exclusive access. Exclusive opens may
* nest for the same @holder.
*
* Use this interface ONLY if you really do not have anything better - i.e. when
* you are behind a truly sucky interface and all you are given is a device
@ -717,7 +766,8 @@ void blkdev_put_no_open(struct block_device *bdev)
* RETURNS:
* Reference to the block_device on success, ERR_PTR(-errno) on failure.
*/
struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
struct block_device *blkdev_get_by_dev(dev_t dev, blk_mode_t mode, void *holder,
const struct blk_holder_ops *hops)
{
bool unblock_events = true;
struct block_device *bdev;
@ -726,8 +776,8 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
ret = devcgroup_check_permission(DEVCG_DEV_BLOCK,
MAJOR(dev), MINOR(dev),
((mode & FMODE_READ) ? DEVCG_ACC_READ : 0) |
((mode & FMODE_WRITE) ? DEVCG_ACC_WRITE : 0));
((mode & BLK_OPEN_READ) ? DEVCG_ACC_READ : 0) |
((mode & BLK_OPEN_WRITE) ? DEVCG_ACC_WRITE : 0));
if (ret)
return ERR_PTR(ret);
@ -736,10 +786,16 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
return ERR_PTR(-ENXIO);
disk = bdev->bd_disk;
if (mode & FMODE_EXCL) {
ret = bd_prepare_to_claim(bdev, holder);
if (holder) {
mode |= BLK_OPEN_EXCL;
ret = bd_prepare_to_claim(bdev, holder, hops);
if (ret)
goto put_blkdev;
} else {
if (WARN_ON_ONCE(mode & BLK_OPEN_EXCL)) {
ret = -EIO;
goto put_blkdev;
}
}
disk_block_events(disk);
@ -756,8 +812,8 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
ret = blkdev_get_whole(bdev, mode);
if (ret)
goto put_module;
if (mode & FMODE_EXCL) {
bd_finish_claiming(bdev, holder);
if (holder) {
bd_finish_claiming(bdev, holder, hops);
/*
* Block event polling for write claims if requested. Any write
@ -766,7 +822,7 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
* writeable reference is too fragile given the way @mode is
* used in blkdev_get/put().
*/
if ((mode & FMODE_WRITE) && !bdev->bd_write_holder &&
if ((mode & BLK_OPEN_WRITE) && !bdev->bd_write_holder &&
(disk->event_flags & DISK_EVENT_FLAG_BLOCK_ON_EXCL_WRITE)) {
bdev->bd_write_holder = true;
unblock_events = false;
@ -780,7 +836,7 @@ struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode, void *holder)
put_module:
module_put(disk->fops->owner);
abort_claiming:
if (mode & FMODE_EXCL)
if (holder)
bd_abort_claiming(bdev, holder);
mutex_unlock(&disk->open_mutex);
disk_unblock_events(disk);
@ -793,13 +849,13 @@ EXPORT_SYMBOL(blkdev_get_by_dev);
/**
* blkdev_get_by_path - open a block device by name
* @path: path to the block device to open
* @mode: FMODE_* mask
* @mode: open mode (BLK_OPEN_*)
* @holder: exclusive holder identifier
* @hops: holder operations
*
* Open the block device described by the device file at @path. If @mode
* includes %FMODE_EXCL, the block device is opened with exclusive access.
* Specifying %FMODE_EXCL with a %NULL @holder is invalid. Exclusive opens may
* nest for the same @holder.
* Open the block device described by the device file at @path. If @holder is
* not %NULL, the block device is opened with exclusive access. Exclusive opens
* may nest for the same @holder.
*
* CONTEXT:
* Might sleep.
@ -807,8 +863,8 @@ EXPORT_SYMBOL(blkdev_get_by_dev);
* RETURNS:
* Reference to the block_device on success, ERR_PTR(-errno) on failure.
*/
struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
void *holder)
struct block_device *blkdev_get_by_path(const char *path, blk_mode_t mode,
void *holder, const struct blk_holder_ops *hops)
{
struct block_device *bdev;
dev_t dev;
@ -818,9 +874,9 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
if (error)
return ERR_PTR(error);
bdev = blkdev_get_by_dev(dev, mode, holder);
if (!IS_ERR(bdev) && (mode & FMODE_WRITE) && bdev_read_only(bdev)) {
blkdev_put(bdev, mode);
bdev = blkdev_get_by_dev(dev, mode, holder, hops);
if (!IS_ERR(bdev) && (mode & BLK_OPEN_WRITE) && bdev_read_only(bdev)) {
blkdev_put(bdev, holder);
return ERR_PTR(-EACCES);
}
@ -828,7 +884,7 @@ struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
}
EXPORT_SYMBOL(blkdev_get_by_path);
void blkdev_put(struct block_device *bdev, fmode_t mode)
void blkdev_put(struct block_device *bdev, void *holder)
{
struct gendisk *disk = bdev->bd_disk;
@ -843,36 +899,8 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
sync_blockdev(bdev);
mutex_lock(&disk->open_mutex);
if (mode & FMODE_EXCL) {
struct block_device *whole = bdev_whole(bdev);
bool bdev_free;
/*
* Release a claim on the device. The holder fields
* are protected with bdev_lock. open_mutex is to
* synchronize disk_holder unlinking.
*/
spin_lock(&bdev_lock);
WARN_ON_ONCE(--bdev->bd_holders < 0);
WARN_ON_ONCE(--whole->bd_holders < 0);
if ((bdev_free = !bdev->bd_holders))
bdev->bd_holder = NULL;
if (!whole->bd_holders)
whole->bd_holder = NULL;
spin_unlock(&bdev_lock);
/*
* If this was the last claim, remove holder link and
* unblock evpoll if it was a write holder.
*/
if (bdev_free && bdev->bd_write_holder) {
disk_unblock_events(disk);
bdev->bd_write_holder = false;
}
}
if (holder)
bd_end_claim(bdev, holder);
/*
* Trigger event checking and tell drivers to flush MEDIA_CHANGE
@ -882,9 +910,9 @@ void blkdev_put(struct block_device *bdev, fmode_t mode)
disk_flush_events(disk, DISK_EVENT_MEDIA_CHANGE);
if (bdev_is_partition(bdev))
blkdev_put_part(bdev, mode);
blkdev_put_part(bdev);
else
blkdev_put_whole(bdev, mode);
blkdev_put_whole(bdev);
mutex_unlock(&disk->open_mutex);
module_put(disk->fops->owner);
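
To illustrate the reworked interface above, here is a hypothetical exclusive open/close pair using the new signatures from this diff; the holder pointer and the (empty) holder-ops table are placeholders, and BLK_OPEN_READ/BLK_OPEN_WRITE are the new blk_mode_t flags that replace FMODE_*:

static const struct blk_holder_ops example_holder_ops = {
        /* .mark_dead could be supplied to learn about surprise removal. */
};

static int example_exclusive_open(dev_t devt, void *holder)
{
        struct block_device *bdev;

        /* A non-NULL holder now implies an exclusive (BLK_OPEN_EXCL) open. */
        bdev = blkdev_get_by_dev(devt, BLK_OPEN_READ | BLK_OPEN_WRITE,
                                 holder, &example_holder_ops);
        if (IS_ERR(bdev))
                return PTR_ERR(bdev);

        /* ... use the device ... */

        /* blkdev_put() now pairs on the holder instead of the open mode. */
        blkdev_put(bdev, holder);
        return 0;
}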


@ -5403,6 +5403,10 @@ void bfq_put_queue(struct bfq_queue *bfqq)
if (bfqq->bfqd->last_completed_rq_bfqq == bfqq)
bfqq->bfqd->last_completed_rq_bfqq = NULL;
WARN_ON_ONCE(!list_empty(&bfqq->fifo));
WARN_ON_ONCE(!RB_EMPTY_ROOT(&bfqq->sort_list));
WARN_ON_ONCE(bfqq->dispatched);
kmem_cache_free(bfq_pool, bfqq);
bfqg_and_blkg_put(bfqg);
}
@ -7135,6 +7139,7 @@ static void bfq_exit_queue(struct elevator_queue *e)
{
struct bfq_data *bfqd = e->elevator_data;
struct bfq_queue *bfqq, *n;
unsigned int actuator;
hrtimer_cancel(&bfqd->idle_slice_timer);
@ -7143,6 +7148,10 @@ static void bfq_exit_queue(struct elevator_queue *e)
bfq_deactivate_bfqq(bfqd, bfqq, false, false);
spin_unlock_irq(&bfqd->lock);
for (actuator = 0; actuator < bfqd->num_actuators; actuator++)
WARN_ON_ONCE(bfqd->rq_in_driver[actuator]);
WARN_ON_ONCE(bfqd->tot_rq_in_driver);
hrtimer_cancel(&bfqd->idle_slice_timer);
/* release oom-queue reference to root group */


@ -1138,6 +1138,14 @@ int bio_add_page(struct bio *bio, struct page *page,
}
EXPORT_SYMBOL(bio_add_page);
void bio_add_folio_nofail(struct bio *bio, struct folio *folio, size_t len,
size_t off)
{
WARN_ON_ONCE(len > UINT_MAX);
WARN_ON_ONCE(off > UINT_MAX);
__bio_add_page(bio, &folio->page, len, off);
}
/**
* bio_add_folio - Attempt to add part of a folio to a bio.
* @bio: BIO to add to.
@ -1169,7 +1177,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
bio_for_each_segment_all(bvec, bio, iter_all) {
if (mark_dirty && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
put_page(bvec->bv_page);
bio_release_page(bio, bvec->bv_page);
}
}
EXPORT_SYMBOL_GPL(__bio_release_pages);
@ -1191,7 +1199,6 @@ void bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter)
bio->bi_io_vec = (struct bio_vec *)iter->bvec;
bio->bi_iter.bi_bvec_done = iter->iov_offset;
bio->bi_iter.bi_size = size;
bio_set_flag(bio, BIO_NO_PAGE_REF);
bio_set_flag(bio, BIO_CLONED);
}
@ -1206,7 +1213,7 @@ static int bio_iov_add_page(struct bio *bio, struct page *page,
}
if (same_page)
put_page(page);
bio_release_page(bio, page);
return 0;
}
@ -1220,7 +1227,7 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
queue_max_zone_append_sectors(q), &same_page) != len)
return -EINVAL;
if (same_page)
put_page(page);
bio_release_page(bio, page);
return 0;
}
@ -1231,10 +1238,10 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
* @bio: bio to add pages to
* @iter: iov iterator describing the region to be mapped
*
* Pins pages from *iter and appends them to @bio's bvec array. The
* pages will have to be released using put_page() when done.
* For multi-segment *iter, this function only adds pages from the
* next non-empty segment of the iov iterator.
* Extracts pages from *iter and appends them to @bio's bvec array. The pages
* will have to be cleaned up in the way indicated by the BIO_PAGE_PINNED flag.
* For a multi-segment *iter, this function only adds pages from the next
* non-empty segment of the iov iterator.
*/
static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
{
@ -1266,9 +1273,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
* result to ensure the bio's total size is correct. The remainder of
* the iov data will be picked up in the next bio iteration.
*/
size = iov_iter_get_pages(iter, pages,
UINT_MAX - bio->bi_iter.bi_size,
nr_pages, &offset, extraction_flags);
size = iov_iter_extract_pages(iter, &pages,
UINT_MAX - bio->bi_iter.bi_size,
nr_pages, extraction_flags, &offset);
if (unlikely(size <= 0))
return size ? size : -EFAULT;
@ -1301,7 +1308,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
iov_iter_revert(iter, left);
out:
while (i < nr_pages)
put_page(pages[i++]);
bio_release_page(bio, pages[i++]);
return ret;
}
@ -1336,6 +1343,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
return 0;
}
if (iov_iter_extract_will_pin(iter))
bio_set_flag(bio, BIO_PAGE_PINNED);
do {
ret = __bio_iov_iter_get_pages(bio, iter);
} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
@ -1489,8 +1498,8 @@ void bio_set_pages_dirty(struct bio *bio)
* the BIO and re-dirty the pages in process context.
*
* It is expected that bio_check_pages_dirty() will wholly own the BIO from
* here on. It will run one put_page() against each page and will run one
* bio_put() against the BIO.
* here on. It will unpin each page and will run one bio_put() against the
* BIO.
*/
static void bio_dirty_fn(struct work_struct *work);
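
Tying back to the "use the unchecked __bio_add_page() for the initial single page additions" item in the pull summary, a hypothetical single-page submission shows why the unchecked helper is safe there (the function and its parameters are invented for the example):

static void example_write_one_page(struct block_device *bdev,
                                   struct page *page, unsigned int len)
{
        /* A freshly allocated bio is empty ... */
        struct bio *bio = bio_alloc(bdev, 1, REQ_OP_WRITE, GFP_KERNEL);

        /* ... so the first page always fits and this addition cannot fail. */
        __bio_add_page(bio, page, len, 0);
        submit_bio(bio);
}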


@ -34,7 +34,7 @@ int blkcg_set_fc_appid(char *app_id, u64 cgrp_id, size_t app_id_len)
* the vmid from the fabric.
* Adding the overhead of a lock is not necessary.
*/
strlcpy(blkcg->fc_app_id, app_id, app_id_len);
strscpy(blkcg->fc_app_id, app_id, app_id_len);
css_put(css);
out_cgrp_put:
cgroup_put(cgrp);


@ -624,8 +624,13 @@ static int blkcg_reset_stats(struct cgroup_subsys_state *css,
struct blkg_iostat_set *bis =
per_cpu_ptr(blkg->iostat_cpu, cpu);
memset(bis, 0, sizeof(*bis));
/* Re-initialize the cleared blkg_iostat_set */
u64_stats_init(&bis->sync);
bis->blkg = blkg;
}
memset(&blkg->iostat, 0, sizeof(blkg->iostat));
u64_stats_init(&blkg->iostat.sync);
for (i = 0; i < BLKCG_MAX_POLS; i++) {
struct blkcg_policy *pol = blkcg_policy[i];
@ -762,6 +767,13 @@ int blkg_conf_open_bdev(struct blkg_conf_ctx *ctx)
return -ENODEV;
}
mutex_lock(&bdev->bd_queue->rq_qos_mutex);
if (!disk_live(bdev->bd_disk)) {
blkdev_put_no_open(bdev);
mutex_unlock(&bdev->bd_queue->rq_qos_mutex);
return -ENODEV;
}
ctx->body = input;
ctx->bdev = bdev;
return 0;
@ -906,6 +918,7 @@ EXPORT_SYMBOL_GPL(blkg_conf_prep);
*/
void blkg_conf_exit(struct blkg_conf_ctx *ctx)
__releases(&ctx->bdev->bd_queue->queue_lock)
__releases(&ctx->bdev->bd_queue->rq_qos_mutex)
{
if (ctx->blkg) {
spin_unlock_irq(&bdev_get_queue(ctx->bdev)->queue_lock);
@ -913,6 +926,7 @@ void blkg_conf_exit(struct blkg_conf_ctx *ctx)
}
if (ctx->bdev) {
mutex_unlock(&ctx->bdev->bd_queue->rq_qos_mutex);
blkdev_put_no_open(ctx->bdev);
ctx->body = NULL;
ctx->bdev = NULL;


@ -420,6 +420,7 @@ struct request_queue *blk_alloc_queue(int node_id)
mutex_init(&q->debugfs_mutex);
mutex_init(&q->sysfs_lock);
mutex_init(&q->sysfs_dir_lock);
mutex_init(&q->rq_qos_mutex);
spin_lock_init(&q->queue_lock);
init_waitqueue_head(&q->mq_freeze_wq);


@ -188,7 +188,9 @@ static void blk_flush_complete_seq(struct request *rq,
case REQ_FSEQ_DATA:
list_move_tail(&rq->flush.list, &fq->flush_data_in_flight);
blk_mq_add_to_requeue_list(rq, BLK_MQ_INSERT_AT_HEAD);
spin_lock(&q->requeue_lock);
list_add_tail(&rq->queuelist, &q->flush_list);
spin_unlock(&q->requeue_lock);
blk_mq_kick_requeue_list(q);
break;
@ -346,7 +348,10 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
smp_wmb();
req_ref_set(flush_rq, 1);
blk_mq_add_to_requeue_list(flush_rq, 0);
spin_lock(&q->requeue_lock);
list_add_tail(&flush_rq->queuelist, &q->flush_list);
spin_unlock(&q->requeue_lock);
blk_mq_kick_requeue_list(q);
}
@ -376,22 +381,29 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
return RQ_END_IO_NONE;
}
/**
* blk_insert_flush - insert a new PREFLUSH/FUA request
* @rq: request to insert
*
* To be called from __elv_add_request() for %ELEVATOR_INSERT_FLUSH insertions.
* or __blk_mq_run_hw_queue() to dispatch request.
* @rq is being submitted. Analyze what needs to be done and put it on the
* right queue.
static void blk_rq_init_flush(struct request *rq)
{
rq->flush.seq = 0;
INIT_LIST_HEAD(&rq->flush.list);
rq->rq_flags |= RQF_FLUSH_SEQ;
rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
rq->end_io = mq_flush_data_end_io;
}
/*
* Insert a PREFLUSH/FUA request into the flush state machine.
* Returns true if the request has been consumed by the flush state machine,
* or false if the caller should continue to process it.
*/
void blk_insert_flush(struct request *rq)
bool blk_insert_flush(struct request *rq)
{
struct request_queue *q = rq->q;
unsigned long fflags = q->queue_flags; /* may change, cache */
unsigned int policy = blk_flush_policy(fflags, rq);
struct blk_flush_queue *fq = blk_get_flush_queue(q, rq->mq_ctx);
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
/* FLUSH/FUA request must never be merged */
WARN_ON_ONCE(rq->bio != rq->biotail);
/*
* @policy now records what operations need to be done. Adjust
@ -408,45 +420,45 @@ void blk_insert_flush(struct request *rq)
*/
rq->cmd_flags |= REQ_SYNC;
/*
* An empty flush handed down from a stacking driver may
* translate into nothing if the underlying device does not
* advertise a write-back cache. In this case, simply
* complete the request.
*/
if (!policy) {
switch (policy) {
case 0:
/*
* An empty flush handed down from a stacking driver may
* translate into nothing if the underlying device does not
* advertise a write-back cache. In this case, simply
* complete the request.
*/
blk_mq_end_request(rq, 0);
return;
return true;
case REQ_FSEQ_DATA:
/*
* If there's data, but no flush is necessary, the request can
* be processed directly without going through flush machinery.
* Queue for normal execution.
*/
return false;
case REQ_FSEQ_DATA | REQ_FSEQ_POSTFLUSH:
/*
* Initialize the flush fields and completion handler to trigger
* the post flush, and then just pass the command on.
*/
blk_rq_init_flush(rq);
rq->flush.seq |= REQ_FSEQ_POSTFLUSH;
spin_lock_irq(&fq->mq_flush_lock);
list_move_tail(&rq->flush.list, &fq->flush_data_in_flight);
spin_unlock_irq(&fq->mq_flush_lock);
return false;
default:
/*
* Mark the request as part of a flush sequence and submit it
* for further processing to the flush state machine.
*/
blk_rq_init_flush(rq);
spin_lock_irq(&fq->mq_flush_lock);
blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
spin_unlock_irq(&fq->mq_flush_lock);
return true;
}
BUG_ON(rq->bio != rq->biotail); /*assumes zero or single bio rq */
/*
* If there's data but flush is not necessary, the request can be
* processed directly without going through flush machinery. Queue
* for normal execution.
*/
if ((policy & REQ_FSEQ_DATA) &&
!(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
blk_mq_request_bypass_insert(rq, 0);
blk_mq_run_hw_queue(hctx, false);
return;
}
/*
* @rq should go through flush machinery. Mark it part of flush
* sequence and submit for further processing.
*/
memset(&rq->flush, 0, sizeof(rq->flush));
INIT_LIST_HEAD(&rq->flush.list);
rq->rq_flags |= RQF_FLUSH_SEQ;
rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
rq->end_io = mq_flush_data_end_io;
spin_lock_irq(&fq->mq_flush_lock);
blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
spin_unlock_irq(&fq->mq_flush_lock);
}
/**


@ -77,6 +77,10 @@ static void ioc_destroy_icq(struct io_cq *icq)
struct elevator_type *et = q->elevator->type;
lockdep_assert_held(&ioc->lock);
lockdep_assert_held(&q->queue_lock);
if (icq->flags & ICQ_DESTROYED)
return;
radix_tree_delete(&ioc->icq_tree, icq->q->id);
hlist_del_init(&icq->ioc_node);
@ -128,12 +132,7 @@ static void ioc_release_fn(struct work_struct *work)
spin_lock(&q->queue_lock);
spin_lock(&ioc->lock);
/*
* The icq may have been destroyed when the ioc lock
* was released.
*/
if (!(icq->flags & ICQ_DESTROYED))
ioc_destroy_icq(icq);
ioc_destroy_icq(icq);
spin_unlock(&q->queue_lock);
rcu_read_unlock();
@ -171,23 +170,20 @@ static bool ioc_delay_free(struct io_context *ioc)
*/
void ioc_clear_queue(struct request_queue *q)
{
LIST_HEAD(icq_list);
spin_lock_irq(&q->queue_lock);
list_splice_init(&q->icq_list, &icq_list);
spin_unlock_irq(&q->queue_lock);
rcu_read_lock();
while (!list_empty(&icq_list)) {
while (!list_empty(&q->icq_list)) {
struct io_cq *icq =
list_entry(icq_list.next, struct io_cq, q_node);
list_first_entry(&q->icq_list, struct io_cq, q_node);
spin_lock_irq(&icq->ioc->lock);
if (!(icq->flags & ICQ_DESTROYED))
ioc_destroy_icq(icq);
spin_unlock_irq(&icq->ioc->lock);
/*
* Other context won't hold ioc lock to wait for queue_lock, see
* details in ioc_release_fn().
*/
spin_lock(&icq->ioc->lock);
ioc_destroy_icq(icq);
spin_unlock(&icq->ioc->lock);
}
rcu_read_unlock();
spin_unlock_irq(&q->queue_lock);
}
#else /* CONFIG_BLK_ICQ */
static inline void ioc_exit_icqs(struct io_context *ioc)


@ -2455,6 +2455,7 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
u32 hwi, adj_step;
s64 margin;
u64 cost, new_inuse;
unsigned long flags;
current_hweight(iocg, NULL, &hwi);
old_hwi = hwi;
@ -2473,11 +2474,11 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
iocg->inuse == iocg->active)
return cost;
spin_lock_irq(&ioc->lock);
spin_lock_irqsave(&ioc->lock, flags);
/* we own inuse only when @iocg is in the normal active state */
if (iocg->abs_vdebt || list_empty(&iocg->active_list)) {
spin_unlock_irq(&ioc->lock);
spin_unlock_irqrestore(&ioc->lock, flags);
return cost;
}
@ -2498,7 +2499,7 @@ static u64 adjust_inuse_and_calc_cost(struct ioc_gq *iocg, u64 vtime,
} while (time_after64(vtime + cost, now->vnow) &&
iocg->inuse != iocg->active);
spin_unlock_irq(&ioc->lock);
spin_unlock_irqrestore(&ioc->lock, flags);
TRACE_IOCG_PATH(inuse_adjust, iocg, now,
old_inuse, iocg->inuse, old_hwi, hwi);


@ -23,25 +23,28 @@
/**
* enum prio_policy - I/O priority class policy.
* @POLICY_NO_CHANGE: (default) do not modify the I/O priority class.
* @POLICY_NONE_TO_RT: modify IOPRIO_CLASS_NONE into IOPRIO_CLASS_RT.
* @POLICY_PROMOTE_TO_RT: modify no-IOPRIO_CLASS_RT to IOPRIO_CLASS_RT.
* @POLICY_RESTRICT_TO_BE: modify IOPRIO_CLASS_NONE and IOPRIO_CLASS_RT into
* IOPRIO_CLASS_BE.
* @POLICY_ALL_TO_IDLE: change the I/O priority class into IOPRIO_CLASS_IDLE.
* @POLICY_NONE_TO_RT: an alias for POLICY_PROMOTE_TO_RT.
*
* See also <linux/ioprio.h>.
*/
enum prio_policy {
POLICY_NO_CHANGE = 0,
POLICY_NONE_TO_RT = 1,
POLICY_PROMOTE_TO_RT = 1,
POLICY_RESTRICT_TO_BE = 2,
POLICY_ALL_TO_IDLE = 3,
POLICY_NONE_TO_RT = 4,
};
static const char *policy_name[] = {
[POLICY_NO_CHANGE] = "no-change",
[POLICY_NONE_TO_RT] = "none-to-rt",
[POLICY_PROMOTE_TO_RT] = "promote-to-rt",
[POLICY_RESTRICT_TO_BE] = "restrict-to-be",
[POLICY_ALL_TO_IDLE] = "idle",
[POLICY_NONE_TO_RT] = "none-to-rt",
};
static struct blkcg_policy ioprio_policy;
@ -189,6 +192,20 @@ void blkcg_set_ioprio(struct bio *bio)
if (!blkcg || blkcg->prio_policy == POLICY_NO_CHANGE)
return;
if (blkcg->prio_policy == POLICY_PROMOTE_TO_RT ||
blkcg->prio_policy == POLICY_NONE_TO_RT) {
/*
* For RT threads, the default priority level is 4 because
* task_nice is 0. By promoting non-RT io-priority to RT-class
* and default level 4, those requests that are already
* RT-class but need a higher io-priority can use ioprio_set()
* to achieve this.
*/
if (IOPRIO_PRIO_CLASS(bio->bi_ioprio) != IOPRIO_CLASS_RT)
bio->bi_ioprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_RT, 4);
return;
}
/*
* Except for IOPRIO_CLASS_NONE, higher I/O priority numbers
* correspond to a lower priority. Hence, the max_t() below selects


@ -281,21 +281,21 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
if (blk_queue_pci_p2pdma(rq->q))
extraction_flags |= ITER_ALLOW_P2PDMA;
if (iov_iter_extract_will_pin(iter))
bio_set_flag(bio, BIO_PAGE_PINNED);
while (iov_iter_count(iter)) {
struct page **pages, *stack_pages[UIO_FASTIOV];
struct page *stack_pages[UIO_FASTIOV];
struct page **pages = stack_pages;
ssize_t bytes;
size_t offs;
int npages;
if (nr_vecs <= ARRAY_SIZE(stack_pages)) {
pages = stack_pages;
bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
nr_vecs, &offs, extraction_flags);
} else {
bytes = iov_iter_get_pages_alloc(iter, &pages,
LONG_MAX, &offs, extraction_flags);
}
if (nr_vecs > ARRAY_SIZE(stack_pages))
pages = NULL;
bytes = iov_iter_extract_pages(iter, &pages, LONG_MAX,
nr_vecs, extraction_flags, &offs);
if (unlikely(bytes <= 0)) {
ret = bytes ? bytes : -EFAULT;
goto out_unmap;
@ -317,7 +317,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
if (!bio_add_hw_page(rq->q, bio, page, n, offs,
max_sectors, &same_page)) {
if (same_page)
put_page(page);
bio_release_page(bio, page);
break;
}
@ -329,7 +329,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
* release the pages we didn't map into the bio, if any
*/
while (j < npages)
put_page(pages[j++]);
bio_release_page(bio, pages[j++]);
if (pages != stack_pages)
kvfree(pages);
/* couldn't stuff something into bio? */


@ -88,6 +88,7 @@ static const char *const blk_queue_flag_name[] = {
QUEUE_FLAG_NAME(IO_STAT),
QUEUE_FLAG_NAME(NOXMERGES),
QUEUE_FLAG_NAME(ADD_RANDOM),
QUEUE_FLAG_NAME(SYNCHRONOUS),
QUEUE_FLAG_NAME(SAME_FORCE),
QUEUE_FLAG_NAME(INIT_DONE),
QUEUE_FLAG_NAME(STABLE_WRITES),
@ -103,6 +104,8 @@ static const char *const blk_queue_flag_name[] = {
QUEUE_FLAG_NAME(RQ_ALLOC_TIME),
QUEUE_FLAG_NAME(HCTX_ACTIVE),
QUEUE_FLAG_NAME(NOWAIT),
QUEUE_FLAG_NAME(SQ_SCHED),
QUEUE_FLAG_NAME(SKIP_TAGSET_QUIESCE),
};
#undef QUEUE_FLAG_NAME
@ -241,14 +244,14 @@ static const char *const cmd_flag_name[] = {
#define RQF_NAME(name) [ilog2((__force u32)RQF_##name)] = #name
static const char *const rqf_name[] = {
RQF_NAME(STARTED),
RQF_NAME(SOFTBARRIER),
RQF_NAME(FLUSH_SEQ),
RQF_NAME(MIXED_MERGE),
RQF_NAME(MQ_INFLIGHT),
RQF_NAME(DONTPREP),
RQF_NAME(SCHED_TAGS),
RQF_NAME(USE_SCHED),
RQF_NAME(FAILED),
RQF_NAME(QUIET),
RQF_NAME(ELVPRIV),
RQF_NAME(IO_STAT),
RQF_NAME(PM),
RQF_NAME(HASHED),
@ -256,7 +259,6 @@ static const char *const rqf_name[] = {
RQF_NAME(SPECIAL_PAYLOAD),
RQF_NAME(ZONE_WRITE_LOCKED),
RQF_NAME(TIMED_OUT),
RQF_NAME(ELV),
RQF_NAME(RESV),
};
#undef RQF_NAME
@ -399,7 +401,7 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
seq_printf(m, "nr_tags=%u\n", tags->nr_tags);
seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
seq_printf(m, "active_queues=%d\n",
atomic_read(&tags->active_queues));
READ_ONCE(tags->active_queues));
seq_puts(m, "\nbitmap_tags:\n");
sbitmap_queue_show(&tags->bitmap_tags, m);


@ -37,7 +37,7 @@ static inline bool
blk_mq_sched_allow_merge(struct request_queue *q, struct request *rq,
struct bio *bio)
{
if (rq->rq_flags & RQF_ELV) {
if (rq->rq_flags & RQF_USE_SCHED) {
struct elevator_queue *e = q->elevator;
if (e->type->ops.allow_merge)
@ -48,7 +48,7 @@ blk_mq_sched_allow_merge(struct request_queue *q, struct request *rq,
static inline void blk_mq_sched_completed_request(struct request *rq, u64 now)
{
if (rq->rq_flags & RQF_ELV) {
if (rq->rq_flags & RQF_USE_SCHED) {
struct elevator_queue *e = rq->q->elevator;
if (e->type->ops.completed_request)
@ -58,11 +58,11 @@ static inline void blk_mq_sched_completed_request(struct request *rq, u64 now)
static inline void blk_mq_sched_requeue_request(struct request *rq)
{
if (rq->rq_flags & RQF_ELV) {
if (rq->rq_flags & RQF_USE_SCHED) {
struct request_queue *q = rq->q;
struct elevator_queue *e = q->elevator;
if ((rq->rq_flags & RQF_ELVPRIV) && e->type->ops.requeue_request)
if (e->type->ops.requeue_request)
e->type->ops.requeue_request(rq);
}
}


@ -38,6 +38,7 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
{
unsigned int users;
struct blk_mq_tags *tags = hctx->tags;
/*
* calling test_bit() prior to test_and_set_bit() is intentional,
@ -55,9 +56,11 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
return;
}
users = atomic_inc_return(&hctx->tags->active_queues);
blk_mq_update_wake_batch(hctx->tags, users);
spin_lock_irq(&tags->lock);
users = tags->active_queues + 1;
WRITE_ONCE(tags->active_queues, users);
blk_mq_update_wake_batch(tags, users);
spin_unlock_irq(&tags->lock);
}
/*
@ -90,9 +93,11 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
return;
}
users = atomic_dec_return(&tags->active_queues);
spin_lock_irq(&tags->lock);
users = tags->active_queues - 1;
WRITE_ONCE(tags->active_queues, users);
blk_mq_update_wake_batch(tags, users);
spin_unlock_irq(&tags->lock);
blk_mq_tag_wakeup_all(tags, false);
}


@ -45,6 +45,8 @@
static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
static void blk_mq_insert_request(struct request *rq, blk_insert_t flags);
static void blk_mq_request_bypass_insert(struct request *rq,
blk_insert_t flags);
static void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
struct list_head *list);
@ -354,12 +356,12 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
data->rq_flags |= RQF_IO_STAT;
rq->rq_flags = data->rq_flags;
if (!(data->rq_flags & RQF_ELV)) {
rq->tag = tag;
rq->internal_tag = BLK_MQ_NO_TAG;
} else {
if (data->rq_flags & RQF_SCHED_TAGS) {
rq->tag = BLK_MQ_NO_TAG;
rq->internal_tag = tag;
} else {
rq->tag = tag;
rq->internal_tag = BLK_MQ_NO_TAG;
}
rq->timeout = 0;
@ -386,17 +388,14 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
WRITE_ONCE(rq->deadline, 0);
req_ref_set(rq, 1);
if (rq->rq_flags & RQF_ELV) {
if (rq->rq_flags & RQF_USE_SCHED) {
struct elevator_queue *e = data->q->elevator;
INIT_HLIST_NODE(&rq->hash);
RB_CLEAR_NODE(&rq->rb_node);
if (!op_is_flush(data->cmd_flags) &&
e->type->ops.prepare_request) {
if (e->type->ops.prepare_request)
e->type->ops.prepare_request(rq);
rq->rq_flags |= RQF_ELVPRIV;
}
}
return rq;
@ -449,26 +448,32 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
data->flags |= BLK_MQ_REQ_NOWAIT;
if (q->elevator) {
struct elevator_queue *e = q->elevator;
data->rq_flags |= RQF_ELV;
/*
* All requests use scheduler tags when an I/O scheduler is
* enabled for the queue.
*/
data->rq_flags |= RQF_SCHED_TAGS;
/*
* Flush/passthrough requests are special and go directly to the
* dispatch list. Don't include reserved tags in the
* limiting, as it isn't useful.
* dispatch list.
*/
if (!op_is_flush(data->cmd_flags) &&
!blk_op_is_passthrough(data->cmd_flags) &&
e->type->ops.limit_depth &&
!(data->flags & BLK_MQ_REQ_RESERVED))
e->type->ops.limit_depth(data->cmd_flags, data);
if ((data->cmd_flags & REQ_OP_MASK) != REQ_OP_FLUSH &&
!blk_op_is_passthrough(data->cmd_flags)) {
struct elevator_mq_ops *ops = &q->elevator->type->ops;
WARN_ON_ONCE(data->flags & BLK_MQ_REQ_RESERVED);
data->rq_flags |= RQF_USE_SCHED;
if (ops->limit_depth)
ops->limit_depth(data->cmd_flags, data);
}
}
retry:
data->ctx = blk_mq_get_ctx(q);
data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
if (!(data->rq_flags & RQF_ELV))
if (!(data->rq_flags & RQF_SCHED_TAGS))
blk_mq_tag_busy(data->hctx);
if (data->flags & BLK_MQ_REQ_RESERVED)
@ -648,10 +653,10 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
goto out_queue_exit;
data.ctx = __blk_mq_get_ctx(q, cpu);
if (!q->elevator)
blk_mq_tag_busy(data.hctx);
if (q->elevator)
data.rq_flags |= RQF_SCHED_TAGS;
else
data.rq_flags |= RQF_ELV;
blk_mq_tag_busy(data.hctx);
if (flags & BLK_MQ_REQ_RESERVED)
data.rq_flags |= RQF_RESV;
@ -699,7 +704,7 @@ void blk_mq_free_request(struct request *rq)
{
struct request_queue *q = rq->q;
if ((rq->rq_flags & RQF_ELVPRIV) &&
if ((rq->rq_flags & RQF_USE_SCHED) &&
q->elevator->type->ops.finish_request)
q->elevator->type->ops.finish_request(rq);
@ -957,6 +962,8 @@ EXPORT_SYMBOL_GPL(blk_update_request);
static inline void blk_account_io_done(struct request *req, u64 now)
{
trace_block_io_done(req);
/*
* Account IO completion. flush_rq isn't accounted as a
* normal IO on queueing nor completion. Accounting the
@ -976,6 +983,8 @@ static inline void blk_account_io_done(struct request *req, u64 now)
static inline void blk_account_io_start(struct request *req)
{
trace_block_io_start(req);
if (blk_do_io_stat(req)) {
/*
* All non-passthrough requests are created from a bio with one
@ -1176,8 +1185,9 @@ bool blk_mq_complete_request_remote(struct request *rq)
* or a polled request, always complete locally,
* it's pointless to redirect the completion.
*/
if (rq->mq_hctx->nr_ctx == 1 ||
rq->cmd_flags & REQ_POLLED)
if ((rq->mq_hctx->nr_ctx == 1 &&
rq->mq_ctx->cpu == raw_smp_processor_id()) ||
rq->cmd_flags & REQ_POLLED)
return false;
if (blk_mq_complete_need_ipi(rq)) {
@ -1270,7 +1280,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
if (!plug->multiple_queues && last && last->q != rq->q)
plug->multiple_queues = true;
if (!plug->has_elevator && (rq->rq_flags & RQF_ELV))
if (!plug->has_elevator && (rq->rq_flags & RQF_USE_SCHED))
plug->has_elevator = true;
rq->rq_next = NULL;
rq_list_add(&plug->mq_list, rq);
@ -1411,13 +1421,16 @@ static void __blk_mq_requeue_request(struct request *rq)
void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list)
{
struct request_queue *q = rq->q;
unsigned long flags;
__blk_mq_requeue_request(rq);
/* this request will be re-inserted to io scheduler queue */
blk_mq_sched_requeue_request(rq);
blk_mq_add_to_requeue_list(rq, BLK_MQ_INSERT_AT_HEAD);
spin_lock_irqsave(&q->requeue_lock, flags);
list_add_tail(&rq->queuelist, &q->requeue_list);
spin_unlock_irqrestore(&q->requeue_lock, flags);
if (kick_requeue_list)
blk_mq_kick_requeue_list(q);
@ -1429,13 +1442,16 @@ static void blk_mq_requeue_work(struct work_struct *work)
struct request_queue *q =
container_of(work, struct request_queue, requeue_work.work);
LIST_HEAD(rq_list);
struct request *rq, *next;
LIST_HEAD(flush_list);
struct request *rq;
spin_lock_irq(&q->requeue_lock);
list_splice_init(&q->requeue_list, &rq_list);
list_splice_init(&q->flush_list, &flush_list);
spin_unlock_irq(&q->requeue_lock);
list_for_each_entry_safe(rq, next, &rq_list, queuelist) {
while (!list_empty(&rq_list)) {
rq = list_entry(rq_list.next, struct request, queuelist);
/*
* If RQF_DONTPREP is set, the request has been started by the
* driver already and might have driver-specific data allocated
@ -1443,18 +1459,16 @@ static void blk_mq_requeue_work(struct work_struct *work)
* block layer merges for the request.
*/
if (rq->rq_flags & RQF_DONTPREP) {
rq->rq_flags &= ~RQF_SOFTBARRIER;
list_del_init(&rq->queuelist);
blk_mq_request_bypass_insert(rq, 0);
} else if (rq->rq_flags & RQF_SOFTBARRIER) {
rq->rq_flags &= ~RQF_SOFTBARRIER;
} else {
list_del_init(&rq->queuelist);
blk_mq_insert_request(rq, BLK_MQ_INSERT_AT_HEAD);
}
}
while (!list_empty(&rq_list)) {
rq = list_entry(rq_list.next, struct request, queuelist);
while (!list_empty(&flush_list)) {
rq = list_entry(flush_list.next, struct request, queuelist);
list_del_init(&rq->queuelist);
blk_mq_insert_request(rq, 0);
}
@ -1462,27 +1476,6 @@ static void blk_mq_requeue_work(struct work_struct *work)
blk_mq_run_hw_queues(q, false);
}
void blk_mq_add_to_requeue_list(struct request *rq, blk_insert_t insert_flags)
{
struct request_queue *q = rq->q;
unsigned long flags;
/*
* We abuse this flag that is otherwise used by the I/O scheduler to
* request head insertion from the workqueue.
*/
BUG_ON(rq->rq_flags & RQF_SOFTBARRIER);
spin_lock_irqsave(&q->requeue_lock, flags);
if (insert_flags & BLK_MQ_INSERT_AT_HEAD) {
rq->rq_flags |= RQF_SOFTBARRIER;
list_add(&rq->queuelist, &q->requeue_list);
} else {
list_add_tail(&rq->queuelist, &q->requeue_list);
}
spin_unlock_irqrestore(&q->requeue_lock, flags);
}
void blk_mq_kick_requeue_list(struct request_queue *q)
{
kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, 0);
@ -2427,7 +2420,7 @@ static void blk_mq_run_work_fn(struct work_struct *work)
* Should only be used carefully, when the caller knows we want to
* bypass a potential IO scheduler on the target device.
*/
void blk_mq_request_bypass_insert(struct request *rq, blk_insert_t flags)
static void blk_mq_request_bypass_insert(struct request *rq, blk_insert_t flags)
{
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;
@ -2492,7 +2485,7 @@ static void blk_mq_insert_request(struct request *rq, blk_insert_t flags)
* dispatch it given we prioritize requests in hctx->dispatch.
*/
blk_mq_request_bypass_insert(rq, flags);
} else if (rq->rq_flags & RQF_FLUSH_SEQ) {
} else if (req_op(rq) == REQ_OP_FLUSH) {
/*
* Firstly normal IO request is inserted to scheduler queue or
* sw queue, meantime we add flush request to dispatch queue(
@ -2622,7 +2615,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
return;
}
if ((rq->rq_flags & RQF_ELV) || !blk_mq_get_budget_and_tag(rq)) {
if ((rq->rq_flags & RQF_USE_SCHED) || !blk_mq_get_budget_and_tag(rq)) {
blk_mq_insert_request(rq, 0);
blk_mq_run_hw_queue(hctx, false);
return;
@ -2711,6 +2704,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
struct request *requeue_list = NULL;
struct request **requeue_lastp = &requeue_list;
unsigned int depth = 0;
bool is_passthrough = false;
LIST_HEAD(list);
do {
@ -2719,7 +2713,9 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
if (!this_hctx) {
this_hctx = rq->mq_hctx;
this_ctx = rq->mq_ctx;
} else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) {
is_passthrough = blk_rq_is_passthrough(rq);
} else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx ||
is_passthrough != blk_rq_is_passthrough(rq)) {
rq_list_add_tail(&requeue_lastp, rq);
continue;
}
@ -2731,7 +2727,13 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
trace_block_unplug(this_hctx->queue, depth, !from_sched);
percpu_ref_get(&this_hctx->queue->q_usage_counter);
if (this_hctx->queue->elevator) {
/* passthrough requests should never be issued to the I/O scheduler */
if (is_passthrough) {
spin_lock(&this_hctx->lock);
list_splice_tail_init(&list, &this_hctx->dispatch);
spin_unlock(&this_hctx->lock);
blk_mq_run_hw_queue(this_hctx, from_sched);
} else if (this_hctx->queue->elevator) {
this_hctx->queue->elevator->type->ops.insert_requests(this_hctx,
&list, 0);
blk_mq_run_hw_queue(this_hctx, from_sched);
@ -2970,10 +2972,8 @@ void blk_mq_submit_bio(struct bio *bio)
return;
}
if (op_is_flush(bio->bi_opf)) {
blk_insert_flush(rq);
if (op_is_flush(bio->bi_opf) && blk_insert_flush(rq))
return;
}
if (plug) {
blk_add_rq_to_plug(plug, rq);
@ -2981,7 +2981,7 @@ void blk_mq_submit_bio(struct bio *bio)
}
hctx = rq->mq_hctx;
if ((rq->rq_flags & RQF_ELV) ||
if ((rq->rq_flags & RQF_USE_SCHED) ||
(hctx->dispatch_busy && (q->nr_hw_queues == 1 || !is_sync))) {
blk_mq_insert_request(rq, 0);
blk_mq_run_hw_queue(hctx, true);
@ -4232,6 +4232,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
blk_mq_update_poll_flag(q);
INIT_DELAYED_WORK(&q->requeue_work, blk_mq_requeue_work);
INIT_LIST_HEAD(&q->flush_list);
INIT_LIST_HEAD(&q->requeue_list);
spin_lock_init(&q->requeue_lock);
@ -4608,9 +4609,6 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
{
struct blk_mq_qe_pair *qe;
if (!q->elevator)
return true;
qe = kmalloc(sizeof(*qe), GFP_NOIO | __GFP_NOWARN | __GFP_NORETRY);
if (!qe)
return false;
@ -4618,6 +4616,12 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
/* q->elevator needs protection from ->sysfs_lock */
mutex_lock(&q->sysfs_lock);
/* the check has to be done with holding sysfs_lock */
if (!q->elevator) {
kfree(qe);
goto unlock;
}
INIT_LIST_HEAD(&qe->node);
qe->q = q;
qe->type = q->elevator->type;
@ -4625,6 +4629,7 @@ static bool blk_mq_elv_switch_none(struct list_head *head,
__elevator_get(qe->type);
list_add(&qe->node, head);
elevator_disable(q);
unlock:
mutex_unlock(&q->sysfs_lock);
return true;


@ -47,7 +47,6 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
void blk_mq_wake_waiters(struct request_queue *q);
bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *,
unsigned int);
void blk_mq_add_to_requeue_list(struct request *rq, blk_insert_t insert_flags);
void blk_mq_flush_busy_ctxs(struct blk_mq_hw_ctx *hctx, struct list_head *list);
struct request *blk_mq_dequeue_from_ctx(struct blk_mq_hw_ctx *hctx,
struct blk_mq_ctx *start);
@ -64,10 +63,6 @@ struct blk_mq_tags *blk_mq_alloc_map_and_rqs(struct blk_mq_tag_set *set,
void blk_mq_free_map_and_rqs(struct blk_mq_tag_set *set,
struct blk_mq_tags *tags,
unsigned int hctx_idx);
/*
* Internal helpers for request insertion into sw queues
*/
void blk_mq_request_bypass_insert(struct request *rq, blk_insert_t flags);
/*
* CPU -> queue mappings
@ -226,9 +221,9 @@ static inline bool blk_mq_is_shared_tags(unsigned int flags)
static inline struct blk_mq_tags *blk_mq_tags_from_data(struct blk_mq_alloc_data *data)
{
if (!(data->rq_flags & RQF_ELV))
return data->hctx->tags;
return data->hctx->sched_tags;
if (data->rq_flags & RQF_SCHED_TAGS)
return data->hctx->sched_tags;
return data->hctx->tags;
}
static inline bool blk_mq_hctx_stopped(struct blk_mq_hw_ctx *hctx)
@ -417,8 +412,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
return true;
}
users = atomic_read(&hctx->tags->active_queues);
users = READ_ONCE(hctx->tags->active_queues);
if (!users)
return true;


@ -288,11 +288,13 @@ void rq_qos_wait(struct rq_wait *rqw, void *private_data,
void rq_qos_exit(struct request_queue *q)
{
mutex_lock(&q->rq_qos_mutex);
while (q->rq_qos) {
struct rq_qos *rqos = q->rq_qos;
q->rq_qos = rqos->next;
rqos->ops->exit(rqos);
}
mutex_unlock(&q->rq_qos_mutex);
}
int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
@ -300,6 +302,8 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
{
struct request_queue *q = disk->queue;
lockdep_assert_held(&q->rq_qos_mutex);
rqos->disk = disk;
rqos->id = id;
rqos->ops = ops;
@ -307,18 +311,13 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
/*
* No IO can be in-flight when adding rqos, so freeze queue, which
* is fine since we only support rq_qos for blk-mq queue.
*
* Reuse ->queue_lock for protecting against other concurrent
* rq_qos adding/deleting
*/
blk_mq_freeze_queue(q);
spin_lock_irq(&q->queue_lock);
if (rq_qos_id(q, rqos->id))
goto ebusy;
rqos->next = q->rq_qos;
q->rq_qos = rqos;
spin_unlock_irq(&q->queue_lock);
blk_mq_unfreeze_queue(q);
@ -330,7 +329,6 @@ int rq_qos_add(struct rq_qos *rqos, struct gendisk *disk, enum rq_qos_id id,
return 0;
ebusy:
spin_unlock_irq(&q->queue_lock);
blk_mq_unfreeze_queue(q);
return -EBUSY;
}
@ -340,21 +338,15 @@ void rq_qos_del(struct rq_qos *rqos)
struct request_queue *q = rqos->disk->queue;
struct rq_qos **cur;
/*
* See comment in rq_qos_add() about freezing queue & using
* ->queue_lock.
*/
blk_mq_freeze_queue(q);
lockdep_assert_held(&q->rq_qos_mutex);
spin_lock_irq(&q->queue_lock);
blk_mq_freeze_queue(q);
for (cur = &q->rq_qos; *cur; cur = &(*cur)->next) {
if (*cur == rqos) {
*cur = rqos->next;
break;
}
}
spin_unlock_irq(&q->queue_lock);
blk_mq_unfreeze_queue(q);
mutex_lock(&q->debugfs_mutex);


@ -944,7 +944,9 @@ int wbt_init(struct gendisk *disk)
/*
* Assign rwb and add the stats callback.
*/
mutex_lock(&q->rq_qos_mutex);
ret = rq_qos_add(&rwb->rqos, disk, RQ_QOS_WBT, &wbt_rqos_ops);
mutex_unlock(&q->rq_qos_mutex);
if (ret)
goto err_free;


@ -57,16 +57,10 @@ EXPORT_SYMBOL_GPL(blk_zone_cond_str);
*/
bool blk_req_needs_zone_write_lock(struct request *rq)
{
if (blk_rq_is_passthrough(rq))
return false;
if (!rq->q->disk->seq_zones_wlock)
return false;
if (bdev_op_is_zoned_write(rq->q->disk->part0, req_op(rq)))
return blk_rq_zone_is_seq(rq);
return false;
return blk_rq_is_seq_zoned_write(rq);
}
EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
@ -329,8 +323,8 @@ static int blkdev_copy_zone_to_user(struct blk_zone *zone, unsigned int idx,
* BLKREPORTZONE ioctl processing.
* Called from blkdev_ioctl.
*/
int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
unsigned int cmd, unsigned long arg)
int blkdev_report_zones_ioctl(struct block_device *bdev, unsigned int cmd,
unsigned long arg)
{
void __user *argp = (void __user *)arg;
struct zone_report_args args;
@ -362,8 +356,8 @@ int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
return 0;
}
static int blkdev_truncate_zone_range(struct block_device *bdev, fmode_t mode,
const struct blk_zone_range *zrange)
static int blkdev_truncate_zone_range(struct block_device *bdev,
blk_mode_t mode, const struct blk_zone_range *zrange)
{
loff_t start, end;
@ -382,7 +376,7 @@ static int blkdev_truncate_zone_range(struct block_device *bdev, fmode_t mode,
* BLKRESETZONE, BLKOPENZONE, BLKCLOSEZONE and BLKFINISHZONE ioctl processing.
* Called from blkdev_ioctl.
*/
int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
void __user *argp = (void __user *)arg;
@ -396,7 +390,7 @@ int blkdev_zone_mgmt_ioctl(struct block_device *bdev, fmode_t mode,
if (!bdev_is_zoned(bdev))
return -ENOTTY;
if (!(mode & FMODE_WRITE))
if (!(mode & BLK_OPEN_WRITE))
return -EBADF;
if (copy_from_user(&zrange, argp, sizeof(struct blk_zone_range)))


@ -269,7 +269,7 @@ bool blk_bio_list_merge(struct request_queue *q, struct list_head *list,
*/
#define ELV_ON_HASH(rq) ((rq)->rq_flags & RQF_HASHED)
void blk_insert_flush(struct request *rq);
bool blk_insert_flush(struct request *rq);
int elevator_switch(struct request_queue *q, struct elevator_type *new_e);
void elevator_disable(struct request_queue *q);
@ -394,10 +394,27 @@ static inline struct bio *blk_queue_bounce(struct bio *bio,
#ifdef CONFIG_BLK_DEV_ZONED
void disk_free_zone_bitmaps(struct gendisk *disk);
void disk_clear_zone_settings(struct gendisk *disk);
#else
int blkdev_report_zones_ioctl(struct block_device *bdev, unsigned int cmd,
unsigned long arg);
int blkdev_zone_mgmt_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg);
#else /* CONFIG_BLK_DEV_ZONED */
static inline void disk_free_zone_bitmaps(struct gendisk *disk) {}
static inline void disk_clear_zone_settings(struct gendisk *disk) {}
#endif
static inline int blkdev_report_zones_ioctl(struct block_device *bdev,
unsigned int cmd, unsigned long arg)
{
return -ENOTTY;
}
static inline int blkdev_zone_mgmt_ioctl(struct block_device *bdev,
blk_mode_t mode, unsigned int cmd, unsigned long arg)
{
return -ENOTTY;
}
#endif /* CONFIG_BLK_DEV_ZONED */
struct block_device *bdev_alloc(struct gendisk *disk, u8 partno);
void bdev_add(struct block_device *bdev, dev_t dev);
int blk_alloc_ext_minor(void);
void blk_free_ext_minor(unsigned int minor);
@ -409,7 +426,7 @@ int bdev_add_partition(struct gendisk *disk, int partno, sector_t start,
int bdev_del_partition(struct gendisk *disk, int partno);
int bdev_resize_partition(struct gendisk *disk, int partno, sector_t start,
sector_t length);
void blk_drop_partitions(struct gendisk *disk);
void drop_partition(struct block_device *part);
void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors);
@ -420,9 +437,19 @@ int bio_add_hw_page(struct request_queue *q, struct bio *bio,
struct page *page, unsigned int len, unsigned int offset,
unsigned int max_sectors, bool *same_page);
/*
* Clean up a page appropriately, where the page may be pinned, may have a
* ref taken on it or neither.
*/
static inline void bio_release_page(struct bio *bio, struct page *page)
{
if (bio_flagged(bio, BIO_PAGE_PINNED))
unpin_user_page(page);
}
struct request_queue *blk_alloc_queue(int node_id);
int disk_scan_partitions(struct gendisk *disk, fmode_t mode);
int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode);
int disk_alloc_events(struct gendisk *disk);
void disk_add_events(struct gendisk *disk);
@ -437,6 +464,9 @@ extern struct device_attribute dev_attr_events_poll_msecs;
extern struct attribute_group blk_trace_attr_group;
blk_mode_t file_to_blk_mode(struct file *file);
int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
loff_t lstart, loff_t lend);
long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);


@ -26,7 +26,7 @@ struct bsg_set {
};
static int bsg_transport_sg_io_fn(struct request_queue *q, struct sg_io_v4 *hdr,
fmode_t mode, unsigned int timeout)
bool open_for_write, unsigned int timeout)
{
struct bsg_job *job;
struct request *rq;


@ -39,7 +39,7 @@ static inline struct bsg_device *to_bsg_device(struct inode *inode)
#define BSG_MAX_DEVS 32768
static DEFINE_IDA(bsg_minor_ida);
static struct class *bsg_class;
static const struct class bsg_class;
static int bsg_major;
static unsigned int bsg_timeout(struct bsg_device *bd, struct sg_io_v4 *hdr)
@ -54,7 +54,8 @@ static unsigned int bsg_timeout(struct bsg_device *bd, struct sg_io_v4 *hdr)
return max_t(unsigned int, timeout, BLK_MIN_SG_TIMEOUT);
}
static int bsg_sg_io(struct bsg_device *bd, fmode_t mode, void __user *uarg)
static int bsg_sg_io(struct bsg_device *bd, bool open_for_write,
void __user *uarg)
{
struct sg_io_v4 hdr;
int ret;
@ -63,7 +64,8 @@ static int bsg_sg_io(struct bsg_device *bd, fmode_t mode, void __user *uarg)
return -EFAULT;
if (hdr.guard != 'Q')
return -EINVAL;
ret = bd->sg_io_fn(bd->queue, &hdr, mode, bsg_timeout(bd, &hdr));
ret = bd->sg_io_fn(bd->queue, &hdr, open_for_write,
bsg_timeout(bd, &hdr));
if (!ret && copy_to_user(uarg, &hdr, sizeof(hdr)))
return -EFAULT;
return ret;
@ -146,7 +148,7 @@ static long bsg_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case SG_EMULATED_HOST:
return put_user(1, intp);
case SG_IO:
return bsg_sg_io(bd, file->f_mode, uarg);
return bsg_sg_io(bd, file->f_mode & FMODE_WRITE, uarg);
case SCSI_IOCTL_SEND_COMMAND:
pr_warn_ratelimited("%s: calling unsupported SCSI_IOCTL_SEND_COMMAND\n",
current->comm);
@ -206,7 +208,7 @@ struct bsg_device *bsg_register_queue(struct request_queue *q,
return ERR_PTR(ret);
}
bd->device.devt = MKDEV(bsg_major, ret);
bd->device.class = bsg_class;
bd->device.class = &bsg_class;
bd->device.parent = parent;
bd->device.release = bsg_device_release;
dev_set_name(&bd->device, "%s", name);
@ -240,15 +242,19 @@ static char *bsg_devnode(const struct device *dev, umode_t *mode)
return kasprintf(GFP_KERNEL, "bsg/%s", dev_name(dev));
}
static const struct class bsg_class = {
.name = "bsg",
.devnode = bsg_devnode,
};
static int __init bsg_init(void)
{
dev_t devid;
int ret;
bsg_class = class_create("bsg");
if (IS_ERR(bsg_class))
return PTR_ERR(bsg_class);
bsg_class->devnode = bsg_devnode;
ret = class_register(&bsg_class);
if (ret)
return ret;
ret = alloc_chrdev_region(&devid, 0, BSG_MAX_DEVS, "bsg");
if (ret)
@ -260,7 +266,7 @@ static int __init bsg_init(void)
return 0;
destroy_bsg_class:
class_destroy(bsg_class);
class_unregister(&bsg_class);
return ret;
}


@ -263,31 +263,31 @@ static unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask)
}
/**
* bdev_check_media_change - check if a removable media has been changed
* @bdev: block device to check
* disk_check_media_change - check if a removable media has been changed
* @disk: gendisk to check
*
* Check whether a removable media has been changed, and attempt to free all
* dentries and inodes, and invalidate all block device page cache entries in
* that case.
*
* Returns %true if the block device changed, or %false if not.
* Returns %true if the media has changed, or %false if not.
*/
bool bdev_check_media_change(struct block_device *bdev)
bool disk_check_media_change(struct gendisk *disk)
{
unsigned int events;
events = disk_clear_events(bdev->bd_disk, DISK_EVENT_MEDIA_CHANGE |
events = disk_clear_events(disk, DISK_EVENT_MEDIA_CHANGE |
DISK_EVENT_EJECT_REQUEST);
if (!(events & DISK_EVENT_MEDIA_CHANGE))
return false;
if (__invalidate_device(bdev, true))
if (__invalidate_device(disk->part0, true))
pr_warn("VFS: busy inodes on changed media %s\n",
bdev->bd_disk->disk_name);
set_bit(GD_NEED_PART_SCAN, &bdev->bd_disk->state);
disk->disk_name);
set_bit(GD_NEED_PART_SCAN, &disk->state);
return true;
}
EXPORT_SYMBOL(bdev_check_media_change);
EXPORT_SYMBOL(disk_check_media_change);
/**
* disk_force_media_change - force a media change event
@ -307,6 +307,7 @@ bool disk_force_media_change(struct gendisk *disk, unsigned int events)
if (!(events & DISK_EVENT_MEDIA_CHANGE))
return false;
inc_diskseq(disk);
if (__invalidate_device(disk->part0, true))
pr_warn("VFS: busy inodes on changed media %s\n",
disk->disk_name);

block/early-lookup.c (new file)

@ -0,0 +1,316 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Code for looking up block devices in the early boot code before mounting the
* root file system.
*/
#include <linux/blkdev.h>
#include <linux/ctype.h>
struct uuidcmp {
const char *uuid;
int len;
};
/**
* match_dev_by_uuid - callback for finding a partition using its uuid
* @dev: device passed in by the caller
* @data: opaque pointer to the desired struct uuidcmp to match
*
* Returns 1 if the device matches, and 0 otherwise.
*/
static int __init match_dev_by_uuid(struct device *dev, const void *data)
{
struct block_device *bdev = dev_to_bdev(dev);
const struct uuidcmp *cmp = data;
if (!bdev->bd_meta_info ||
strncasecmp(cmp->uuid, bdev->bd_meta_info->uuid, cmp->len))
return 0;
return 1;
}
/**
* devt_from_partuuid - looks up the dev_t of a partition by its UUID
* @uuid_str: char array containing ascii UUID
* @devt: dev_t result
*
* The function will return the first partition which contains a matching
* UUID value in its partition_meta_info struct. This does not search
* by filesystem UUIDs.
*
* If @uuid_str is followed by a "/PARTNROFF=%d", then the number will be
* extracted and used as an offset from the partition identified by the UUID.
*
* Returns 0 on success or a negative error code on failure.
*/
static int __init devt_from_partuuid(const char *uuid_str, dev_t *devt)
{
struct uuidcmp cmp;
struct device *dev = NULL;
int offset = 0;
char *slash;
cmp.uuid = uuid_str;
slash = strchr(uuid_str, '/');
/* Check for optional partition number offset attributes. */
if (slash) {
char c = 0;
/* Explicitly fail on poor PARTUUID syntax. */
if (sscanf(slash + 1, "PARTNROFF=%d%c", &offset, &c) != 1)
goto out_invalid;
cmp.len = slash - uuid_str;
} else {
cmp.len = strlen(uuid_str);
}
if (!cmp.len)
goto out_invalid;
dev = class_find_device(&block_class, NULL, &cmp, &match_dev_by_uuid);
if (!dev)
return -ENODEV;
if (offset) {
/*
* Attempt to find the requested partition by adding an offset
* to the partition number found by UUID.
*/
*devt = part_devt(dev_to_disk(dev),
dev_to_bdev(dev)->bd_partno + offset);
} else {
*devt = dev->devt;
}
put_device(dev);
return 0;
out_invalid:
pr_err("VFS: PARTUUID= is invalid.\n"
"Expected PARTUUID=<valid-uuid-id>[/PARTNROFF=%%d]\n");
return -EINVAL;
}
/**
* match_dev_by_label - callback for finding a partition using its label
* @dev: device passed in by the caller
* @data: opaque pointer to the label to match
*
* Returns 1 if the device matches, and 0 otherwise.
*/
static int __init match_dev_by_label(struct device *dev, const void *data)
{
struct block_device *bdev = dev_to_bdev(dev);
const char *label = data;
if (!bdev->bd_meta_info || strcmp(label, bdev->bd_meta_info->volname))
return 0;
return 1;
}
static int __init devt_from_partlabel(const char *label, dev_t *devt)
{
struct device *dev;
dev = class_find_device(&block_class, NULL, label, &match_dev_by_label);
if (!dev)
return -ENODEV;
*devt = dev->devt;
put_device(dev);
return 0;
}
static dev_t __init blk_lookup_devt(const char *name, int partno)
{
dev_t devt = MKDEV(0, 0);
struct class_dev_iter iter;
struct device *dev;
class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
while ((dev = class_dev_iter_next(&iter))) {
struct gendisk *disk = dev_to_disk(dev);
if (strcmp(dev_name(dev), name))
continue;
if (partno < disk->minors) {
/* We need to return the right devno, even
* if the partition doesn't exist yet.
*/
devt = MKDEV(MAJOR(dev->devt),
MINOR(dev->devt) + partno);
} else {
devt = part_devt(disk, partno);
if (devt)
break;
}
}
class_dev_iter_exit(&iter);
return devt;
}
static int __init devt_from_devname(const char *name, dev_t *devt)
{
int part;
char s[32];
char *p;
if (strlen(name) > 31)
return -EINVAL;
strcpy(s, name);
for (p = s; *p; p++) {
if (*p == '/')
*p = '!';
}
*devt = blk_lookup_devt(s, 0);
if (*devt)
return 0;
/*
* Try a non-existent but valid partition, which may only exist after
* opening the device, like partitioned md devices.
*/
while (p > s && isdigit(p[-1]))
p--;
if (p == s || !*p || *p == '0')
return -ENODEV;
/* try disk name without <part number> */
part = simple_strtoul(p, NULL, 10);
*p = '\0';
*devt = blk_lookup_devt(s, part);
if (*devt)
return 0;
/* try disk name without p<part number> */
if (p < s + 2 || !isdigit(p[-2]) || p[-1] != 'p')
return -ENODEV;
p[-1] = '\0';
*devt = blk_lookup_devt(s, part);
if (*devt)
return 0;
return -ENODEV;
}
static int __init devt_from_devnum(const char *name, dev_t *devt)
{
unsigned maj, min, offset;
char *p, dummy;
if (sscanf(name, "%u:%u%c", &maj, &min, &dummy) == 2 ||
sscanf(name, "%u:%u:%u:%c", &maj, &min, &offset, &dummy) == 3) {
*devt = MKDEV(maj, min);
if (maj != MAJOR(*devt) || min != MINOR(*devt))
return -EINVAL;
} else {
*devt = new_decode_dev(simple_strtoul(name, &p, 16));
if (*p)
return -EINVAL;
}
return 0;
}
/*
* Convert a name into device number. We accept the following variants:
*
* 1) <hex_major><hex_minor> device number in hexadecimal represents itself
* no leading 0x, for example b302.
* 3) /dev/<disk_name> represents the device number of disk
* 4) /dev/<disk_name><decimal> represents the device number
* of partition - device number of disk plus the partition number
* 5) /dev/<disk_name>p<decimal> - same as the above, that form is
* used when disk name of partitioned disk ends on a digit.
* 6) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
* unique id of a partition if the partition table provides it.
* The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
* partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
* filled hex representation of the 32-bit "NT disk signature", and PP
* is a zero-filled hex representation of the 1-based partition number.
* 7) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to
* a partition with a known unique id.
* 8) <major>:<minor> major and minor number of the device separated by
* a colon.
* 9) PARTLABEL=<name> with name being the GPT partition label.
* MSDOS partitions do not support labels!
*
* If the name doesn't fall into the categories above, we return (0,0).
* block_class is used to check if something is a disk name. If the disk
* name contains slashes, the device name has them replaced with
* bangs.
*/
int __init early_lookup_bdev(const char *name, dev_t *devt)
{
if (strncmp(name, "PARTUUID=", 9) == 0)
return devt_from_partuuid(name + 9, devt);
if (strncmp(name, "PARTLABEL=", 10) == 0)
return devt_from_partlabel(name + 10, devt);
if (strncmp(name, "/dev/", 5) == 0)
return devt_from_devname(name + 5, devt);
return devt_from_devnum(name, devt);
}
static char __init *bdevt_str(dev_t devt, char *buf)
{
if (MAJOR(devt) <= 0xff && MINOR(devt) <= 0xff) {
char tbuf[BDEVT_SIZE];
snprintf(tbuf, BDEVT_SIZE, "%02x%02x", MAJOR(devt), MINOR(devt));
snprintf(buf, BDEVT_SIZE, "%-9s", tbuf);
} else
snprintf(buf, BDEVT_SIZE, "%03x:%05x", MAJOR(devt), MINOR(devt));
return buf;
}
/*
* print a full list of all partitions - intended for places where the root
* filesystem can't be mounted and thus to give the victim some idea of what
* went wrong
*/
void __init printk_all_partitions(void)
{
struct class_dev_iter iter;
struct device *dev;
class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
while ((dev = class_dev_iter_next(&iter))) {
struct gendisk *disk = dev_to_disk(dev);
struct block_device *part;
char devt_buf[BDEVT_SIZE];
unsigned long idx;
/*
* Don't show empty devices or things that have been
* suppressed
*/
if (get_capacity(disk) == 0 || (disk->flags & GENHD_FL_HIDDEN))
continue;
/*
* Note, unlike /proc/partitions, I am showing the numbers in
* hex - the same format as the root= option takes.
*/
rcu_read_lock();
xa_for_each(&disk->part_tbl, idx, part) {
if (!bdev_nr_sectors(part))
continue;
printk("%s%s %10llu %pg %s",
bdev_is_partition(part) ? " " : "",
bdevt_str(part->bd_dev, devt_buf),
bdev_nr_sectors(part) >> 1, part,
part->bd_meta_info ?
part->bd_meta_info->uuid : "");
if (bdev_is_partition(part))
printk("\n");
else if (dev->parent && dev->parent->driver)
printk(" driver: %s\n",
dev->parent->driver->name);
else
printk(" (driver?)\n");
}
rcu_read_unlock();
}
class_dev_iter_exit(&iter);
}

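The comment above enumerates the root= name formats that early_lookup_bdev() accepts. As a rough illustration of the two numeric forms, here is a standalone userspace sketch; strtoul() and sscanf() stand in for the kernel helpers and the example strings are arbitrary, so treat it as a sketch rather than kernel code.

#include <stdio.h>
#include <stdlib.h>

/* Userspace mirror of the kernel's new_decode_dev()/MKDEV() packing. */
static unsigned int decode_hex_devt(unsigned long x)
{
	unsigned int major = (x & 0xfff00) >> 8;
	unsigned int minor = (x & 0xff) | ((x >> 12) & 0xfff00);

	return (major << 20) | minor;
}

int main(void)
{
	unsigned int maj, min;
	char dummy;

	/* Form 1: bare hex, e.g. "b302" -> major 0xb3, minor 0x02 */
	printf("b302 -> %#x\n", decode_hex_devt(strtoul("b302", NULL, 16)));

	/* Form 8: explicit "<major>:<minor>", e.g. "8:1" */
	if (sscanf("8:1", "%u:%u%c", &maj, &min, &dummy) == 2)
		printf("8:1  -> major %u, minor %u\n", maj, min);
	return 0;
}

Running it prints the packed device number for "b302" and the split major/minor pair for "8:1".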

@ -751,7 +751,7 @@ ssize_t elv_iosched_store(struct request_queue *q, const char *buf,
if (!elv_support_iosched(q))
return count;
strlcpy(elevator_name, buf, sizeof(elevator_name));
strscpy(elevator_name, buf, sizeof(elevator_name));
ret = elevator_change(q, strstrip(elevator_name));
if (!ret)
return count;


@ -54,7 +54,7 @@ static bool blkdev_dio_unaligned(struct block_device *bdev, loff_t pos,
static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
struct iov_iter *iter, unsigned int nr_pages)
{
struct block_device *bdev = iocb->ki_filp->private_data;
struct block_device *bdev = I_BDEV(iocb->ki_filp->f_mapping->host);
struct bio_vec inline_vecs[DIO_INLINE_BIO_VECS], *vecs;
loff_t pos = iocb->ki_pos;
bool should_dirty = false;
@ -170,7 +170,7 @@ static void blkdev_bio_end_io(struct bio *bio)
static ssize_t __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
unsigned int nr_pages)
{
struct block_device *bdev = iocb->ki_filp->private_data;
struct block_device *bdev = I_BDEV(iocb->ki_filp->f_mapping->host);
struct blk_plug plug;
struct blkdev_dio *dio;
struct bio *bio;
@ -310,7 +310,7 @@ static ssize_t __blkdev_direct_IO_async(struct kiocb *iocb,
struct iov_iter *iter,
unsigned int nr_pages)
{
struct block_device *bdev = iocb->ki_filp->private_data;
struct block_device *bdev = I_BDEV(iocb->ki_filp->f_mapping->host);
bool is_read = iov_iter_rw(iter) == READ;
blk_opf_t opf = is_read ? REQ_OP_READ : dio_bio_write_op(iocb);
struct blkdev_dio *dio;
@ -451,7 +451,7 @@ static loff_t blkdev_llseek(struct file *file, loff_t offset, int whence)
static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
int datasync)
{
struct block_device *bdev = filp->private_data;
struct block_device *bdev = I_BDEV(filp->f_mapping->host);
int error;
error = file_write_and_wait_range(filp, start, end);
@ -470,6 +470,30 @@ static int blkdev_fsync(struct file *filp, loff_t start, loff_t end,
return error;
}
blk_mode_t file_to_blk_mode(struct file *file)
{
blk_mode_t mode = 0;
if (file->f_mode & FMODE_READ)
mode |= BLK_OPEN_READ;
if (file->f_mode & FMODE_WRITE)
mode |= BLK_OPEN_WRITE;
if (file->private_data)
mode |= BLK_OPEN_EXCL;
if (file->f_flags & O_NDELAY)
mode |= BLK_OPEN_NDELAY;
/*
* If all bits in O_ACCMODE are set (aka O_RDWR | O_WRONLY), the floppy
* driver has historically allowed ioctls as if the file was opened for
* writing, but does not allow any actual reads or writes.
*/
if ((file->f_flags & O_ACCMODE) == (O_RDWR | O_WRONLY))
mode |= BLK_OPEN_WRITE_IOCTL;
return mode;
}
static int blkdev_open(struct inode *inode, struct file *filp)
{
struct block_device *bdev;
@ -483,31 +507,29 @@ static int blkdev_open(struct inode *inode, struct file *filp)
filp->f_flags |= O_LARGEFILE;
filp->f_mode |= FMODE_BUF_RASYNC;
if (filp->f_flags & O_NDELAY)
filp->f_mode |= FMODE_NDELAY;
/*
* Use the file private data to store the holder for exclusive opens.
* file_to_blk_mode relies on it being present to set BLK_OPEN_EXCL.
*/
if (filp->f_flags & O_EXCL)
filp->f_mode |= FMODE_EXCL;
if ((filp->f_flags & O_ACCMODE) == 3)
filp->f_mode |= FMODE_WRITE_IOCTL;
filp->private_data = filp;
bdev = blkdev_get_by_dev(inode->i_rdev, filp->f_mode, filp);
bdev = blkdev_get_by_dev(inode->i_rdev, file_to_blk_mode(filp),
filp->private_data, NULL);
if (IS_ERR(bdev))
return PTR_ERR(bdev);
if (bdev_nowait(bdev))
filp->f_mode |= FMODE_NOWAIT;
filp->private_data = bdev;
filp->f_mapping = bdev->bd_inode->i_mapping;
filp->f_wb_err = filemap_sample_wb_err(filp->f_mapping);
return 0;
}
static int blkdev_close(struct inode *inode, struct file *filp)
static int blkdev_release(struct inode *inode, struct file *filp)
{
struct block_device *bdev = filp->private_data;
blkdev_put(bdev, filp->f_mode);
blkdev_put(I_BDEV(filp->f_mapping->host), filp->private_data);
return 0;
}
@ -520,10 +542,9 @@ static int blkdev_close(struct inode *inode, struct file *filp)
*/
static ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
{
struct block_device *bdev = iocb->ki_filp->private_data;
struct block_device *bdev = I_BDEV(iocb->ki_filp->f_mapping->host);
struct inode *bd_inode = bdev->bd_inode;
loff_t size = bdev_nr_bytes(bdev);
struct blk_plug plug;
size_t shorted = 0;
ssize_t ret;
@ -548,18 +569,16 @@ static ssize_t blkdev_write_iter(struct kiocb *iocb, struct iov_iter *from)
iov_iter_truncate(from, size);
}
blk_start_plug(&plug);
ret = __generic_file_write_iter(iocb, from);
if (ret > 0)
ret = generic_write_sync(iocb, ret);
iov_iter_reexpand(from, iov_iter_count(from) + shorted);
blk_finish_plug(&plug);
return ret;
}
static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
struct block_device *bdev = iocb->ki_filp->private_data;
struct block_device *bdev = I_BDEV(iocb->ki_filp->f_mapping->host);
loff_t size = bdev_nr_bytes(bdev);
loff_t pos = iocb->ki_pos;
size_t shorted = 0;
@ -652,7 +671,7 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
filemap_invalidate_lock(inode->i_mapping);
/* Invalidate the page cache, including dirty pages. */
error = truncate_bdev_range(bdev, file->f_mode, start, end);
error = truncate_bdev_range(bdev, file_to_blk_mode(file), start, end);
if (error)
goto fail;
@ -693,7 +712,7 @@ static int blkdev_mmap(struct file *file, struct vm_area_struct *vma)
const struct file_operations def_blk_fops = {
.open = blkdev_open,
.release = blkdev_close,
.release = blkdev_release,
.llseek = blkdev_llseek,
.read_iter = blkdev_read_iter,
.write_iter = blkdev_write_iter,

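file_to_blk_mode() above replaces the old FMODE_* juggling with a direct translation from the file's flags to blk_mode_t. The sketch below mirrors that mapping in userspace; the BLK_OPEN_* defines are illustrative stand-ins for the kernel enum, and the exclusive-open case is passed as an explicit flag because userspace has no filp->private_data holder.

#include <fcntl.h>
#include <stdio.h>

/* Illustrative stand-ins for the kernel's blk_mode_t bits. */
#define BLK_OPEN_READ		(1U << 0)
#define BLK_OPEN_WRITE		(1U << 1)
#define BLK_OPEN_EXCL		(1U << 2)
#define BLK_OPEN_NDELAY		(1U << 3)
#define BLK_OPEN_WRITE_IOCTL	(1U << 4)

static unsigned int open_flags_to_blk_mode(int oflags, int exclusive)
{
	unsigned int mode = 0;
	int acc = oflags & O_ACCMODE;

	if (acc == O_RDONLY || acc == O_RDWR)
		mode |= BLK_OPEN_READ;
	if (acc == O_WRONLY || acc == O_RDWR)
		mode |= BLK_OPEN_WRITE;
	if (exclusive)				/* O_EXCL opens store a holder */
		mode |= BLK_OPEN_EXCL;
	if (oflags & O_NDELAY)
		mode |= BLK_OPEN_NDELAY;
	if (acc == (O_RDWR | O_WRONLY))		/* historic floppy oddity */
		mode |= BLK_OPEN_WRITE_IOCTL;
	return mode;
}

int main(void)
{
	printf("O_RDWR|O_EXCL -> %#x\n",
	       open_flags_to_blk_mode(O_RDWR | O_EXCL, 1));
	return 0;
}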

@ -25,8 +25,9 @@
#include <linux/pm_runtime.h>
#include <linux/badblocks.h>
#include <linux/part_stat.h>
#include "blk-throttle.h"
#include <linux/blktrace_api.h>
#include "blk-throttle.h"
#include "blk.h"
#include "blk-mq-sched.h"
#include "blk-rq-qos.h"
@ -253,7 +254,7 @@ int __register_blkdev(unsigned int major, const char *name,
#ifdef CONFIG_BLOCK_LEGACY_AUTOLOAD
p->probe = probe;
#endif
strlcpy(p->name, name, sizeof(p->name));
strscpy(p->name, name, sizeof(p->name));
p->next = NULL;
index = major_to_index(major);
@ -318,18 +319,6 @@ void blk_free_ext_minor(unsigned int minor)
ida_free(&ext_devt_ida, minor);
}
static char *bdevt_str(dev_t devt, char *buf)
{
if (MAJOR(devt) <= 0xff && MINOR(devt) <= 0xff) {
char tbuf[BDEVT_SIZE];
snprintf(tbuf, BDEVT_SIZE, "%02x%02x", MAJOR(devt), MINOR(devt));
snprintf(buf, BDEVT_SIZE, "%-9s", tbuf);
} else
snprintf(buf, BDEVT_SIZE, "%03x:%05x", MAJOR(devt), MINOR(devt));
return buf;
}
void disk_uevent(struct gendisk *disk, enum kobject_action action)
{
struct block_device *part;
@ -351,7 +340,7 @@ void disk_uevent(struct gendisk *disk, enum kobject_action action)
}
EXPORT_SYMBOL_GPL(disk_uevent);
int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
int disk_scan_partitions(struct gendisk *disk, blk_mode_t mode)
{
struct block_device *bdev;
int ret = 0;
@ -369,18 +358,20 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
* synchronize with other exclusive openers and other partition
* scanners.
*/
if (!(mode & FMODE_EXCL)) {
ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions);
if (!(mode & BLK_OPEN_EXCL)) {
ret = bd_prepare_to_claim(disk->part0, disk_scan_partitions,
NULL);
if (ret)
return ret;
}
set_bit(GD_NEED_PART_SCAN, &disk->state);
bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~FMODE_EXCL, NULL);
bdev = blkdev_get_by_dev(disk_devt(disk), mode & ~BLK_OPEN_EXCL, NULL,
NULL);
if (IS_ERR(bdev))
ret = PTR_ERR(bdev);
else
blkdev_put(bdev, mode & ~FMODE_EXCL);
blkdev_put(bdev, NULL);
/*
* If blkdev_get_by_dev() failed early, GD_NEED_PART_SCAN is still set,
@ -388,7 +379,7 @@ int disk_scan_partitions(struct gendisk *disk, fmode_t mode)
* create partitions for the underlying disk.
*/
clear_bit(GD_NEED_PART_SCAN, &disk->state);
if (!(mode & FMODE_EXCL))
if (!(mode & BLK_OPEN_EXCL))
bd_abort_claiming(disk->part0, disk_scan_partitions);
return ret;
}
@ -516,7 +507,7 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
bdev_add(disk->part0, ddev->devt);
if (get_capacity(disk))
disk_scan_partitions(disk, FMODE_READ);
disk_scan_partitions(disk, BLK_OPEN_READ);
/*
* Announce the disk and partitions after all partitions are
@ -563,6 +554,28 @@ int __must_check device_add_disk(struct device *parent, struct gendisk *disk,
}
EXPORT_SYMBOL(device_add_disk);
static void blk_report_disk_dead(struct gendisk *disk)
{
struct block_device *bdev;
unsigned long idx;
rcu_read_lock();
xa_for_each(&disk->part_tbl, idx, bdev) {
if (!kobject_get_unless_zero(&bdev->bd_device.kobj))
continue;
rcu_read_unlock();
mutex_lock(&bdev->bd_holder_lock);
if (bdev->bd_holder_ops && bdev->bd_holder_ops->mark_dead)
bdev->bd_holder_ops->mark_dead(bdev);
mutex_unlock(&bdev->bd_holder_lock);
put_device(&bdev->bd_device);
rcu_read_lock();
}
rcu_read_unlock();
}
/**
* blk_mark_disk_dead - mark a disk as dead
* @disk: disk to mark as dead
@ -572,13 +585,26 @@ EXPORT_SYMBOL(device_add_disk);
*/
void blk_mark_disk_dead(struct gendisk *disk)
{
set_bit(GD_DEAD, &disk->state);
blk_queue_start_drain(disk->queue);
/*
* Fail any new I/O.
*/
if (test_and_set_bit(GD_DEAD, &disk->state))
return;
if (test_bit(GD_OWNS_QUEUE, &disk->state))
blk_queue_flag_set(QUEUE_FLAG_DYING, disk->queue);
/*
* Stop buffered writers from dirtying pages that can't be written out.
*/
set_capacity_and_notify(disk, 0);
set_capacity(disk, 0);
/*
* Prevent new I/O from crossing bio_queue_enter().
*/
blk_queue_start_drain(disk->queue);
blk_report_disk_dead(disk);
}
EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
@ -604,6 +630,8 @@ EXPORT_SYMBOL_GPL(blk_mark_disk_dead);
void del_gendisk(struct gendisk *disk)
{
struct request_queue *q = disk->queue;
struct block_device *part;
unsigned long idx;
might_sleep();
@ -612,26 +640,27 @@ void del_gendisk(struct gendisk *disk)
disk_del_events(disk);
/*
* Prevent new openers by unlinking the bdev inode, and write out
* dirty data before marking the disk dead and stopping all I/O.
*/
mutex_lock(&disk->open_mutex);
remove_inode_hash(disk->part0->bd_inode);
blk_drop_partitions(disk);
xa_for_each(&disk->part_tbl, idx, part) {
remove_inode_hash(part->bd_inode);
fsync_bdev(part);
__invalidate_device(part, true);
}
mutex_unlock(&disk->open_mutex);
fsync_bdev(disk->part0);
__invalidate_device(disk->part0, true);
blk_mark_disk_dead(disk);
/*
* Fail any new I/O.
* Drop all partitions now that the disk is marked dead.
*/
set_bit(GD_DEAD, &disk->state);
if (test_bit(GD_OWNS_QUEUE, &disk->state))
blk_queue_flag_set(QUEUE_FLAG_DYING, q);
set_capacity(disk, 0);
/*
* Prevent new I/O from crossing bio_queue_enter().
*/
blk_queue_start_drain(q);
mutex_lock(&disk->open_mutex);
xa_for_each_start(&disk->part_tbl, idx, part, 1)
drop_partition(part);
mutex_unlock(&disk->open_mutex);
if (!(disk->flags & GENHD_FL_HIDDEN)) {
sysfs_remove_link(&disk_to_dev(disk)->kobj, "bdi");
@ -755,57 +784,6 @@ void blk_request_module(dev_t devt)
}
#endif /* CONFIG_BLOCK_LEGACY_AUTOLOAD */
/*
* print a full list of all partitions - intended for places where the root
* filesystem can't be mounted and thus to give the victim some idea of what
* went wrong
*/
void __init printk_all_partitions(void)
{
struct class_dev_iter iter;
struct device *dev;
class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
while ((dev = class_dev_iter_next(&iter))) {
struct gendisk *disk = dev_to_disk(dev);
struct block_device *part;
char devt_buf[BDEVT_SIZE];
unsigned long idx;
/*
* Don't show empty devices or things that have been
* suppressed
*/
if (get_capacity(disk) == 0 || (disk->flags & GENHD_FL_HIDDEN))
continue;
/*
* Note, unlike /proc/partitions, I am showing the numbers in
* hex - the same format as the root= option takes.
*/
rcu_read_lock();
xa_for_each(&disk->part_tbl, idx, part) {
if (!bdev_nr_sectors(part))
continue;
printk("%s%s %10llu %pg %s",
bdev_is_partition(part) ? " " : "",
bdevt_str(part->bd_dev, devt_buf),
bdev_nr_sectors(part) >> 1, part,
part->bd_meta_info ?
part->bd_meta_info->uuid : "");
if (bdev_is_partition(part))
printk("\n");
else if (dev->parent && dev->parent->driver)
printk(" driver: %s\n",
dev->parent->driver->name);
else
printk(" (driver?)\n");
}
rcu_read_unlock();
}
class_dev_iter_exit(&iter);
}
#ifdef CONFIG_PROC_FS
/* iterator */
static void *disk_seqf_start(struct seq_file *seqf, loff_t *pos)
@ -1171,6 +1149,8 @@ static void disk_release(struct device *dev)
might_sleep();
WARN_ON_ONCE(disk_live(disk));
blk_trace_remove(disk->queue);
/*
* To undo the all initialization from blk_mq_init_allocated_queue in
* case of a probe failure where add_disk is never called we have to
@ -1339,35 +1319,6 @@ dev_t part_devt(struct gendisk *disk, u8 partno)
return devt;
}
dev_t blk_lookup_devt(const char *name, int partno)
{
dev_t devt = MKDEV(0, 0);
struct class_dev_iter iter;
struct device *dev;
class_dev_iter_init(&iter, &block_class, NULL, &disk_type);
while ((dev = class_dev_iter_next(&iter))) {
struct gendisk *disk = dev_to_disk(dev);
if (strcmp(dev_name(dev), name))
continue;
if (partno < disk->minors) {
/* We need to return the right devno, even
* if the partition doesn't exist yet.
*/
devt = MKDEV(MAJOR(dev->devt),
MINOR(dev->devt) + partno);
} else {
devt = part_devt(disk, partno);
if (devt)
break;
}
}
class_dev_iter_exit(&iter);
return devt;
}
struct gendisk *__alloc_disk_node(struct request_queue *q, int node_id,
struct lock_class_key *lkclass)
{


@ -82,7 +82,7 @@ static int compat_blkpg_ioctl(struct block_device *bdev,
}
#endif
static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
static int blk_ioctl_discard(struct block_device *bdev, blk_mode_t mode,
unsigned long arg)
{
uint64_t range[2];
@ -90,7 +90,7 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
struct inode *inode = bdev->bd_inode;
int err;
if (!(mode & FMODE_WRITE))
if (!(mode & BLK_OPEN_WRITE))
return -EBADF;
if (!bdev_max_discard_sectors(bdev))
@ -120,14 +120,14 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
return err;
}
static int blk_ioctl_secure_erase(struct block_device *bdev, fmode_t mode,
static int blk_ioctl_secure_erase(struct block_device *bdev, blk_mode_t mode,
void __user *argp)
{
uint64_t start, len;
uint64_t range[2];
int err;
if (!(mode & FMODE_WRITE))
if (!(mode & BLK_OPEN_WRITE))
return -EBADF;
if (!bdev_max_secure_erase_sectors(bdev))
return -EOPNOTSUPP;
@ -151,7 +151,7 @@ static int blk_ioctl_secure_erase(struct block_device *bdev, fmode_t mode,
}
static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
static int blk_ioctl_zeroout(struct block_device *bdev, blk_mode_t mode,
unsigned long arg)
{
uint64_t range[2];
@ -159,7 +159,7 @@ static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
struct inode *inode = bdev->bd_inode;
int err;
if (!(mode & FMODE_WRITE))
if (!(mode & BLK_OPEN_WRITE))
return -EBADF;
if (copy_from_user(range, (void __user *)arg, sizeof(range)))
@ -240,7 +240,7 @@ static int compat_put_ulong(compat_ulong_t __user *argp, compat_ulong_t val)
* drivers that implement only commands that are completely compatible
* between 32-bit and 64-bit user space
*/
int blkdev_compat_ptr_ioctl(struct block_device *bdev, fmode_t mode,
int blkdev_compat_ptr_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned cmd, unsigned long arg)
{
struct gendisk *disk = bdev->bd_disk;
@ -254,13 +254,28 @@ int blkdev_compat_ptr_ioctl(struct block_device *bdev, fmode_t mode,
EXPORT_SYMBOL(blkdev_compat_ptr_ioctl);
#endif
static int blkdev_pr_register(struct block_device *bdev,
static bool blkdev_pr_allowed(struct block_device *bdev, blk_mode_t mode)
{
/* no sense to make reservations for partitions */
if (bdev_is_partition(bdev))
return false;
if (capable(CAP_SYS_ADMIN))
return true;
/*
* Only allow unprivileged reservations if the file descriptor is open
* for writing.
*/
return mode & BLK_OPEN_WRITE;
}
static int blkdev_pr_register(struct block_device *bdev, blk_mode_t mode,
struct pr_registration __user *arg)
{
const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
struct pr_registration reg;
if (!capable(CAP_SYS_ADMIN))
if (!blkdev_pr_allowed(bdev, mode))
return -EPERM;
if (!ops || !ops->pr_register)
return -EOPNOTSUPP;
@ -272,13 +287,13 @@ static int blkdev_pr_register(struct block_device *bdev,
return ops->pr_register(bdev, reg.old_key, reg.new_key, reg.flags);
}
static int blkdev_pr_reserve(struct block_device *bdev,
static int blkdev_pr_reserve(struct block_device *bdev, blk_mode_t mode,
struct pr_reservation __user *arg)
{
const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
struct pr_reservation rsv;
if (!capable(CAP_SYS_ADMIN))
if (!blkdev_pr_allowed(bdev, mode))
return -EPERM;
if (!ops || !ops->pr_reserve)
return -EOPNOTSUPP;
@ -290,13 +305,13 @@ static int blkdev_pr_reserve(struct block_device *bdev,
return ops->pr_reserve(bdev, rsv.key, rsv.type, rsv.flags);
}
static int blkdev_pr_release(struct block_device *bdev,
static int blkdev_pr_release(struct block_device *bdev, blk_mode_t mode,
struct pr_reservation __user *arg)
{
const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
struct pr_reservation rsv;
if (!capable(CAP_SYS_ADMIN))
if (!blkdev_pr_allowed(bdev, mode))
return -EPERM;
if (!ops || !ops->pr_release)
return -EOPNOTSUPP;
@ -308,13 +323,13 @@ static int blkdev_pr_release(struct block_device *bdev,
return ops->pr_release(bdev, rsv.key, rsv.type);
}
static int blkdev_pr_preempt(struct block_device *bdev,
static int blkdev_pr_preempt(struct block_device *bdev, blk_mode_t mode,
struct pr_preempt __user *arg, bool abort)
{
const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
struct pr_preempt p;
if (!capable(CAP_SYS_ADMIN))
if (!blkdev_pr_allowed(bdev, mode))
return -EPERM;
if (!ops || !ops->pr_preempt)
return -EOPNOTSUPP;
@ -326,13 +341,13 @@ static int blkdev_pr_preempt(struct block_device *bdev,
return ops->pr_preempt(bdev, p.old_key, p.new_key, p.type, abort);
}
static int blkdev_pr_clear(struct block_device *bdev,
static int blkdev_pr_clear(struct block_device *bdev, blk_mode_t mode,
struct pr_clear __user *arg)
{
const struct pr_ops *ops = bdev->bd_disk->fops->pr_ops;
struct pr_clear c;
if (!capable(CAP_SYS_ADMIN))
if (!blkdev_pr_allowed(bdev, mode))
return -EPERM;
if (!ops || !ops->pr_clear)
return -EOPNOTSUPP;
@ -344,8 +359,8 @@ static int blkdev_pr_clear(struct block_device *bdev,
return ops->pr_clear(bdev, c.key);
}
static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
unsigned cmd, unsigned long arg)
static int blkdev_flushbuf(struct block_device *bdev, unsigned cmd,
unsigned long arg)
{
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
@ -354,8 +369,8 @@ static int blkdev_flushbuf(struct block_device *bdev, fmode_t mode,
return 0;
}
static int blkdev_roset(struct block_device *bdev, fmode_t mode,
unsigned cmd, unsigned long arg)
static int blkdev_roset(struct block_device *bdev, unsigned cmd,
unsigned long arg)
{
int ret, n;
@ -439,7 +454,7 @@ static int compat_hdio_getgeo(struct block_device *bdev,
#endif
/* set the logical block size */
static int blkdev_bszset(struct block_device *bdev, fmode_t mode,
static int blkdev_bszset(struct block_device *bdev, blk_mode_t mode,
int __user *argp)
{
int ret, n;
@ -451,13 +466,13 @@ static int blkdev_bszset(struct block_device *bdev, fmode_t mode,
if (get_user(n, argp))
return -EFAULT;
if (mode & FMODE_EXCL)
if (mode & BLK_OPEN_EXCL)
return set_blocksize(bdev, n);
if (IS_ERR(blkdev_get_by_dev(bdev->bd_dev, mode | FMODE_EXCL, &bdev)))
if (IS_ERR(blkdev_get_by_dev(bdev->bd_dev, mode, &bdev, NULL)))
return -EBUSY;
ret = set_blocksize(bdev, n);
blkdev_put(bdev, mode | FMODE_EXCL);
blkdev_put(bdev, &bdev);
return ret;
}
@ -467,7 +482,7 @@ static int blkdev_bszset(struct block_device *bdev, fmode_t mode,
* user space. Note the separate arg/argp parameters that are needed
* to deal with the compat_ptr() conversion.
*/
static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
static int blkdev_common_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg,
void __user *argp)
{
@ -475,9 +490,9 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
switch (cmd) {
case BLKFLSBUF:
return blkdev_flushbuf(bdev, mode, cmd, arg);
return blkdev_flushbuf(bdev, cmd, arg);
case BLKROSET:
return blkdev_roset(bdev, mode, cmd, arg);
return blkdev_roset(bdev, cmd, arg);
case BLKDISCARD:
return blk_ioctl_discard(bdev, mode, arg);
case BLKSECDISCARD:
@ -487,7 +502,7 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
case BLKGETDISKSEQ:
return put_u64(argp, bdev->bd_disk->diskseq);
case BLKREPORTZONE:
return blkdev_report_zones_ioctl(bdev, mode, cmd, arg);
return blkdev_report_zones_ioctl(bdev, cmd, arg);
case BLKRESETZONE:
case BLKOPENZONE:
case BLKCLOSEZONE:
@ -534,17 +549,17 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
case BLKTRACETEARDOWN:
return blk_trace_ioctl(bdev, cmd, argp);
case IOC_PR_REGISTER:
return blkdev_pr_register(bdev, argp);
return blkdev_pr_register(bdev, mode, argp);
case IOC_PR_RESERVE:
return blkdev_pr_reserve(bdev, argp);
return blkdev_pr_reserve(bdev, mode, argp);
case IOC_PR_RELEASE:
return blkdev_pr_release(bdev, argp);
return blkdev_pr_release(bdev, mode, argp);
case IOC_PR_PREEMPT:
return blkdev_pr_preempt(bdev, argp, false);
return blkdev_pr_preempt(bdev, mode, argp, false);
case IOC_PR_PREEMPT_ABORT:
return blkdev_pr_preempt(bdev, argp, true);
return blkdev_pr_preempt(bdev, mode, argp, true);
case IOC_PR_CLEAR:
return blkdev_pr_clear(bdev, argp);
return blkdev_pr_clear(bdev, mode, argp);
default:
return -ENOIOCTLCMD;
}
@ -560,18 +575,9 @@ long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
{
struct block_device *bdev = I_BDEV(file->f_mapping->host);
void __user *argp = (void __user *)arg;
fmode_t mode = file->f_mode;
blk_mode_t mode = file_to_blk_mode(file);
int ret;
/*
* O_NDELAY can be altered using fcntl(.., F_SETFL, ..), so we have
* to updated it before every ioctl.
*/
if (file->f_flags & O_NDELAY)
mode |= FMODE_NDELAY;
else
mode &= ~FMODE_NDELAY;
switch (cmd) {
/* These need separate implementations for the data structure */
case HDIO_GETGEO:
@ -630,16 +636,7 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
void __user *argp = compat_ptr(arg);
struct block_device *bdev = I_BDEV(file->f_mapping->host);
struct gendisk *disk = bdev->bd_disk;
fmode_t mode = file->f_mode;
/*
* O_NDELAY can be altered using fcntl(.., F_SETFL, ..), so we have
* to updated it before every ioctl.
*/
if (file->f_flags & O_NDELAY)
mode |= FMODE_NDELAY;
else
mode &= ~FMODE_NDELAY;
blk_mode_t mode = file_to_blk_mode(file);
switch (cmd) {
/* These need separate implementations for the data structure */

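With blkdev_pr_allowed() above, persistent-reservation ioctls no longer require CAP_SYS_ADMIN when the whole device is open for writing. A hedged userspace sketch of what that enables; the device path and key are hypothetical:

#include <fcntl.h>
#include <linux/pr.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	struct pr_registration reg;
	int fd;

	fd = open("/dev/sdX", O_RDWR);		/* hypothetical whole-disk node */
	if (fd < 0)
		return 1;

	memset(&reg, 0, sizeof(reg));
	reg.new_key = 0x1234;			/* arbitrary example key */

	/* Allowed without CAP_SYS_ADMIN because the fd is open for writing;
	 * an O_RDONLY fd (or a partition) would now get EPERM instead. */
	if (ioctl(fd, IOC_PR_REGISTER, &reg))
		perror("IOC_PR_REGISTER");

	close(fd);
	return 0;
}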

@ -74,8 +74,8 @@ struct dd_per_prio {
struct list_head dispatch;
struct rb_root sort_list[DD_DIR_COUNT];
struct list_head fifo_list[DD_DIR_COUNT];
/* Next request in FIFO order. Read, write or both are NULL. */
struct request *next_rq[DD_DIR_COUNT];
/* Position of the most recently dispatched request. */
sector_t latest_pos[DD_DIR_COUNT];
struct io_stats_per_prio stats;
};
@ -156,6 +156,40 @@ deadline_latter_request(struct request *rq)
return NULL;
}
/*
* Return the first request for which blk_rq_pos() >= @pos. For zoned devices,
* return the first request after the start of the zone containing @pos.
*/
static inline struct request *deadline_from_pos(struct dd_per_prio *per_prio,
enum dd_data_dir data_dir, sector_t pos)
{
struct rb_node *node = per_prio->sort_list[data_dir].rb_node;
struct request *rq, *res = NULL;
if (!node)
return NULL;
rq = rb_entry_rq(node);
/*
* A zoned write may have been requeued with a starting position that
* is below that of the most recently dispatched request. Hence, for
* zoned writes, start searching from the start of a zone.
*/
if (blk_rq_is_seq_zoned_write(rq))
pos -= round_down(pos, rq->q->limits.chunk_sectors);
while (node) {
rq = rb_entry_rq(node);
if (blk_rq_pos(rq) >= pos) {
res = rq;
node = node->rb_left;
} else {
node = node->rb_right;
}
}
return res;
}
static void
deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
{
@ -167,11 +201,6 @@ deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
static inline void
deadline_del_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
{
const enum dd_data_dir data_dir = rq_data_dir(rq);
if (per_prio->next_rq[data_dir] == rq)
per_prio->next_rq[data_dir] = deadline_latter_request(rq);
elv_rb_del(deadline_rb_root(per_prio, rq), rq);
}
@ -251,10 +280,6 @@ static void
deadline_move_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
struct request *rq)
{
const enum dd_data_dir data_dir = rq_data_dir(rq);
per_prio->next_rq[data_dir] = deadline_latter_request(rq);
/*
* take it off the sort and fifo list
*/
@ -272,21 +297,15 @@ static u32 dd_queued(struct deadline_data *dd, enum dd_prio prio)
}
/*
* deadline_check_fifo returns 0 if there are no expired requests on the fifo,
* 1 otherwise. Requires !list_empty(&dd->fifo_list[data_dir])
* deadline_check_fifo returns true if and only if there are expired requests
* in the FIFO list. Requires !list_empty(&dd->fifo_list[data_dir]).
*/
static inline int deadline_check_fifo(struct dd_per_prio *per_prio,
enum dd_data_dir data_dir)
static inline bool deadline_check_fifo(struct dd_per_prio *per_prio,
enum dd_data_dir data_dir)
{
struct request *rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
/*
* rq is expired!
*/
if (time_after_eq(jiffies, (unsigned long)rq->fifo_time))
return 1;
return 0;
return time_is_before_eq_jiffies((unsigned long)rq->fifo_time);
}
/*
@ -310,14 +329,11 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
struct request *rq)
{
sector_t pos = blk_rq_pos(rq);
sector_t skipped_sectors = 0;
while (rq) {
if (blk_rq_pos(rq) != pos + skipped_sectors)
break;
skipped_sectors += blk_rq_sectors(rq);
do {
pos += blk_rq_sectors(rq);
rq = deadline_latter_request(rq);
}
} while (rq && blk_rq_pos(rq) == pos);
return rq;
}
@ -330,7 +346,7 @@ static struct request *
deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
enum dd_data_dir data_dir)
{
struct request *rq;
struct request *rq, *rb_rq, *next;
unsigned long flags;
if (list_empty(&per_prio->fifo_list[data_dir]))
@ -348,7 +364,12 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
* zones and these zones are unlocked.
*/
spin_lock_irqsave(&dd->zone_lock, flags);
list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
queuelist) {
/* Check whether a prior request exists for the same zone. */
rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
rq = rb_rq;
if (blk_req_can_dispatch_to_zone(rq) &&
(blk_queue_nonrot(rq->q) ||
!deadline_is_seq_write(dd, rq)))
@ -372,7 +393,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
struct request *rq;
unsigned long flags;
rq = per_prio->next_rq[data_dir];
rq = deadline_from_pos(per_prio, data_dir,
per_prio->latest_pos[data_dir]);
if (!rq)
return NULL;
@ -435,6 +457,7 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
if (started_after(dd, rq, latest_start))
return NULL;
list_del_init(&rq->queuelist);
data_dir = rq_data_dir(rq);
goto done;
}
@ -442,9 +465,11 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
* batches are currently reads XOR writes
*/
rq = deadline_next_request(dd, per_prio, dd->last_dir);
if (rq && dd->batching < dd->fifo_batch)
/* we have a next request are still entitled to batch */
if (rq && dd->batching < dd->fifo_batch) {
/* we have a next request and are still entitled to batch */
data_dir = rq_data_dir(rq);
goto dispatch_request;
}
/*
* at this point we are not running a batch. select the appropriate
@ -522,6 +547,7 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
done:
ioprio_class = dd_rq_ioclass(rq);
prio = ioprio_class_to_prio[ioprio_class];
dd->per_prio[prio].latest_pos[data_dir] = blk_rq_pos(rq);
dd->per_prio[prio].stats.dispatched++;
/*
* If the request needs its target zone locked, do it.
@ -766,7 +792,7 @@ static bool dd_bio_merge(struct request_queue *q, struct bio *bio,
* add rq to rbtree and fifo
*/
static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
blk_insert_t flags)
blk_insert_t flags, struct list_head *free)
{
struct request_queue *q = hctx->queue;
struct deadline_data *dd = q->elevator->elevator_data;
@ -775,7 +801,6 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
u8 ioprio_class = IOPRIO_PRIO_CLASS(ioprio);
struct dd_per_prio *per_prio;
enum dd_prio prio;
LIST_HEAD(free);
lockdep_assert_held(&dd->lock);
@ -792,10 +817,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
rq->elv.priv[0] = (void *)(uintptr_t)1;
}
if (blk_mq_sched_try_insert_merge(q, rq, &free)) {
blk_mq_free_requests(&free);
if (blk_mq_sched_try_insert_merge(q, rq, free))
return;
}
trace_block_rq_insert(rq);
@ -803,6 +826,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
list_add(&rq->queuelist, &per_prio->dispatch);
rq->fifo_time = jiffies;
} else {
struct list_head *insert_before;
deadline_add_rq_rb(per_prio, rq);
if (rq_mergeable(rq)) {
@ -815,7 +840,20 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
* set expire time and add to fifo list
*/
rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
insert_before = &per_prio->fifo_list[data_dir];
#ifdef CONFIG_BLK_DEV_ZONED
/*
* Insert zoned writes such that requests are sorted by
* position per zone.
*/
if (blk_rq_is_seq_zoned_write(rq)) {
struct request *rq2 = deadline_latter_request(rq);
if (rq2 && blk_rq_zone_no(rq2) == blk_rq_zone_no(rq))
insert_before = &rq2->queuelist;
}
#endif
list_add_tail(&rq->queuelist, insert_before);
}
}
@ -828,6 +866,7 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
{
struct request_queue *q = hctx->queue;
struct deadline_data *dd = q->elevator->elevator_data;
LIST_HEAD(free);
spin_lock(&dd->lock);
while (!list_empty(list)) {
@ -835,9 +874,11 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
rq = list_first_entry(list, struct request, queuelist);
list_del_init(&rq->queuelist);
dd_insert_request(hctx, rq, flags);
dd_insert_request(hctx, rq, flags, &free);
}
spin_unlock(&dd->lock);
blk_mq_free_requests(&free);
}
/* Callback from inside blk_mq_rq_ctx_init(). */
@ -1035,8 +1076,10 @@ static int deadline_##name##_next_rq_show(void *data, \
struct request_queue *q = data; \
struct deadline_data *dd = q->elevator->elevator_data; \
struct dd_per_prio *per_prio = &dd->per_prio[prio]; \
struct request *rq = per_prio->next_rq[data_dir]; \
struct request *rq; \
\
rq = deadline_from_pos(per_prio, data_dir, \
per_prio->latest_pos[data_dir]); \
if (rq) \
__blk_mq_debugfs_rq_show(m, rq); \
return 0; \

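deadline_from_pos() above is a "first request at or after pos" descent of the per-priority rbtree. A minimal standalone sketch of that search pattern, using a plain hand-built tree and made-up sector numbers instead of the kernel rbtree:

#include <stdio.h>

struct node {
	unsigned long long pos;		/* request start sector */
	struct node *left, *right;
};

/* Remember the last node with pos >= target while descending. */
static struct node *first_at_or_after(struct node *node, unsigned long long pos)
{
	struct node *res = NULL;

	while (node) {
		if (node->pos >= pos) {
			res = node;
			node = node->left;	/* a smaller match may still exist */
		} else {
			node = node->right;
		}
	}
	return res;
}

int main(void)
{
	struct node n8 = { 8, NULL, NULL };
	struct node n64 = { 64, NULL, NULL };
	struct node root = { 16, &n8, &n64 };	/* a valid little BST */
	struct node *rq = first_at_or_after(&root, 10);

	printf("first request at or after sector 10 starts at %llu\n",
	       rq ? rq->pos : 0);		/* prints 16 */
	return 0;
}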

@ -11,10 +11,18 @@
#define pr_fmt(fmt) fmt
#include <linux/types.h>
#include <linux/mm_types.h>
#include <linux/overflow.h>
#include <linux/affs_hardblocks.h>
#include "check.h"
/* magic offsets in partition DosEnvVec */
#define NR_HD 3
#define NR_SECT 5
#define LO_CYL 9
#define HI_CYL 10
static __inline__ u32
checksum_block(__be32 *m, int size)
{
@ -31,8 +39,12 @@ int amiga_partition(struct parsed_partitions *state)
unsigned char *data;
struct RigidDiskBlock *rdb;
struct PartitionBlock *pb;
int start_sect, nr_sects, blk, part, res = 0;
int blksize = 1; /* Multiplier for disk block size */
u64 start_sect, nr_sects;
sector_t blk, end_sect;
u32 cylblk; /* rdb_CylBlocks = nr_heads*sect_per_track */
u32 nr_hd, nr_sect, lo_cyl, hi_cyl;
int part, res = 0;
unsigned int blksize = 1; /* Multiplier for disk block size */
int slot = 1;
for (blk = 0; ; blk++, put_dev_sector(sect)) {
@ -40,7 +52,7 @@ int amiga_partition(struct parsed_partitions *state)
goto rdb_done;
data = read_part_sector(state, blk, &sect);
if (!data) {
pr_err("Dev %s: unable to read RDB block %d\n",
pr_err("Dev %s: unable to read RDB block %llu\n",
state->disk->disk_name, blk);
res = -1;
goto rdb_done;
@ -57,12 +69,12 @@ int amiga_partition(struct parsed_partitions *state)
*(__be32 *)(data+0xdc) = 0;
if (checksum_block((__be32 *)data,
be32_to_cpu(rdb->rdb_SummedLongs) & 0x7F)==0) {
pr_err("Trashed word at 0xd0 in block %d ignored in checksum calculation\n",
pr_err("Trashed word at 0xd0 in block %llu ignored in checksum calculation\n",
blk);
break;
}
pr_err("Dev %s: RDB in block %d has bad checksum\n",
pr_err("Dev %s: RDB in block %llu has bad checksum\n",
state->disk->disk_name, blk);
}
@ -79,10 +91,15 @@ int amiga_partition(struct parsed_partitions *state)
blk = be32_to_cpu(rdb->rdb_PartitionList);
put_dev_sector(sect);
for (part = 1; blk>0 && part<=16; part++, put_dev_sector(sect)) {
blk *= blksize; /* Read in terms partition table understands */
/* Read in terms partition table understands */
if (check_mul_overflow(blk, (sector_t) blksize, &blk)) {
pr_err("Dev %s: overflow calculating partition block %llu! Skipping partitions %u and beyond\n",
state->disk->disk_name, blk, part);
break;
}
data = read_part_sector(state, blk, &sect);
if (!data) {
pr_err("Dev %s: unable to read partition block %d\n",
pr_err("Dev %s: unable to read partition block %llu\n",
state->disk->disk_name, blk);
res = -1;
goto rdb_done;
@ -94,19 +111,70 @@ int amiga_partition(struct parsed_partitions *state)
if (checksum_block((__be32 *)pb, be32_to_cpu(pb->pb_SummedLongs) & 0x7F) != 0 )
continue;
/* Tell Kernel about it */
/* RDB gives us more than enough rope to hang ourselves with,
* many times over (2^128 bytes if all fields max out).
* Some careful checks are in order, so check for potential
* overflows.
* We are multiplying four 32 bit numbers to one sector_t!
*/
nr_hd = be32_to_cpu(pb->pb_Environment[NR_HD]);
nr_sect = be32_to_cpu(pb->pb_Environment[NR_SECT]);
/* CylBlocks is total number of blocks per cylinder */
if (check_mul_overflow(nr_hd, nr_sect, &cylblk)) {
pr_err("Dev %s: heads*sects %u overflows u32, skipping partition!\n",
state->disk->disk_name, cylblk);
continue;
}
/* check for consistency with RDB defined CylBlocks */
if (cylblk > be32_to_cpu(rdb->rdb_CylBlocks)) {
pr_warn("Dev %s: cylblk %u > rdb_CylBlocks %u!\n",
state->disk->disk_name, cylblk,
be32_to_cpu(rdb->rdb_CylBlocks));
}
/* RDB allows for variable logical block size -
* normalize to 512 byte blocks and check result.
*/
if (check_mul_overflow(cylblk, blksize, &cylblk)) {
pr_err("Dev %s: partition %u bytes per cyl. overflows u32, skipping partition!\n",
state->disk->disk_name, part);
continue;
}
/* Calculate partition start and end. Limit of 32 bit on cylblk
* guarantees no overflow occurs if LBD support is enabled.
*/
lo_cyl = be32_to_cpu(pb->pb_Environment[LO_CYL]);
start_sect = ((u64) lo_cyl * cylblk);
hi_cyl = be32_to_cpu(pb->pb_Environment[HI_CYL]);
nr_sects = (((u64) hi_cyl - lo_cyl + 1) * cylblk);
nr_sects = (be32_to_cpu(pb->pb_Environment[10]) + 1 -
be32_to_cpu(pb->pb_Environment[9])) *
be32_to_cpu(pb->pb_Environment[3]) *
be32_to_cpu(pb->pb_Environment[5]) *
blksize;
if (!nr_sects)
continue;
start_sect = be32_to_cpu(pb->pb_Environment[9]) *
be32_to_cpu(pb->pb_Environment[3]) *
be32_to_cpu(pb->pb_Environment[5]) *
blksize;
/* Warn user if partition end overflows u32 (AmigaDOS limit) */
if ((start_sect + nr_sects) > UINT_MAX) {
pr_warn("Dev %s: partition %u (%llu-%llu) needs 64 bit device support!\n",
state->disk->disk_name, part,
start_sect, start_sect + nr_sects);
}
if (check_add_overflow(start_sect, nr_sects, &end_sect)) {
pr_err("Dev %s: partition %u (%llu-%llu) needs LBD device support, skipping partition!\n",
state->disk->disk_name, part,
start_sect, end_sect);
continue;
}
/* Tell Kernel about it */
put_partition(state,slot++,start_sect,nr_sects);
{
/* Be even more informative to aid mounting */

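Aside: the overflow checks above turn what used to be unchecked 32-bit multiplies of RDB geometry fields into guarded math. A minimal userspace sketch of the same logic follows; the kernel's check_mul_overflow()/check_add_overflow() wrap the compiler builtins used here and likewise return true when the result wraps in the destination type. Names and types below are illustrative, not the driver's.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: mirrors the guarded geometry math above. */
static bool amiga_part_bounds(uint32_t nr_hd, uint32_t nr_sect, uint32_t blksize,
			      uint32_t lo_cyl, uint32_t hi_cyl,
			      uint64_t *start, uint64_t *len)
{
	uint32_t cylblk;
	uint64_t end;

	if (__builtin_mul_overflow(nr_hd, nr_sect, &cylblk))	/* heads * sectors per track */
		return false;
	if (__builtin_mul_overflow(cylblk, blksize, &cylblk))	/* normalize to 512-byte blocks */
		return false;
	*start = (uint64_t)lo_cyl * cylblk;			/* product of two u32s fits in 64 bits */
	*len = ((uint64_t)hi_cyl - lo_cyl + 1) * cylblk;
	return !__builtin_add_overflow(*start, *len, &end);	/* partition end must still fit */
}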
View File

@ -12,7 +12,7 @@
#include <linux/raid/detect.h>
#include "check.h"
static int (*check_part[])(struct parsed_partitions *) = {
static int (*const check_part[])(struct parsed_partitions *) = {
/*
* Probe partition formats with tables at disk address 0
* that also have an ADFS boot block at 0xdc0.
@ -228,7 +228,7 @@ static struct attribute *part_attrs[] = {
NULL
};
static struct attribute_group part_attr_group = {
static const struct attribute_group part_attr_group = {
.attrs = part_attrs,
};
@ -256,31 +256,36 @@ static int part_uevent(const struct device *dev, struct kobj_uevent_env *env)
return 0;
}
struct device_type part_type = {
const struct device_type part_type = {
.name = "partition",
.groups = part_attr_groups,
.release = part_release,
.uevent = part_uevent,
};
static void delete_partition(struct block_device *part)
void drop_partition(struct block_device *part)
{
lockdep_assert_held(&part->bd_disk->open_mutex);
fsync_bdev(part);
__invalidate_device(part, true);
xa_erase(&part->bd_disk->part_tbl, part->bd_partno);
kobject_put(part->bd_holder_dir);
device_del(&part->bd_device);
device_del(&part->bd_device);
put_device(&part->bd_device);
}
static void delete_partition(struct block_device *part)
{
/*
* Remove the block device from the inode hash, so that it cannot be
* looked up any more even when openers still hold references.
*/
remove_inode_hash(part->bd_inode);
put_device(&part->bd_device);
fsync_bdev(part);
__invalidate_device(part, true);
drop_partition(part);
}
static ssize_t whole_disk_show(struct device *dev,
@ -288,7 +293,7 @@ static ssize_t whole_disk_show(struct device *dev,
{
return 0;
}
static DEVICE_ATTR(whole_disk, 0444, whole_disk_show, NULL);
static const DEVICE_ATTR(whole_disk, 0444, whole_disk_show, NULL);
/*
* Must be called either with open_mutex held, before a disk can be opened or
@ -436,10 +441,21 @@ static bool partition_overlaps(struct gendisk *disk, sector_t start,
int bdev_add_partition(struct gendisk *disk, int partno, sector_t start,
sector_t length)
{
sector_t capacity = get_capacity(disk), end;
struct block_device *part;
int ret;
mutex_lock(&disk->open_mutex);
if (check_add_overflow(start, length, &end)) {
ret = -EINVAL;
goto out;
}
if (start >= capacity || end > capacity) {
ret = -EINVAL;
goto out;
}
if (!disk_live(disk)) {
ret = -ENXIO;
goto out;
@ -519,17 +535,6 @@ static bool disk_unlock_native_capacity(struct gendisk *disk)
return true;
}
void blk_drop_partitions(struct gendisk *disk)
{
struct block_device *part;
unsigned long idx;
lockdep_assert_held(&disk->open_mutex);
xa_for_each_start(&disk->part_tbl, idx, part, 1)
delete_partition(part);
}
static bool blk_add_partition(struct gendisk *disk,
struct parsed_partitions *state, int p)
{
@ -646,6 +651,8 @@ static int blk_add_partitions(struct gendisk *disk)
int bdev_disk_changed(struct gendisk *disk, bool invalidate)
{
struct block_device *part;
unsigned long idx;
int ret = 0;
lockdep_assert_held(&disk->open_mutex);
@ -658,8 +665,9 @@ int bdev_disk_changed(struct gendisk *disk, bool invalidate)
return -EBUSY;
sync_blockdev(disk->part0);
invalidate_bdev(disk->part0);
blk_drop_partitions(disk);
xa_for_each_start(&disk->part_tbl, idx, part, 1)
delete_partition(part);
clear_bit(GD_NEED_PART_SCAN, &disk->state);
/*

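Aside: bdev_add_partition() above now rejects user-created partitions that would wrap or run past the disk before doing anything else. A small sketch of that bounds test in isolation (illustrative names, plain C):

#include <stdbool.h>
#include <stdint.h>

/* A partition must start inside the disk, must not extend past it, and
 * start + length must not overflow; the caller maps a failure to -EINVAL. */
static bool partition_in_bounds(uint64_t start, uint64_t length, uint64_t capacity)
{
	uint64_t end;

	if (__builtin_add_overflow(start, length, &end))
		return false;
	return start < capacity && end <= capacity;
}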
View File

@ -751,14 +751,12 @@ static int really_probe_debug(struct device *dev, struct device_driver *drv)
*
* Should somehow figure out how to use a semaphore, not an atomic variable...
*/
int driver_probe_done(void)
bool __init driver_probe_done(void)
{
int local_probe_count = atomic_read(&probe_count);
pr_debug("%s: probe_count = %d\n", __func__, local_probe_count);
if (local_probe_count)
return -EBUSY;
return 0;
return !local_probe_count;
}
/**

View File

@ -1532,7 +1532,7 @@ static int fd_getgeo(struct block_device *bdev, struct hd_geometry *geo)
return 0;
}
static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
static int fd_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
struct amiga_floppy_struct *p = bdev->bd_disk->private_data;
@ -1607,7 +1607,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
return 0;
}
static int fd_ioctl(struct block_device *bdev, fmode_t mode,
static int fd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
int ret;
@ -1654,10 +1654,10 @@ static void fd_probe(int dev)
* /dev/PS0 etc), and disallows simultaneous access to the same
* drive with different device numbers.
*/
static int floppy_open(struct block_device *bdev, fmode_t mode)
static int floppy_open(struct gendisk *disk, blk_mode_t mode)
{
int drive = MINOR(bdev->bd_dev) & 3;
int system = (MINOR(bdev->bd_dev) & 4) >> 2;
int drive = disk->first_minor & 3;
int system = (disk->first_minor & 4) >> 2;
int old_dev;
unsigned long flags;
@ -1673,10 +1673,9 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
mutex_unlock(&amiflop_mutex);
return -ENXIO;
}
if (mode & (FMODE_READ|FMODE_WRITE)) {
bdev_check_media_change(bdev);
if (mode & FMODE_WRITE) {
if (mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) {
disk_check_media_change(disk);
if (mode & BLK_OPEN_WRITE) {
int wrprot;
get_fdc(drive);
@ -1691,7 +1690,6 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
}
}
}
local_irq_save(flags);
fd_ref[drive]++;
fd_device[drive] = system;
@ -1709,7 +1707,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
return 0;
}
static void floppy_release(struct gendisk *disk, fmode_t mode)
static void floppy_release(struct gendisk *disk)
{
struct amiga_floppy_struct *p = disk->private_data;
int drive = p - unit;

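Aside: this conversion repeats across most of the drivers in the series: ->open() now receives the gendisk and a blk_mode_t, ->release() loses its mode argument, FMODE_* open flags become BLK_OPEN_*, and bdev_check_media_change() becomes disk_check_media_change(). A hedged sketch of what a converted driver's methods look like; the driver name and fields are made up:

struct mydrv { bool write_protected; };

static int mydrv_open(struct gendisk *disk, blk_mode_t mode)
{
	struct mydrv *p = disk->private_data;

	if (mode & (BLK_OPEN_READ | BLK_OPEN_WRITE))
		disk_check_media_change(disk);	/* was bdev_check_media_change(bdev) */
	if ((mode & BLK_OPEN_WRITE) && p->write_protected)
		return -EROFS;
	return 0;
}

static void mydrv_release(struct gendisk *disk)	/* no fmode_t argument any more */
{
	/* drop per-open driver state here */
}

static const struct block_device_operations mydrv_fops = {
	.owner = THIS_MODULE,
	.open = mydrv_open,
	.release = mydrv_release,
};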
View File

@ -204,9 +204,9 @@ aoedisk_rm_debugfs(struct aoedev *d)
}
static int
aoeblk_open(struct block_device *bdev, fmode_t mode)
aoeblk_open(struct gendisk *disk, blk_mode_t mode)
{
struct aoedev *d = bdev->bd_disk->private_data;
struct aoedev *d = disk->private_data;
ulong flags;
if (!virt_addr_valid(d)) {
@ -232,7 +232,7 @@ aoeblk_open(struct block_device *bdev, fmode_t mode)
}
static void
aoeblk_release(struct gendisk *disk, fmode_t mode)
aoeblk_release(struct gendisk *disk)
{
struct aoedev *d = disk->private_data;
ulong flags;
@ -285,7 +285,7 @@ aoeblk_getgeo(struct block_device *bdev, struct hd_geometry *geo)
}
static int
aoeblk_ioctl(struct block_device *bdev, fmode_t mode, uint cmd, ulong arg)
aoeblk_ioctl(struct block_device *bdev, blk_mode_t mode, uint cmd, ulong arg)
{
struct aoedev *d;

View File

@ -49,7 +49,7 @@ static int emsgs_head_idx, emsgs_tail_idx;
static struct completion emsgs_comp;
static spinlock_t emsgs_lock;
static int nblocked_emsgs_readers;
static struct class *aoe_class;
static struct aoe_chardev chardevs[] = {
{ MINOR_ERR, "err" },
{ MINOR_DISCOVER, "discover" },
@ -58,6 +58,16 @@ static struct aoe_chardev chardevs[] = {
{ MINOR_FLUSH, "flush" },
};
static char *aoe_devnode(const struct device *dev, umode_t *mode)
{
return kasprintf(GFP_KERNEL, "etherd/%s", dev_name(dev));
}
static const struct class aoe_class = {
.name = "aoe",
.devnode = aoe_devnode,
};
static int
discover(void)
{
@ -273,11 +283,6 @@ static const struct file_operations aoe_fops = {
.llseek = noop_llseek,
};
static char *aoe_devnode(const struct device *dev, umode_t *mode)
{
return kasprintf(GFP_KERNEL, "etherd/%s", dev_name(dev));
}
int __init
aoechr_init(void)
{
@ -290,15 +295,14 @@ aoechr_init(void)
}
init_completion(&emsgs_comp);
spin_lock_init(&emsgs_lock);
aoe_class = class_create("aoe");
if (IS_ERR(aoe_class)) {
n = class_register(&aoe_class);
if (n) {
unregister_chrdev(AOE_MAJOR, "aoechr");
return PTR_ERR(aoe_class);
return n;
}
aoe_class->devnode = aoe_devnode;
for (i = 0; i < ARRAY_SIZE(chardevs); ++i)
device_create(aoe_class, NULL,
device_create(&aoe_class, NULL,
MKDEV(AOE_MAJOR, chardevs[i].minor), NULL,
chardevs[i].name);
@ -311,8 +315,8 @@ aoechr_exit(void)
int i;
for (i = 0; i < ARRAY_SIZE(chardevs); ++i)
device_destroy(aoe_class, MKDEV(AOE_MAJOR, chardevs[i].minor));
class_destroy(aoe_class);
device_destroy(&aoe_class, MKDEV(AOE_MAJOR, chardevs[i].minor));
class_unregister(&aoe_class);
unregister_chrdev(AOE_MAJOR, "aoechr");
}
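Aside: the same class conversion shows up for aoe here and for rnbd and ublk later in the series: the class becomes a statically defined const object registered with class_register() instead of being allocated by class_create(). A hedged sketch with a made-up "foo" driver:

#define FOO_MAJOR 242	/* illustrative */

static char *foo_devnode(const struct device *dev, umode_t *mode)
{
	return kasprintf(GFP_KERNEL, "foo/%s", dev_name(dev));
}

static const struct class foo_class = {
	.name = "foo",
	.devnode = foo_devnode,
};

static int __init foo_init(void)
{
	int err = class_register(&foo_class);	/* was: cls = class_create("foo") */

	if (err)
		return err;
	device_create(&foo_class, NULL, MKDEV(FOO_MAJOR, 0), NULL, "foo0");
	return 0;
}

static void __exit foo_exit(void)
{
	device_destroy(&foo_class, MKDEV(FOO_MAJOR, 0));
	class_unregister(&foo_class);		/* was: class_destroy(cls) */
}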

View File

@ -442,13 +442,13 @@ static void fd_times_out(struct timer_list *unused);
static void finish_fdc( void );
static void finish_fdc_done( int dummy );
static void setup_req_params( int drive );
static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
cmd, unsigned long param);
static int fd_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param);
static void fd_probe( int drive );
static int fd_test_drive_present( int drive );
static void config_types( void );
static int floppy_open(struct block_device *bdev, fmode_t mode);
static void floppy_release(struct gendisk *disk, fmode_t mode);
static int floppy_open(struct gendisk *disk, blk_mode_t mode);
static void floppy_release(struct gendisk *disk);
/************************* End of Prototypes **************************/
@ -1581,7 +1581,7 @@ static blk_status_t ataflop_queue_rq(struct blk_mq_hw_ctx *hctx,
return BLK_STS_OK;
}
static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
static int fd_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
struct gendisk *disk = bdev->bd_disk;
@ -1760,15 +1760,15 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode,
/* invalidate the buffer track to force a reread */
BufferDrive = -1;
set_bit(drive, &fake_change);
if (bdev_check_media_change(bdev))
floppy_revalidate(bdev->bd_disk);
if (disk_check_media_change(disk))
floppy_revalidate(disk);
return 0;
default:
return -EINVAL;
}
}
static int fd_ioctl(struct block_device *bdev, fmode_t mode,
static int fd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
int ret;
@ -1915,32 +1915,31 @@ static void __init config_types( void )
* drive with different device numbers.
*/
static int floppy_open(struct block_device *bdev, fmode_t mode)
static int floppy_open(struct gendisk *disk, blk_mode_t mode)
{
struct atari_floppy_struct *p = bdev->bd_disk->private_data;
int type = MINOR(bdev->bd_dev) >> 2;
struct atari_floppy_struct *p = disk->private_data;
int type = disk->first_minor >> 2;
DPRINT(("fd_open: type=%d\n",type));
if (p->ref && p->type != type)
return -EBUSY;
if (p->ref == -1 || (p->ref && mode & FMODE_EXCL))
if (p->ref == -1 || (p->ref && mode & BLK_OPEN_EXCL))
return -EBUSY;
if (mode & FMODE_EXCL)
if (mode & BLK_OPEN_EXCL)
p->ref = -1;
else
p->ref++;
p->type = type;
if (mode & FMODE_NDELAY)
if (mode & BLK_OPEN_NDELAY)
return 0;
if (mode & (FMODE_READ|FMODE_WRITE)) {
if (bdev_check_media_change(bdev))
floppy_revalidate(bdev->bd_disk);
if (mode & FMODE_WRITE) {
if (mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) {
if (disk_check_media_change(disk))
floppy_revalidate(disk);
if (mode & BLK_OPEN_WRITE) {
if (p->wpstat) {
if (p->ref < 0)
p->ref = 0;
@ -1953,18 +1952,18 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
return 0;
}
static int floppy_unlocked_open(struct block_device *bdev, fmode_t mode)
static int floppy_unlocked_open(struct gendisk *disk, blk_mode_t mode)
{
int ret;
mutex_lock(&ataflop_mutex);
ret = floppy_open(bdev, mode);
ret = floppy_open(disk, mode);
mutex_unlock(&ataflop_mutex);
return ret;
}
static void floppy_release(struct gendisk *disk, fmode_t mode)
static void floppy_release(struct gendisk *disk)
{
struct atari_floppy_struct *p = disk->private_data;
mutex_lock(&ataflop_mutex);

View File

@ -19,7 +19,7 @@
#include <linux/highmem.h>
#include <linux/mutex.h>
#include <linux/pagemap.h>
#include <linux/radix-tree.h>
#include <linux/xarray.h>
#include <linux/fs.h>
#include <linux/slab.h>
#include <linux/backing-dev.h>
@ -28,7 +28,7 @@
#include <linux/uaccess.h>
/*
* Each block ramdisk device has a radix_tree brd_pages of pages that stores
* Each block ramdisk device has a xarray brd_pages of pages that stores
* the pages containing the block device's contents. A brd page's ->index is
* its offset in PAGE_SIZE units. This is similar to, but in no way connected
* with, the kernel's pagecache or buffer cache (which sit above our block
@ -40,11 +40,9 @@ struct brd_device {
struct list_head brd_list;
/*
* Backing store of pages and lock to protect it. This is the contents
* of the block device.
* Backing store of pages. This is the contents of the block device.
*/
spinlock_t brd_lock;
struct radix_tree_root brd_pages;
struct xarray brd_pages;
u64 brd_nr_pages;
};
@ -56,21 +54,8 @@ static struct page *brd_lookup_page(struct brd_device *brd, sector_t sector)
pgoff_t idx;
struct page *page;
/*
* The page lifetime is protected by the fact that we have opened the
* device node -- brd pages will never be deleted under us, so we
* don't need any further locking or refcounting.
*
* This is strictly true for the radix-tree nodes as well (ie. we
* don't actually need the rcu_read_lock()), however that is not a
* documented feature of the radix-tree API so it is better to be
* safe here (we don't have total exclusion from radix tree updates
* here, only deletes).
*/
rcu_read_lock();
idx = sector >> PAGE_SECTORS_SHIFT; /* sector to page index */
page = radix_tree_lookup(&brd->brd_pages, idx);
rcu_read_unlock();
page = xa_load(&brd->brd_pages, idx);
BUG_ON(page && page->index != idx);
@ -83,7 +68,7 @@ static struct page *brd_lookup_page(struct brd_device *brd, sector_t sector)
static int brd_insert_page(struct brd_device *brd, sector_t sector, gfp_t gfp)
{
pgoff_t idx;
struct page *page;
struct page *page, *cur;
int ret = 0;
page = brd_lookup_page(brd, sector);
@ -94,71 +79,42 @@ static int brd_insert_page(struct brd_device *brd, sector_t sector, gfp_t gfp)
if (!page)
return -ENOMEM;
if (radix_tree_maybe_preload(gfp)) {
__free_page(page);
return -ENOMEM;
}
xa_lock(&brd->brd_pages);
spin_lock(&brd->brd_lock);
idx = sector >> PAGE_SECTORS_SHIFT;
page->index = idx;
if (radix_tree_insert(&brd->brd_pages, idx, page)) {
cur = __xa_cmpxchg(&brd->brd_pages, idx, NULL, page, gfp);
if (unlikely(cur)) {
__free_page(page);
page = radix_tree_lookup(&brd->brd_pages, idx);
if (!page)
ret = -ENOMEM;
else if (page->index != idx)
ret = xa_err(cur);
if (!ret && (cur->index != idx))
ret = -EIO;
} else {
brd->brd_nr_pages++;
}
spin_unlock(&brd->brd_lock);
radix_tree_preload_end();
xa_unlock(&brd->brd_pages);
return ret;
}
/*
* Free all backing store pages and radix tree. This must only be called when
* Free all backing store pages and xarray. This must only be called when
* there are no other users of the device.
*/
#define FREE_BATCH 16
static void brd_free_pages(struct brd_device *brd)
{
unsigned long pos = 0;
struct page *pages[FREE_BATCH];
int nr_pages;
struct page *page;
pgoff_t idx;
do {
int i;
nr_pages = radix_tree_gang_lookup(&brd->brd_pages,
(void **)pages, pos, FREE_BATCH);
for (i = 0; i < nr_pages; i++) {
void *ret;
BUG_ON(pages[i]->index < pos);
pos = pages[i]->index;
ret = radix_tree_delete(&brd->brd_pages, pos);
BUG_ON(!ret || ret != pages[i]);
__free_page(pages[i]);
}
pos++;
/*
* It takes 3.4 seconds to remove 80GiB ramdisk.
* So, we need cond_resched to avoid stalling the CPU.
*/
xa_for_each(&brd->brd_pages, idx, page) {
__free_page(page);
cond_resched();
}
/*
* This assumes radix_tree_gang_lookup always returns as
* many pages as possible. If the radix-tree code changes,
* so will this have to.
*/
} while (nr_pages == FREE_BATCH);
xa_destroy(&brd->brd_pages);
}
/*
@ -372,8 +328,7 @@ static int brd_alloc(int i)
brd->brd_number = i;
list_add_tail(&brd->brd_list, &brd_devices);
spin_lock_init(&brd->brd_lock);
INIT_RADIX_TREE(&brd->brd_pages, GFP_ATOMIC);
xa_init(&brd->brd_pages);
snprintf(buf, DISK_NAME_LEN, "ram%d", i);
if (!IS_ERR_OR_NULL(brd_debugfs_dir))

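Aside: the brd conversion swaps the radix tree plus spinlock for an xarray. Lookups go through xa_load() with no caller locking, and the insert path uses __xa_cmpxchg() under xa_lock() so a racing inserter simply loses and frees its page. A kernel-style sketch of that pattern in isolation, not the driver itself:

#include <linux/xarray.h>

static DEFINE_XARRAY(page_map);			/* pgoff_t index -> struct page * */

static struct page *sketch_lookup(pgoff_t idx)
{
	return xa_load(&page_map, idx);		/* RCU-safe, lockless for readers */
}

static int sketch_insert(pgoff_t idx, struct page *page, gfp_t gfp)
{
	struct page *cur;
	int err = 0;

	xa_lock(&page_map);
	cur = __xa_cmpxchg(&page_map, idx, NULL, page, gfp);
	if (cur)				/* slot already populated, or store error */
		err = xa_is_err(cur) ? xa_err(cur) : -EEXIST;
	xa_unlock(&page_map);
	return err;
}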
View File

@ -1043,9 +1043,7 @@ static void bm_page_io_async(struct drbd_bm_aio_ctx *ctx, int page_nr) __must_ho
bio = bio_alloc_bioset(device->ldev->md_bdev, 1, op, GFP_NOIO,
&drbd_md_io_bio_set);
bio->bi_iter.bi_sector = on_disk_sector;
/* bio_add_page of a single page to an empty bio will always succeed,
* according to api. Do we want to assert that? */
bio_add_page(bio, page, len, 0);
__bio_add_page(bio, page, len, 0);
bio->bi_private = ctx;
bio->bi_end_io = drbd_bm_endio;

View File

@ -37,7 +37,6 @@
#include <linux/notifier.h>
#include <linux/kthread.h>
#include <linux/workqueue.h>
#define __KERNEL_SYSCALLS__
#include <linux/unistd.h>
#include <linux/vmalloc.h>
#include <linux/sched/signal.h>
@ -50,8 +49,8 @@
#include "drbd_debugfs.h"
static DEFINE_MUTEX(drbd_main_mutex);
static int drbd_open(struct block_device *bdev, fmode_t mode);
static void drbd_release(struct gendisk *gd, fmode_t mode);
static int drbd_open(struct gendisk *disk, blk_mode_t mode);
static void drbd_release(struct gendisk *gd);
static void md_sync_timer_fn(struct timer_list *t);
static int w_bitmap_io(struct drbd_work *w, int unused);
@ -1883,9 +1882,9 @@ int drbd_send_all(struct drbd_connection *connection, struct socket *sock, void
return 0;
}
static int drbd_open(struct block_device *bdev, fmode_t mode)
static int drbd_open(struct gendisk *disk, blk_mode_t mode)
{
struct drbd_device *device = bdev->bd_disk->private_data;
struct drbd_device *device = disk->private_data;
unsigned long flags;
int rv = 0;
@ -1895,7 +1894,7 @@ static int drbd_open(struct block_device *bdev, fmode_t mode)
* and no race with updating open_cnt */
if (device->state.role != R_PRIMARY) {
if (mode & FMODE_WRITE)
if (mode & BLK_OPEN_WRITE)
rv = -EROFS;
else if (!drbd_allow_oos)
rv = -EMEDIUMTYPE;
@ -1909,9 +1908,10 @@ static int drbd_open(struct block_device *bdev, fmode_t mode)
return rv;
}
static void drbd_release(struct gendisk *gd, fmode_t mode)
static void drbd_release(struct gendisk *gd)
{
struct drbd_device *device = gd->private_data;
mutex_lock(&drbd_main_mutex);
device->open_cnt--;
mutex_unlock(&drbd_main_mutex);

View File

@ -1640,8 +1640,8 @@ static struct block_device *open_backing_dev(struct drbd_device *device,
struct block_device *bdev;
int err = 0;
bdev = blkdev_get_by_path(bdev_path,
FMODE_READ | FMODE_WRITE | FMODE_EXCL, claim_ptr);
bdev = blkdev_get_by_path(bdev_path, BLK_OPEN_READ | BLK_OPEN_WRITE,
claim_ptr, NULL);
if (IS_ERR(bdev)) {
drbd_err(device, "open(\"%s\") failed with %ld\n",
bdev_path, PTR_ERR(bdev));
@ -1653,7 +1653,7 @@ static struct block_device *open_backing_dev(struct drbd_device *device,
err = bd_link_disk_holder(bdev, device->vdisk);
if (err) {
blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(bdev, claim_ptr);
drbd_err(device, "bd_link_disk_holder(\"%s\", ...) failed with %d\n",
bdev_path, err);
bdev = ERR_PTR(err);
@ -1695,13 +1695,13 @@ static int open_backing_devices(struct drbd_device *device,
}
static void close_backing_dev(struct drbd_device *device, struct block_device *bdev,
bool do_bd_unlink)
void *claim_ptr, bool do_bd_unlink)
{
if (!bdev)
return;
if (do_bd_unlink)
bd_unlink_disk_holder(bdev, device->vdisk);
blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(bdev, claim_ptr);
}
void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *ldev)
@ -1709,8 +1709,11 @@ void drbd_backing_dev_free(struct drbd_device *device, struct drbd_backing_dev *
if (ldev == NULL)
return;
close_backing_dev(device, ldev->md_bdev, ldev->md_bdev != ldev->backing_bdev);
close_backing_dev(device, ldev->backing_bdev, true);
close_backing_dev(device, ldev->md_bdev,
ldev->md.meta_dev_idx < 0 ?
(void *)device : (void *)drbd_m_holder,
ldev->md_bdev != ldev->backing_bdev);
close_backing_dev(device, ldev->backing_bdev, device, true);
kfree(ldev->disk_conf);
kfree(ldev);
@ -2126,8 +2129,11 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
fail:
conn_reconfig_done(connection);
if (nbc) {
close_backing_dev(device, nbc->md_bdev, nbc->md_bdev != nbc->backing_bdev);
close_backing_dev(device, nbc->backing_bdev, true);
close_backing_dev(device, nbc->md_bdev,
nbc->disk_conf->meta_dev_idx < 0 ?
(void *)device : (void *)drbd_m_holder,
nbc->md_bdev != nbc->backing_bdev);
close_backing_dev(device, nbc->backing_bdev, device, true);
kfree(nbc);
}
kfree(new_disk_conf);

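Aside: as elsewhere in the series, blkdev_get_by_path() drops FMODE_* for BLK_OPEN_*, takes an explicit holder cookie plus optional holder ops, and blkdev_put() must be handed the same holder. A hedged sketch of the updated pairing; the holder here is just an opaque pointer the caller owns:

static struct block_device *attach_backing(const char *path, void *holder)
{
	return blkdev_get_by_path(path, BLK_OPEN_READ | BLK_OPEN_WRITE,
				  holder, NULL /* no blk_holder_ops */);
}

static void detach_backing(struct block_device *bdev, void *holder)
{
	blkdev_put(bdev, holder);	/* must match the holder passed at get time */
}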
View File

@ -27,7 +27,6 @@
#include <uapi/linux/sched/types.h>
#include <linux/sched/signal.h>
#include <linux/pkt_sched.h>
#define __KERNEL_SYSCALLS__
#include <linux/unistd.h>
#include <linux/vmalloc.h>
#include <linux/random.h>

View File

@ -402,7 +402,7 @@ static struct floppy_drive_struct drive_state[N_DRIVE];
static struct floppy_write_errors write_errors[N_DRIVE];
static struct timer_list motor_off_timer[N_DRIVE];
static struct blk_mq_tag_set tag_sets[N_DRIVE];
static struct block_device *opened_bdev[N_DRIVE];
static struct gendisk *opened_disk[N_DRIVE];
static DEFINE_MUTEX(open_lock);
static struct floppy_raw_cmd *raw_cmd, default_raw_cmd;
@ -3210,13 +3210,13 @@ static int floppy_raw_cmd_ioctl(int type, int drive, int cmd,
#endif
static int invalidate_drive(struct block_device *bdev)
static int invalidate_drive(struct gendisk *disk)
{
/* invalidate the buffer track to force a reread */
set_bit((long)bdev->bd_disk->private_data, &fake_change);
set_bit((long)disk->private_data, &fake_change);
process_fd_request();
if (bdev_check_media_change(bdev))
floppy_revalidate(bdev->bd_disk);
if (disk_check_media_change(disk))
floppy_revalidate(disk);
return 0;
}
@ -3251,10 +3251,11 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
floppy_type[type].size + 1;
process_fd_request();
for (cnt = 0; cnt < N_DRIVE; cnt++) {
struct block_device *bdev = opened_bdev[cnt];
if (!bdev || ITYPE(drive_state[cnt].fd_device) != type)
struct gendisk *disk = opened_disk[cnt];
if (!disk || ITYPE(drive_state[cnt].fd_device) != type)
continue;
__invalidate_device(bdev, true);
__invalidate_device(disk->part0, true);
}
mutex_unlock(&open_lock);
} else {
@ -3287,7 +3288,7 @@ static int set_geometry(unsigned int cmd, struct floppy_struct *g,
drive_state[current_drive].maxtrack ||
((user_params[drive].sect ^ oldStretch) &
(FD_SWAPSIDES | FD_SECTBASEMASK)))
invalidate_drive(bdev);
invalidate_drive(bdev->bd_disk);
else
process_fd_request();
}
@ -3393,8 +3394,8 @@ static bool valid_floppy_drive_params(const short autodetect[FD_AUTODETECT_SIZE]
return true;
}
static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd,
unsigned long param)
static int fd_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
int drive = (long)bdev->bd_disk->private_data;
int type = ITYPE(drive_state[drive].fd_device);
@ -3427,7 +3428,8 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
return ret;
/* permission checks */
if (((cmd & 0x40) && !(mode & (FMODE_WRITE | FMODE_WRITE_IOCTL))) ||
if (((cmd & 0x40) &&
!(mode & (BLK_OPEN_WRITE | BLK_OPEN_WRITE_IOCTL))) ||
((cmd & 0x80) && !capable(CAP_SYS_ADMIN)))
return -EPERM;
@ -3464,7 +3466,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
current_type[drive] = NULL;
floppy_sizes[drive] = MAX_DISK_SIZE << 1;
drive_state[drive].keep_data = 0;
return invalidate_drive(bdev);
return invalidate_drive(bdev->bd_disk);
case FDSETPRM:
case FDDEFPRM:
return set_geometry(cmd, &inparam.g, drive, type, bdev);
@ -3503,7 +3505,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
case FDFLUSH:
if (lock_fdc(drive))
return -EINTR;
return invalidate_drive(bdev);
return invalidate_drive(bdev->bd_disk);
case FDSETEMSGTRESH:
drive_params[drive].max_errors.reporting = (unsigned short)(param & 0x0f);
return 0;
@ -3565,7 +3567,7 @@ static int fd_locked_ioctl(struct block_device *bdev, fmode_t mode, unsigned int
return 0;
}
static int fd_ioctl(struct block_device *bdev, fmode_t mode,
static int fd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
int ret;
@ -3653,8 +3655,8 @@ struct compat_floppy_write_errors {
#define FDGETFDCSTAT32 _IOR(2, 0x15, struct compat_floppy_fdc_state)
#define FDWERRORGET32 _IOR(2, 0x17, struct compat_floppy_write_errors)
static int compat_set_geometry(struct block_device *bdev, fmode_t mode, unsigned int cmd,
struct compat_floppy_struct __user *arg)
static int compat_set_geometry(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, struct compat_floppy_struct __user *arg)
{
struct floppy_struct v;
int drive, type;
@ -3663,7 +3665,7 @@ static int compat_set_geometry(struct block_device *bdev, fmode_t mode, unsigned
BUILD_BUG_ON(offsetof(struct floppy_struct, name) !=
offsetof(struct compat_floppy_struct, name));
if (!(mode & (FMODE_WRITE | FMODE_WRITE_IOCTL)))
if (!(mode & (BLK_OPEN_WRITE | BLK_OPEN_WRITE_IOCTL)))
return -EPERM;
memset(&v, 0, sizeof(struct floppy_struct));
@ -3860,8 +3862,8 @@ static int compat_werrorget(int drive,
return 0;
}
static int fd_compat_ioctl(struct block_device *bdev, fmode_t mode, unsigned int cmd,
unsigned long param)
static int fd_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
int drive = (long)bdev->bd_disk->private_data;
switch (cmd) {
@ -3962,7 +3964,7 @@ static void __init config_types(void)
pr_cont("\n");
}
static void floppy_release(struct gendisk *disk, fmode_t mode)
static void floppy_release(struct gendisk *disk)
{
int drive = (long)disk->private_data;
@ -3973,7 +3975,7 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
drive_state[drive].fd_ref = 0;
}
if (!drive_state[drive].fd_ref)
opened_bdev[drive] = NULL;
opened_disk[drive] = NULL;
mutex_unlock(&open_lock);
mutex_unlock(&floppy_mutex);
}
@ -3983,9 +3985,9 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
* /dev/PS0 etc), and disallows simultaneous access to the same
* drive with different device numbers.
*/
static int floppy_open(struct block_device *bdev, fmode_t mode)
static int floppy_open(struct gendisk *disk, blk_mode_t mode)
{
int drive = (long)bdev->bd_disk->private_data;
int drive = (long)disk->private_data;
int old_dev, new_dev;
int try;
int res = -EBUSY;
@ -3994,7 +3996,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
mutex_lock(&floppy_mutex);
mutex_lock(&open_lock);
old_dev = drive_state[drive].fd_device;
if (opened_bdev[drive] && opened_bdev[drive] != bdev)
if (opened_disk[drive] && opened_disk[drive] != disk)
goto out2;
if (!drive_state[drive].fd_ref && (drive_params[drive].flags & FD_BROKEN_DCL)) {
@ -4004,7 +4006,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
drive_state[drive].fd_ref++;
opened_bdev[drive] = bdev;
opened_disk[drive] = disk;
res = -ENXIO;
@ -4038,7 +4040,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
}
}
new_dev = MINOR(bdev->bd_dev);
new_dev = disk->first_minor;
drive_state[drive].fd_device = new_dev;
set_capacity(disks[drive][ITYPE(new_dev)], floppy_sizes[new_dev]);
if (old_dev != -1 && old_dev != new_dev) {
@ -4048,21 +4050,20 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
if (fdc_state[FDC(drive)].rawcmd == 1)
fdc_state[FDC(drive)].rawcmd = 2;
if (!(mode & FMODE_NDELAY)) {
if (mode & (FMODE_READ|FMODE_WRITE)) {
if (!(mode & BLK_OPEN_NDELAY)) {
if (mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) {
drive_state[drive].last_checked = 0;
clear_bit(FD_OPEN_SHOULD_FAIL_BIT,
&drive_state[drive].flags);
if (bdev_check_media_change(bdev))
floppy_revalidate(bdev->bd_disk);
if (disk_check_media_change(disk))
floppy_revalidate(disk);
if (test_bit(FD_DISK_CHANGED_BIT, &drive_state[drive].flags))
goto out;
if (test_bit(FD_OPEN_SHOULD_FAIL_BIT, &drive_state[drive].flags))
goto out;
}
res = -EROFS;
if ((mode & FMODE_WRITE) &&
if ((mode & BLK_OPEN_WRITE) &&
!test_bit(FD_DISK_WRITABLE_BIT, &drive_state[drive].flags))
goto out;
}
@ -4073,7 +4074,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
drive_state[drive].fd_ref--;
if (!drive_state[drive].fd_ref)
opened_bdev[drive] = NULL;
opened_disk[drive] = NULL;
out2:
mutex_unlock(&open_lock);
mutex_unlock(&floppy_mutex);
@ -4147,7 +4148,7 @@ static int __floppy_read_block_0(struct block_device *bdev, int drive)
cbdata.drive = drive;
bio_init(&bio, bdev, &bio_vec, 1, REQ_OP_READ);
bio_add_page(&bio, page, block_size(bdev), 0);
__bio_add_page(&bio, page, block_size(bdev), 0);
bio.bi_iter.bi_sector = 0;
bio.bi_flags |= (1 << BIO_QUIET);
@ -4203,7 +4204,8 @@ static int floppy_revalidate(struct gendisk *disk)
drive_state[drive].generation++;
if (drive_no_geom(drive)) {
/* auto-sensing */
res = __floppy_read_block_0(opened_bdev[drive], drive);
res = __floppy_read_block_0(opened_disk[drive]->part0,
drive);
} else {
if (cf)
poll_drive(false, FD_RAW_NEED_DISK);

View File

@ -990,7 +990,7 @@ loop_set_status_from_info(struct loop_device *lo,
return 0;
}
static int loop_configure(struct loop_device *lo, fmode_t mode,
static int loop_configure(struct loop_device *lo, blk_mode_t mode,
struct block_device *bdev,
const struct loop_config *config)
{
@ -1014,8 +1014,8 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
* If we don't hold exclusive handle for the device, upgrade to it
* here to avoid changing device under exclusive owner.
*/
if (!(mode & FMODE_EXCL)) {
error = bd_prepare_to_claim(bdev, loop_configure);
if (!(mode & BLK_OPEN_EXCL)) {
error = bd_prepare_to_claim(bdev, loop_configure, NULL);
if (error)
goto out_putf;
}
@ -1050,7 +1050,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
if (error)
goto out_unlock;
if (!(file->f_mode & FMODE_WRITE) || !(mode & FMODE_WRITE) ||
if (!(file->f_mode & FMODE_WRITE) || !(mode & BLK_OPEN_WRITE) ||
!file->f_op->write_iter)
lo->lo_flags |= LO_FLAGS_READ_ONLY;
@ -1116,7 +1116,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
if (partscan)
loop_reread_partitions(lo);
if (!(mode & FMODE_EXCL))
if (!(mode & BLK_OPEN_EXCL))
bd_abort_claiming(bdev, loop_configure);
return 0;
@ -1124,7 +1124,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
out_unlock:
loop_global_unlock(lo, is_loop);
out_bdev:
if (!(mode & FMODE_EXCL))
if (!(mode & BLK_OPEN_EXCL))
bd_abort_claiming(bdev, loop_configure);
out_putf:
fput(file);
@ -1528,7 +1528,7 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
return err;
}
static int lo_ioctl(struct block_device *bdev, fmode_t mode,
static int lo_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct loop_device *lo = bdev->bd_disk->private_data;
@ -1563,24 +1563,22 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
return loop_clr_fd(lo);
case LOOP_SET_STATUS:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN))
err = loop_set_status_old(lo, argp);
}
break;
case LOOP_GET_STATUS:
return loop_get_status_old(lo, argp);
case LOOP_SET_STATUS64:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
if ((mode & BLK_OPEN_WRITE) || capable(CAP_SYS_ADMIN))
err = loop_set_status64(lo, argp);
}
break;
case LOOP_GET_STATUS64:
return loop_get_status64(lo, argp);
case LOOP_SET_CAPACITY:
case LOOP_SET_DIRECT_IO:
case LOOP_SET_BLOCK_SIZE:
if (!(mode & FMODE_WRITE) && !capable(CAP_SYS_ADMIN))
if (!(mode & BLK_OPEN_WRITE) && !capable(CAP_SYS_ADMIN))
return -EPERM;
fallthrough;
default:
@ -1691,7 +1689,7 @@ loop_get_status_compat(struct loop_device *lo,
return err;
}
static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct loop_device *lo = bdev->bd_disk->private_data;
@ -1727,7 +1725,7 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
}
#endif
static void lo_release(struct gendisk *disk, fmode_t mode)
static void lo_release(struct gendisk *disk)
{
struct loop_device *lo = disk->private_data;

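Aside: loop_configure() keeps its trick of temporarily upgrading a non-exclusive opener to an exclusive claim while the device is being (re)configured; under the new API the claim names an explicit holder token and passes NULL holder ops. A sketch of just that bracketing, with an illustrative function name:

static int configure_exclusively(struct block_device *bdev, blk_mode_t mode)
{
	int err = 0;

	if (!(mode & BLK_OPEN_EXCL)) {
		err = bd_prepare_to_claim(bdev, configure_exclusively, NULL);
		if (err)
			return err;
	}

	/* ... reconfigure the device while no one else can claim it ... */

	if (!(mode & BLK_OPEN_EXCL))
		bd_abort_claiming(bdev, configure_exclusively);
	return err;
}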
View File

@ -3041,7 +3041,7 @@ static int rssd_disk_name_format(char *prefix,
* structure pointer.
*/
static int mtip_block_ioctl(struct block_device *dev,
fmode_t mode,
blk_mode_t mode,
unsigned cmd,
unsigned long arg)
{
@ -3079,7 +3079,7 @@ static int mtip_block_ioctl(struct block_device *dev,
* structure pointer.
*/
static int mtip_block_compat_ioctl(struct block_device *dev,
fmode_t mode,
blk_mode_t mode,
unsigned cmd,
unsigned long arg)
{

View File

@ -1502,7 +1502,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
return -ENOTTY;
}
static int nbd_ioctl(struct block_device *bdev, fmode_t mode,
static int nbd_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct nbd_device *nbd = bdev->bd_disk->private_data;
@ -1553,13 +1553,13 @@ static struct nbd_config *nbd_alloc_config(void)
return config;
}
static int nbd_open(struct block_device *bdev, fmode_t mode)
static int nbd_open(struct gendisk *disk, blk_mode_t mode)
{
struct nbd_device *nbd;
int ret = 0;
mutex_lock(&nbd_index_mutex);
nbd = bdev->bd_disk->private_data;
nbd = disk->private_data;
if (!nbd) {
ret = -ENXIO;
goto out;
@ -1587,17 +1587,17 @@ static int nbd_open(struct block_device *bdev, fmode_t mode)
refcount_inc(&nbd->refs);
mutex_unlock(&nbd->config_lock);
if (max_part)
set_bit(GD_NEED_PART_SCAN, &bdev->bd_disk->state);
set_bit(GD_NEED_PART_SCAN, &disk->state);
} else if (nbd_disconnected(nbd->config)) {
if (max_part)
set_bit(GD_NEED_PART_SCAN, &bdev->bd_disk->state);
set_bit(GD_NEED_PART_SCAN, &disk->state);
}
out:
mutex_unlock(&nbd_index_mutex);
return ret;
}
static void nbd_release(struct gendisk *disk, fmode_t mode)
static void nbd_release(struct gendisk *disk)
{
struct nbd_device *nbd = disk->private_data;
@ -1776,7 +1776,8 @@ static struct nbd_device *nbd_dev_add(int index, unsigned int refs)
if (err == -ENOSPC)
err = -EEXIST;
} else {
err = idr_alloc(&nbd_index_idr, nbd, 0, 0, GFP_KERNEL);
err = idr_alloc(&nbd_index_idr, nbd, 0,
(MINORMASK >> part_shift) + 1, GFP_KERNEL);
if (err >= 0)
index = err;
}

File diff suppressed because it is too large.

View File

@ -660,9 +660,9 @@ static bool pending_result_dec(struct pending_result *pending, int *result)
return true;
}
static int rbd_open(struct block_device *bdev, fmode_t mode)
static int rbd_open(struct gendisk *disk, blk_mode_t mode)
{
struct rbd_device *rbd_dev = bdev->bd_disk->private_data;
struct rbd_device *rbd_dev = disk->private_data;
bool removing = false;
spin_lock_irq(&rbd_dev->lock);
@ -679,7 +679,7 @@ static int rbd_open(struct block_device *bdev, fmode_t mode)
return 0;
}
static void rbd_release(struct gendisk *disk, fmode_t mode)
static void rbd_release(struct gendisk *disk)
{
struct rbd_device *rbd_dev = disk->private_data;
unsigned long open_count_before;

View File

@ -3,13 +3,11 @@
ccflags-y := -I$(srctree)/drivers/infiniband/ulp/rtrs
rnbd-client-y := rnbd-clt.o \
rnbd-clt-sysfs.o \
rnbd-common.o
rnbd-clt-sysfs.o
CFLAGS_rnbd-srv-trace.o = -I$(src)
rnbd-server-y := rnbd-common.o \
rnbd-srv.o \
rnbd-server-y := rnbd-srv.o \
rnbd-srv-sysfs.o \
rnbd-srv-trace.o

View File

@ -24,7 +24,9 @@
#include "rnbd-clt.h"
static struct device *rnbd_dev;
static struct class *rnbd_dev_class;
static const struct class rnbd_dev_class = {
.name = "rnbd_client",
};
static struct kobject *rnbd_devs_kobj;
enum {
@ -278,7 +280,7 @@ static ssize_t access_mode_show(struct kobject *kobj,
dev = container_of(kobj, struct rnbd_clt_dev, kobj);
return sysfs_emit(page, "%s\n", rnbd_access_mode_str(dev->access_mode));
return sysfs_emit(page, "%s\n", rnbd_access_modes[dev->access_mode].str);
}
static struct kobj_attribute rnbd_clt_access_mode =
@ -596,7 +598,7 @@ static ssize_t rnbd_clt_map_device_store(struct kobject *kobj,
pr_info("Mapping device %s on session %s, (access_mode: %s, nr_poll_queues: %d)\n",
pathname, sessname,
rnbd_access_mode_str(access_mode),
rnbd_access_modes[access_mode].str,
nr_poll_queues);
dev = rnbd_clt_map_device(sessname, paths, path_cnt, port_nr, pathname,
@ -646,11 +648,11 @@ int rnbd_clt_create_sysfs_files(void)
{
int err;
rnbd_dev_class = class_create("rnbd-client");
if (IS_ERR(rnbd_dev_class))
return PTR_ERR(rnbd_dev_class);
err = class_register(&rnbd_dev_class);
if (err)
return err;
rnbd_dev = device_create_with_groups(rnbd_dev_class, NULL,
rnbd_dev = device_create_with_groups(&rnbd_dev_class, NULL,
MKDEV(0, 0), NULL,
default_attr_groups, "ctl");
if (IS_ERR(rnbd_dev)) {
@ -666,9 +668,9 @@ int rnbd_clt_create_sysfs_files(void)
return 0;
dev_destroy:
device_destroy(rnbd_dev_class, MKDEV(0, 0));
device_destroy(&rnbd_dev_class, MKDEV(0, 0));
cls_destroy:
class_destroy(rnbd_dev_class);
class_unregister(&rnbd_dev_class);
return err;
}
@ -678,6 +680,6 @@ void rnbd_clt_destroy_sysfs_files(void)
sysfs_remove_group(&rnbd_dev->kobj, &default_attr_group);
kobject_del(rnbd_devs_kobj);
kobject_put(rnbd_devs_kobj);
device_destroy(rnbd_dev_class, MKDEV(0, 0));
class_destroy(rnbd_dev_class);
device_destroy(&rnbd_dev_class, MKDEV(0, 0));
class_unregister(&rnbd_dev_class);
}

View File

@ -921,11 +921,11 @@ rnbd_clt_session *find_or_create_sess(const char *sessname, bool *first)
return sess;
}
static int rnbd_client_open(struct block_device *block_device, fmode_t mode)
static int rnbd_client_open(struct gendisk *disk, blk_mode_t mode)
{
struct rnbd_clt_dev *dev = block_device->bd_disk->private_data;
struct rnbd_clt_dev *dev = disk->private_data;
if (get_disk_ro(dev->gd) && (mode & FMODE_WRITE))
if (get_disk_ro(dev->gd) && (mode & BLK_OPEN_WRITE))
return -EPERM;
if (dev->dev_state == DEV_STATE_UNMAPPED ||
@ -935,7 +935,7 @@ static int rnbd_client_open(struct block_device *block_device, fmode_t mode)
return 0;
}
static void rnbd_client_release(struct gendisk *gen, fmode_t mode)
static void rnbd_client_release(struct gendisk *gen)
{
struct rnbd_clt_dev *dev = gen->private_data;

View File

@ -1,23 +0,0 @@
// SPDX-License-Identifier: GPL-2.0-or-later
/*
* RDMA Network Block Driver
*
* Copyright (c) 2014 - 2018 ProfitBricks GmbH. All rights reserved.
* Copyright (c) 2018 - 2019 1&1 IONOS Cloud GmbH. All rights reserved.
* Copyright (c) 2019 - 2020 1&1 IONOS SE. All rights reserved.
*/
#include "rnbd-proto.h"
const char *rnbd_access_mode_str(enum rnbd_access_mode mode)
{
switch (mode) {
case RNBD_ACCESS_RO:
return "ro";
case RNBD_ACCESS_RW:
return "rw";
case RNBD_ACCESS_MIGRATION:
return "migration";
default:
return "unknown";
}
}

View File

@ -61,6 +61,15 @@ enum rnbd_access_mode {
RNBD_ACCESS_MIGRATION,
};
static const __maybe_unused struct {
enum rnbd_access_mode mode;
const char *str;
} rnbd_access_modes[] = {
[RNBD_ACCESS_RO] = {RNBD_ACCESS_RO, "ro"},
[RNBD_ACCESS_RW] = {RNBD_ACCESS_RW, "rw"},
[RNBD_ACCESS_MIGRATION] = {RNBD_ACCESS_MIGRATION, "migration"},
};
/**
* struct rnbd_msg_sess_info - initial session info from client to server
* @hdr: message header
@ -185,7 +194,6 @@ struct rnbd_msg_io {
enum rnbd_io_flags {
/* Operations */
RNBD_OP_READ = 0,
RNBD_OP_WRITE = 1,
RNBD_OP_FLUSH = 2,
@ -193,15 +201,9 @@ enum rnbd_io_flags {
RNBD_OP_SECURE_ERASE = 4,
RNBD_OP_WRITE_SAME = 5,
RNBD_OP_LAST,
/* Flags */
RNBD_F_SYNC = 1<<(RNBD_OP_BITS + 0),
RNBD_F_FUA = 1<<(RNBD_OP_BITS + 1),
RNBD_F_ALL = (RNBD_F_SYNC | RNBD_F_FUA)
};
static inline u32 rnbd_op(u32 flags)
@ -214,21 +216,6 @@ static inline u32 rnbd_flags(u32 flags)
return flags & ~RNBD_OP_MASK;
}
static inline bool rnbd_flags_supported(u32 flags)
{
u32 op;
op = rnbd_op(flags);
flags = rnbd_flags(flags);
if (op >= RNBD_OP_LAST)
return false;
if (flags & ~RNBD_F_ALL)
return false;
return true;
}
static inline blk_opf_t rnbd_to_bio_flags(u32 rnbd_opf)
{
blk_opf_t bio_opf;

View File

@ -9,7 +9,6 @@
#undef pr_fmt
#define pr_fmt(fmt) KBUILD_MODNAME " L" __stringify(__LINE__) ": " fmt
#include <uapi/linux/limits.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <linux/stat.h>
@ -20,7 +19,9 @@
#include "rnbd-srv.h"
static struct device *rnbd_dev;
static struct class *rnbd_dev_class;
static const struct class rnbd_dev_class = {
.name = "rnbd-server",
};
static struct kobject *rnbd_devs_kobj;
static void rnbd_srv_dev_release(struct kobject *kobj)
@ -88,8 +89,7 @@ static ssize_t read_only_show(struct kobject *kobj, struct kobj_attribute *attr,
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
return sysfs_emit(page, "%d\n",
!(sess_dev->open_flags & FMODE_WRITE));
return sysfs_emit(page, "%d\n", sess_dev->readonly);
}
static struct kobj_attribute rnbd_srv_dev_session_ro_attr =
@ -104,7 +104,7 @@ static ssize_t access_mode_show(struct kobject *kobj,
sess_dev = container_of(kobj, struct rnbd_srv_sess_dev, kobj);
return sysfs_emit(page, "%s\n",
rnbd_access_mode_str(sess_dev->access_mode));
rnbd_access_modes[sess_dev->access_mode].str);
}
static struct kobj_attribute rnbd_srv_dev_session_access_mode_attr =
@ -215,12 +215,12 @@ int rnbd_srv_create_sysfs_files(void)
{
int err;
rnbd_dev_class = class_create("rnbd-server");
if (IS_ERR(rnbd_dev_class))
return PTR_ERR(rnbd_dev_class);
err = class_register(&rnbd_dev_class);
if (err)
return err;
rnbd_dev = device_create(rnbd_dev_class, NULL,
MKDEV(0, 0), NULL, "ctl");
rnbd_dev = device_create(&rnbd_dev_class, NULL,
MKDEV(0, 0), NULL, "ctl");
if (IS_ERR(rnbd_dev)) {
err = PTR_ERR(rnbd_dev);
goto cls_destroy;
@ -234,9 +234,9 @@ int rnbd_srv_create_sysfs_files(void)
return 0;
dev_destroy:
device_destroy(rnbd_dev_class, MKDEV(0, 0));
device_destroy(&rnbd_dev_class, MKDEV(0, 0));
cls_destroy:
class_destroy(rnbd_dev_class);
class_unregister(&rnbd_dev_class);
return err;
}
@ -245,6 +245,6 @@ void rnbd_srv_destroy_sysfs_files(void)
{
kobject_del(rnbd_devs_kobj);
kobject_put(rnbd_devs_kobj);
device_destroy(rnbd_dev_class, MKDEV(0, 0));
class_destroy(rnbd_dev_class);
device_destroy(&rnbd_dev_class, MKDEV(0, 0));
class_unregister(&rnbd_dev_class);
}

View File

@ -96,7 +96,7 @@ rnbd_get_sess_dev(int dev_id, struct rnbd_srv_session *srv_sess)
ret = kref_get_unless_zero(&sess_dev->kref);
rcu_read_unlock();
if (!sess_dev || !ret)
if (!ret)
return ERR_PTR(-ENXIO);
return sess_dev;
@ -180,7 +180,7 @@ static void destroy_device(struct kref *kref)
WARN_ONCE(!list_empty(&dev->sess_dev_list),
"Device %s is being destroyed but still in use!\n",
dev->id);
dev->name);
spin_lock(&dev_lock);
list_del(&dev->list);
@ -219,10 +219,10 @@ void rnbd_destroy_sess_dev(struct rnbd_srv_sess_dev *sess_dev, bool keep_id)
rnbd_put_sess_dev(sess_dev);
wait_for_completion(&dc); /* wait for inflights to drop to zero */
blkdev_put(sess_dev->bdev, sess_dev->open_flags);
blkdev_put(sess_dev->bdev, NULL);
mutex_lock(&sess_dev->dev->lock);
list_del(&sess_dev->dev_list);
if (sess_dev->open_flags & FMODE_WRITE)
if (!sess_dev->readonly)
sess_dev->dev->open_write_cnt--;
mutex_unlock(&sess_dev->dev->lock);
@ -356,7 +356,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
const void *msg, size_t len,
void *data, size_t datalen);
static int process_msg_sess_info(struct rnbd_srv_session *srv_sess,
static void process_msg_sess_info(struct rnbd_srv_session *srv_sess,
const void *msg, size_t len,
void *data, size_t datalen);
@ -384,8 +384,7 @@ static int rnbd_srv_rdma_ev(void *priv, struct rtrs_srv_op *id,
ret = process_msg_open(srv_sess, usr, usrlen, data, datalen);
break;
case RNBD_MSG_SESS_INFO:
ret = process_msg_sess_info(srv_sess, usr, usrlen, data,
datalen);
process_msg_sess_info(srv_sess, usr, usrlen, data, datalen);
break;
default:
pr_warn("Received unexpected message type %d from session %s\n",
@ -431,7 +430,7 @@ static struct rnbd_srv_dev *rnbd_srv_init_srv_dev(struct block_device *bdev)
if (!dev)
return ERR_PTR(-ENOMEM);
snprintf(dev->id, sizeof(dev->id), "%pg", bdev);
snprintf(dev->name, sizeof(dev->name), "%pg", bdev);
kref_init(&dev->kref);
INIT_LIST_HEAD(&dev->sess_dev_list);
mutex_init(&dev->lock);
@ -446,7 +445,7 @@ rnbd_srv_find_or_add_srv_dev(struct rnbd_srv_dev *new_dev)
spin_lock(&dev_lock);
list_for_each_entry(dev, &dev_list, list) {
if (!strncmp(dev->id, new_dev->id, sizeof(dev->id))) {
if (!strncmp(dev->name, new_dev->name, sizeof(dev->name))) {
if (!kref_get_unless_zero(&dev->kref))
/*
* We lost the race, device is almost dead.
@ -467,39 +466,38 @@ static int rnbd_srv_check_update_open_perm(struct rnbd_srv_dev *srv_dev,
struct rnbd_srv_session *srv_sess,
enum rnbd_access_mode access_mode)
{
int ret = -EPERM;
int ret = 0;
mutex_lock(&srv_dev->lock);
switch (access_mode) {
case RNBD_ACCESS_RO:
ret = 0;
break;
case RNBD_ACCESS_RW:
if (srv_dev->open_write_cnt == 0) {
srv_dev->open_write_cnt++;
ret = 0;
} else {
pr_err("Mapping device '%s' for session %s with RW permissions failed. Device already opened as 'RW' by %d client(s), access mode %s.\n",
srv_dev->id, srv_sess->sessname,
srv_dev->name, srv_sess->sessname,
srv_dev->open_write_cnt,
rnbd_access_mode_str(access_mode));
rnbd_access_modes[access_mode].str);
ret = -EPERM;
}
break;
case RNBD_ACCESS_MIGRATION:
if (srv_dev->open_write_cnt < 2) {
srv_dev->open_write_cnt++;
ret = 0;
} else {
pr_err("Mapping device '%s' for session %s with migration permissions failed. Device already opened as 'RW' by %d client(s), access mode %s.\n",
srv_dev->id, srv_sess->sessname,
srv_dev->name, srv_sess->sessname,
srv_dev->open_write_cnt,
rnbd_access_mode_str(access_mode));
rnbd_access_modes[access_mode].str);
ret = -EPERM;
}
break;
default:
pr_err("Received mapping request for device '%s' on session %s with invalid access mode: %d\n",
srv_dev->id, srv_sess->sessname, access_mode);
srv_dev->name, srv_sess->sessname, access_mode);
ret = -EINVAL;
}
@ -561,7 +559,7 @@ static void rnbd_srv_fill_msg_open_rsp(struct rnbd_msg_open_rsp *rsp,
static struct rnbd_srv_sess_dev *
rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
const struct rnbd_msg_open *open_msg,
struct block_device *bdev, fmode_t open_flags,
struct block_device *bdev, bool readonly,
struct rnbd_srv_dev *srv_dev)
{
struct rnbd_srv_sess_dev *sdev = rnbd_sess_dev_alloc(srv_sess);
@ -576,7 +574,7 @@ rnbd_srv_create_set_sess_dev(struct rnbd_srv_session *srv_sess,
sdev->bdev = bdev;
sdev->sess = srv_sess;
sdev->dev = srv_dev;
sdev->open_flags = open_flags;
sdev->readonly = readonly;
sdev->access_mode = open_msg->access_mode;
return sdev;
@ -631,7 +629,7 @@ static char *rnbd_srv_get_full_path(struct rnbd_srv_session *srv_sess,
return full_path;
}
static int process_msg_sess_info(struct rnbd_srv_session *srv_sess,
static void process_msg_sess_info(struct rnbd_srv_session *srv_sess,
const void *msg, size_t len,
void *data, size_t datalen)
{
@ -644,8 +642,6 @@ static int process_msg_sess_info(struct rnbd_srv_session *srv_sess,
rsp->hdr.type = cpu_to_le16(RNBD_MSG_SESS_INFO_RSP);
rsp->ver = srv_sess->ver;
return 0;
}
/**
@ -681,15 +677,14 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
struct rnbd_srv_sess_dev *srv_sess_dev;
const struct rnbd_msg_open *open_msg = msg;
struct block_device *bdev;
fmode_t open_flags;
blk_mode_t open_flags = BLK_OPEN_READ;
char *full_path;
struct rnbd_msg_open_rsp *rsp = data;
trace_process_msg_open(srv_sess, open_msg);
open_flags = FMODE_READ;
if (open_msg->access_mode != RNBD_ACCESS_RO)
open_flags |= FMODE_WRITE;
open_flags |= BLK_OPEN_WRITE;
mutex_lock(&srv_sess->lock);
@ -719,7 +714,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
goto reject;
}
bdev = blkdev_get_by_path(full_path, open_flags, THIS_MODULE);
bdev = blkdev_get_by_path(full_path, open_flags, NULL, NULL);
if (IS_ERR(bdev)) {
ret = PTR_ERR(bdev);
pr_err("Opening device '%s' on session %s failed, failed to open the block device, err: %d\n",
@ -736,9 +731,9 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
goto blkdev_put;
}
srv_sess_dev = rnbd_srv_create_set_sess_dev(srv_sess, open_msg,
bdev, open_flags,
srv_dev);
srv_sess_dev = rnbd_srv_create_set_sess_dev(srv_sess, open_msg, bdev,
open_msg->access_mode == RNBD_ACCESS_RO,
srv_dev);
if (IS_ERR(srv_sess_dev)) {
pr_err("Opening device '%s' on session %s failed, creating sess_dev failed, err: %ld\n",
full_path, srv_sess->sessname, PTR_ERR(srv_sess_dev));
@ -774,7 +769,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
list_add(&srv_sess_dev->dev_list, &srv_dev->sess_dev_list);
mutex_unlock(&srv_dev->lock);
rnbd_srv_info(srv_sess_dev, "Opened device '%s'\n", srv_dev->id);
rnbd_srv_info(srv_sess_dev, "Opened device '%s'\n", srv_dev->name);
kfree(full_path);
@ -795,7 +790,7 @@ static int process_msg_open(struct rnbd_srv_session *srv_sess,
}
rnbd_put_srv_dev(srv_dev);
blkdev_put:
blkdev_put(bdev, open_flags);
blkdev_put(bdev, NULL);
free_path:
kfree(full_path);
reject:
@ -808,7 +803,7 @@ static struct rtrs_srv_ctx *rtrs_ctx;
static struct rtrs_srv_ops rtrs_ops;
static int __init rnbd_srv_init_module(void)
{
int err;
int err = 0;
BUILD_BUG_ON(sizeof(struct rnbd_msg_hdr) != 4);
BUILD_BUG_ON(sizeof(struct rnbd_msg_sess_info) != 36);
@ -822,19 +817,17 @@ static int __init rnbd_srv_init_module(void)
};
rtrs_ctx = rtrs_srv_open(&rtrs_ops, port_nr);
if (IS_ERR(rtrs_ctx)) {
err = PTR_ERR(rtrs_ctx);
pr_err("rtrs_srv_open(), err: %d\n", err);
return err;
return PTR_ERR(rtrs_ctx);
}
err = rnbd_srv_create_sysfs_files();
if (err) {
pr_err("rnbd_srv_create_sysfs_files(), err: %d\n", err);
rtrs_srv_close(rtrs_ctx);
return err;
}
return 0;
return err;
}
static void __exit rnbd_srv_cleanup_module(void)

View File

@ -35,7 +35,7 @@ struct rnbd_srv_dev {
struct kobject dev_kobj;
struct kobject *dev_sessions_kobj;
struct kref kref;
char id[NAME_MAX];
char name[NAME_MAX];
/* List of rnbd_srv_sess_dev structs */
struct list_head sess_dev_list;
struct mutex lock;
@ -52,7 +52,7 @@ struct rnbd_srv_sess_dev {
struct kobject kobj;
u32 device_id;
bool keep_id;
fmode_t open_flags;
bool readonly;
struct kref kref;
struct completion *destroy_comp;
char pathname[NAME_MAX];

View File

@ -139,7 +139,7 @@ static int vdc_getgeo(struct block_device *bdev, struct hd_geometry *geo)
* when vdisk_mtype is VD_MEDIA_TYPE_CD or VD_MEDIA_TYPE_DVD.
* Needed to be able to install inside an ldom from an iso image.
*/
static int vdc_ioctl(struct block_device *bdev, fmode_t mode,
static int vdc_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned command, unsigned long argument)
{
struct vdc_port *port = bdev->bd_disk->private_data;

View File

@ -608,20 +608,18 @@ static void setup_medium(struct floppy_state *fs)
}
}
static int floppy_open(struct block_device *bdev, fmode_t mode)
static int floppy_open(struct gendisk *disk, blk_mode_t mode)
{
struct floppy_state *fs = bdev->bd_disk->private_data;
struct floppy_state *fs = disk->private_data;
struct swim __iomem *base = fs->swd->base;
int err;
if (fs->ref_count == -1 || (fs->ref_count && mode & FMODE_EXCL))
if (fs->ref_count == -1 || (fs->ref_count && mode & BLK_OPEN_EXCL))
return -EBUSY;
if (mode & FMODE_EXCL)
if (mode & BLK_OPEN_EXCL)
fs->ref_count = -1;
else
fs->ref_count++;
swim_write(base, setup, S_IBM_DRIVE | S_FCLK_DIV2);
udelay(10);
swim_drive(base, fs->location);
@ -636,13 +634,13 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
set_capacity(fs->disk, fs->total_secs);
if (mode & FMODE_NDELAY)
if (mode & BLK_OPEN_NDELAY)
return 0;
if (mode & (FMODE_READ|FMODE_WRITE)) {
if (bdev_check_media_change(bdev) && fs->disk_in)
if (mode & (BLK_OPEN_READ | BLK_OPEN_WRITE)) {
if (disk_check_media_change(disk) && fs->disk_in)
fs->ejected = 0;
if ((mode & FMODE_WRITE) && fs->write_protected) {
if ((mode & BLK_OPEN_WRITE) && fs->write_protected) {
err = -EROFS;
goto out;
}
@ -659,18 +657,18 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
return err;
}
static int floppy_unlocked_open(struct block_device *bdev, fmode_t mode)
static int floppy_unlocked_open(struct gendisk *disk, blk_mode_t mode)
{
int ret;
mutex_lock(&swim_mutex);
ret = floppy_open(bdev, mode);
ret = floppy_open(disk, mode);
mutex_unlock(&swim_mutex);
return ret;
}
static void floppy_release(struct gendisk *disk, fmode_t mode)
static void floppy_release(struct gendisk *disk)
{
struct floppy_state *fs = disk->private_data;
struct swim __iomem *base = fs->swd->base;
@ -686,7 +684,7 @@ static void floppy_release(struct gendisk *disk, fmode_t mode)
mutex_unlock(&swim_mutex);
}
static int floppy_ioctl(struct block_device *bdev, fmode_t mode,
static int floppy_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
struct floppy_state *fs = bdev->bd_disk->private_data;

View File

@ -246,10 +246,9 @@ static int grab_drive(struct floppy_state *fs, enum swim_state state,
int interruptible);
static void release_drive(struct floppy_state *fs);
static int fd_eject(struct floppy_state *fs);
static int floppy_ioctl(struct block_device *bdev, fmode_t mode,
static int floppy_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param);
static int floppy_open(struct block_device *bdev, fmode_t mode);
static void floppy_release(struct gendisk *disk, fmode_t mode);
static int floppy_open(struct gendisk *disk, blk_mode_t mode);
static unsigned int floppy_check_events(struct gendisk *disk,
unsigned int clearing);
static int floppy_revalidate(struct gendisk *disk);
@ -883,7 +882,7 @@ static int fd_eject(struct floppy_state *fs)
static struct floppy_struct floppy_type =
{ 2880,18,2,80,0,0x1B,0x00,0xCF,0x6C,NULL }; /* 7 1.44MB 3.5" */
static int floppy_locked_ioctl(struct block_device *bdev, fmode_t mode,
static int floppy_locked_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
struct floppy_state *fs = bdev->bd_disk->private_data;
@ -911,7 +910,7 @@ static int floppy_locked_ioctl(struct block_device *bdev, fmode_t mode,
return -ENOTTY;
}
static int floppy_ioctl(struct block_device *bdev, fmode_t mode,
static int floppy_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long param)
{
int ret;
@ -923,9 +922,9 @@ static int floppy_ioctl(struct block_device *bdev, fmode_t mode,
return ret;
}
static int floppy_open(struct block_device *bdev, fmode_t mode)
static int floppy_open(struct gendisk *disk, blk_mode_t mode)
{
struct floppy_state *fs = bdev->bd_disk->private_data;
struct floppy_state *fs = disk->private_data;
struct swim3 __iomem *sw = fs->swim3;
int n, err = 0;
@ -958,18 +957,18 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
swim3_action(fs, SETMFM);
swim3_select(fs, RELAX);
} else if (fs->ref_count == -1 || mode & FMODE_EXCL)
} else if (fs->ref_count == -1 || mode & BLK_OPEN_EXCL)
return -EBUSY;
if (err == 0 && (mode & FMODE_NDELAY) == 0
&& (mode & (FMODE_READ|FMODE_WRITE))) {
if (bdev_check_media_change(bdev))
floppy_revalidate(bdev->bd_disk);
if (err == 0 && !(mode & BLK_OPEN_NDELAY) &&
(mode & (BLK_OPEN_READ | BLK_OPEN_WRITE))) {
if (disk_check_media_change(disk))
floppy_revalidate(disk);
if (fs->ejected)
err = -ENXIO;
}
if (err == 0 && (mode & FMODE_WRITE)) {
if (err == 0 && (mode & BLK_OPEN_WRITE)) {
if (fs->write_prot < 0)
fs->write_prot = swim3_readbit(fs, WRITE_PROT);
if (fs->write_prot)
@ -985,7 +984,7 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
return err;
}
if (mode & FMODE_EXCL)
if (mode & BLK_OPEN_EXCL)
fs->ref_count = -1;
else
++fs->ref_count;
@ -993,18 +992,18 @@ static int floppy_open(struct block_device *bdev, fmode_t mode)
return 0;
}
static int floppy_unlocked_open(struct block_device *bdev, fmode_t mode)
static int floppy_unlocked_open(struct gendisk *disk, blk_mode_t mode)
{
int ret;
mutex_lock(&swim3_mutex);
ret = floppy_open(bdev, mode);
ret = floppy_open(disk, mode);
mutex_unlock(&swim3_mutex);
return ret;
}
static void floppy_release(struct gendisk *disk, fmode_t mode)
static void floppy_release(struct gendisk *disk)
{
struct floppy_state *fs = disk->private_data;
struct swim3 __iomem *sw = fs->swim3;


@ -43,6 +43,7 @@
#include <asm/page.h>
#include <linux/task_work.h>
#include <linux/namei.h>
#include <linux/kref.h>
#include <uapi/linux/ublk_cmd.h>
#define UBLK_MINORS (1U << MINORBITS)
@ -54,7 +55,8 @@
| UBLK_F_USER_RECOVERY \
| UBLK_F_USER_RECOVERY_REISSUE \
| UBLK_F_UNPRIVILEGED_DEV \
| UBLK_F_CMD_IOCTL_ENCODE)
| UBLK_F_CMD_IOCTL_ENCODE \
| UBLK_F_USER_COPY)
/* All UBLK_PARAM_TYPE_* should be included here */
#define UBLK_PARAM_TYPE_ALL (UBLK_PARAM_TYPE_BASIC | \
@ -62,7 +64,8 @@
struct ublk_rq_data {
struct llist_node node;
struct callback_head work;
struct kref ref;
};
struct ublk_uring_cmd_pdu {
@ -182,8 +185,13 @@ struct ublk_params_header {
__u32 types;
};
static inline void __ublk_complete_rq(struct request *req);
static void ublk_complete_rq(struct kref *ref);
static dev_t ublk_chr_devt;
static struct class *ublk_chr_class;
static const struct class ublk_chr_class = {
.name = "ublk-char",
};
static DEFINE_IDR(ublk_index_idr);
static DEFINE_SPINLOCK(ublk_idr_lock);
@ -202,6 +210,23 @@ static unsigned int ublks_added; /* protected by ublk_ctl_mutex */
static struct miscdevice ublk_misc;
static inline unsigned ublk_pos_to_hwq(loff_t pos)
{
return ((pos - UBLKSRV_IO_BUF_OFFSET) >> UBLK_QID_OFF) &
UBLK_QID_BITS_MASK;
}
static inline unsigned ublk_pos_to_buf_off(loff_t pos)
{
return (pos - UBLKSRV_IO_BUF_OFFSET) & UBLK_IO_BUF_BITS_MASK;
}
static inline unsigned ublk_pos_to_tag(loff_t pos)
{
return ((pos - UBLKSRV_IO_BUF_OFFSET) >> UBLK_TAG_OFF) &
UBLK_TAG_BITS_MASK;
}
static void ublk_dev_param_basic_apply(struct ublk_device *ub)
{
struct request_queue *q = ub->ub_disk->queue;
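The three helpers above decode the file position of a read()/write() on the ublk char device into hardware queue, tag, and offset inside the request; this is the addressing scheme behind the new UBLK_F_USER_COPY mode. A hedged sketch of the inverse computation as a userspace server might do it, assuming the UBLK_QID_OFF/UBLK_TAG_OFF/UBLKSRV_IO_BUF_OFFSET constants are the ones exported through <linux/ublk_cmd.h>:

#include <stdint.h>
#include <linux/ublk_cmd.h>

/* Encode (queue id, tag, offset inside the request) into the char-device
 * position that the kernel helpers above decode again. */
static inline uint64_t ublk_user_copy_pos(uint16_t q_id, uint16_t tag,
					  uint32_t off_in_req)
{
	return UBLKSRV_IO_BUF_OFFSET +
	       ((uint64_t)q_id << UBLK_QID_OFF) +
	       ((uint64_t)tag << UBLK_TAG_OFF) +
	       off_in_req;
}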
@ -290,12 +315,52 @@ static int ublk_apply_params(struct ublk_device *ub)
return 0;
}
static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
static inline bool ublk_support_user_copy(const struct ublk_queue *ubq)
{
if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
!(ubq->flags & UBLK_F_URING_CMD_COMP_IN_TASK))
return true;
return false;
return ubq->flags & UBLK_F_USER_COPY;
}
static inline bool ublk_need_req_ref(const struct ublk_queue *ubq)
{
/*
* read()/write() is involved in user copy, so request reference
* has to be grabbed
*/
return ublk_support_user_copy(ubq);
}
static inline void ublk_init_req_ref(const struct ublk_queue *ubq,
struct request *req)
{
if (ublk_need_req_ref(ubq)) {
struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
kref_init(&data->ref);
}
}
static inline bool ublk_get_req_ref(const struct ublk_queue *ubq,
struct request *req)
{
if (ublk_need_req_ref(ubq)) {
struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
return kref_get_unless_zero(&data->ref);
}
return true;
}
static inline void ublk_put_req_ref(const struct ublk_queue *ubq,
struct request *req)
{
if (ublk_need_req_ref(ubq)) {
struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
kref_put(&data->ref, ublk_complete_rq);
} else {
__ublk_complete_rq(req);
}
}
static inline bool ublk_need_get_data(const struct ublk_queue *ubq)
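The ublk_*_req_ref() helpers above wrap a kref embedded in struct ublk_rq_data so a request is completed exactly once, no matter whether the last reference is dropped by the normal completion path or by a concurrent read()/write() doing user copy. A self-contained sketch of the same kref pattern with illustrative names (the release callback is where ublk runs __ublk_complete_rq()):

#include <linux/container_of.h>
#include <linux/kref.h>
#include <linux/slab.h>

struct demo_req_data {
	struct kref ref;
	/* payload ... */
};

static void demo_req_release(struct kref *ref)
{
	struct demo_req_data *data =
		container_of(ref, struct demo_req_data, ref);

	kfree(data);	/* last reference gone: safe to tear down */
}

static struct demo_req_data *demo_req_alloc(void)
{
	struct demo_req_data *data = kzalloc(sizeof(*data), GFP_KERNEL);

	if (data)
		kref_init(&data->ref);		/* count starts at 1 */
	return data;
}

static bool demo_req_tryget(struct demo_req_data *data)
{
	/* fails once the count has already dropped to zero */
	return kref_get_unless_zero(&data->ref);
}

static void demo_req_put(struct demo_req_data *data)
{
	kref_put(&data->ref, demo_req_release);
}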
@ -384,9 +449,9 @@ static void ublk_store_owner_uid_gid(unsigned int *owner_uid,
*owner_gid = from_kgid(&init_user_ns, gid);
}
static int ublk_open(struct block_device *bdev, fmode_t mode)
static int ublk_open(struct gendisk *disk, blk_mode_t mode)
{
struct ublk_device *ub = bdev->bd_disk->private_data;
struct ublk_device *ub = disk->private_data;
if (capable(CAP_SYS_ADMIN))
return 0;
@ -421,49 +486,39 @@ static const struct block_device_operations ub_fops = {
#define UBLK_MAX_PIN_PAGES 32
struct ublk_map_data {
const struct request *rq;
unsigned long ubuf;
unsigned int len;
};
struct ublk_io_iter {
struct page *pages[UBLK_MAX_PIN_PAGES];
unsigned pg_off; /* offset in the 1st page in pages */
int nr_pages; /* how many page pointers in pages */
struct bio *bio;
struct bvec_iter iter;
};
static inline unsigned ublk_copy_io_pages(struct ublk_io_iter *data,
unsigned max_bytes, bool to_vm)
/* return how many pages are copied */
static void ublk_copy_io_pages(struct ublk_io_iter *data,
size_t total, size_t pg_off, int dir)
{
const unsigned total = min_t(unsigned, max_bytes,
PAGE_SIZE - data->pg_off +
((data->nr_pages - 1) << PAGE_SHIFT));
unsigned done = 0;
unsigned pg_idx = 0;
while (done < total) {
struct bio_vec bv = bio_iter_iovec(data->bio, data->iter);
const unsigned int bytes = min3(bv.bv_len, total - done,
(unsigned)(PAGE_SIZE - data->pg_off));
unsigned int bytes = min3(bv.bv_len, (unsigned)total - done,
(unsigned)(PAGE_SIZE - pg_off));
void *bv_buf = bvec_kmap_local(&bv);
void *pg_buf = kmap_local_page(data->pages[pg_idx]);
if (to_vm)
memcpy(pg_buf + data->pg_off, bv_buf, bytes);
if (dir == ITER_DEST)
memcpy(pg_buf + pg_off, bv_buf, bytes);
else
memcpy(bv_buf, pg_buf + data->pg_off, bytes);
memcpy(bv_buf, pg_buf + pg_off, bytes);
kunmap_local(pg_buf);
kunmap_local(bv_buf);
/* advance page array */
data->pg_off += bytes;
if (data->pg_off == PAGE_SIZE) {
pg_off += bytes;
if (pg_off == PAGE_SIZE) {
pg_idx += 1;
data->pg_off = 0;
pg_off = 0;
}
done += bytes;
@ -477,41 +532,58 @@ static inline unsigned ublk_copy_io_pages(struct ublk_io_iter *data,
data->iter = data->bio->bi_iter;
}
}
return done;
}
static int ublk_copy_user_pages(struct ublk_map_data *data, bool to_vm)
static bool ublk_advance_io_iter(const struct request *req,
struct ublk_io_iter *iter, unsigned int offset)
{
const unsigned int gup_flags = to_vm ? FOLL_WRITE : 0;
const unsigned long start_vm = data->ubuf;
unsigned int done = 0;
struct ublk_io_iter iter = {
.pg_off = start_vm & (PAGE_SIZE - 1),
.bio = data->rq->bio,
.iter = data->rq->bio->bi_iter,
};
const unsigned int nr_pages = round_up(data->len +
(start_vm & (PAGE_SIZE - 1)), PAGE_SIZE) >> PAGE_SHIFT;
struct bio *bio = req->bio;
while (done < nr_pages) {
const unsigned to_pin = min_t(unsigned, UBLK_MAX_PIN_PAGES,
nr_pages - done);
unsigned i, len;
for_each_bio(bio) {
if (bio->bi_iter.bi_size > offset) {
iter->bio = bio;
iter->iter = bio->bi_iter;
bio_advance_iter(iter->bio, &iter->iter, offset);
return true;
}
offset -= bio->bi_iter.bi_size;
}
return false;
}
iter.nr_pages = get_user_pages_fast(start_vm +
(done << PAGE_SHIFT), to_pin, gup_flags,
iter.pages);
if (iter.nr_pages <= 0)
return done == 0 ? iter.nr_pages : done;
len = ublk_copy_io_pages(&iter, data->len, to_vm);
for (i = 0; i < iter.nr_pages; i++) {
if (to_vm)
/*
* Copy data between request pages and io_iter, and 'offset'
* is the start point of linear offset of request.
*/
static size_t ublk_copy_user_pages(const struct request *req,
unsigned offset, struct iov_iter *uiter, int dir)
{
struct ublk_io_iter iter;
size_t done = 0;
if (!ublk_advance_io_iter(req, &iter, offset))
return 0;
while (iov_iter_count(uiter) && iter.bio) {
unsigned nr_pages;
ssize_t len;
size_t off;
int i;
len = iov_iter_get_pages2(uiter, iter.pages,
iov_iter_count(uiter),
UBLK_MAX_PIN_PAGES, &off);
if (len <= 0)
return done;
ublk_copy_io_pages(&iter, len, off, dir);
nr_pages = DIV_ROUND_UP(len + off, PAGE_SIZE);
for (i = 0; i < nr_pages; i++) {
if (dir == ITER_DEST)
set_page_dirty(iter.pages[i]);
put_page(iter.pages[i]);
}
data->len -= len;
done += iter.nr_pages;
done += len;
}
return done;
@ -532,21 +604,23 @@ static int ublk_map_io(const struct ublk_queue *ubq, const struct request *req,
{
const unsigned int rq_bytes = blk_rq_bytes(req);
if (ublk_support_user_copy(ubq))
return rq_bytes;
/*
* no zero copy, we delay copy WRITE request data into ublksrv
* context and the big benefit is that pinning pages in current
* context is pretty fast, see ublk_pin_user_pages
*/
if (ublk_need_map_req(req)) {
struct ublk_map_data data = {
.rq = req,
.ubuf = io->addr,
.len = rq_bytes,
};
struct iov_iter iter;
struct iovec iov;
const int dir = ITER_DEST;
ublk_copy_user_pages(&data, true);
import_single_range(dir, u64_to_user_ptr(io->addr), rq_bytes,
&iov, &iter);
return rq_bytes - data.len;
return ublk_copy_user_pages(req, 0, &iter, dir);
}
return rq_bytes;
}
@ -557,18 +631,19 @@ static int ublk_unmap_io(const struct ublk_queue *ubq,
{
const unsigned int rq_bytes = blk_rq_bytes(req);
if (ublk_support_user_copy(ubq))
return rq_bytes;
if (ublk_need_unmap_req(req)) {
struct ublk_map_data data = {
.rq = req,
.ubuf = io->addr,
.len = io->res,
};
struct iov_iter iter;
struct iovec iov;
const int dir = ITER_SOURCE;
WARN_ON_ONCE(io->res > rq_bytes);
ublk_copy_user_pages(&data, false);
return io->res - data.len;
import_single_range(dir, u64_to_user_ptr(io->addr), io->res,
&iov, &iter);
return ublk_copy_user_pages(req, 0, &iter, dir);
}
return rq_bytes;
}
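Both ublk_map_io() and ublk_unmap_io() now wrap the single user buffer in an iov_iter with import_single_range() and let ublk_copy_user_pages() pin pages via iov_iter_get_pages2(). A hedged sketch of just those two helpers, with error handling trimmed and illustrative names (dir is ITER_SOURCE or ITER_DEST):

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/uio.h>

static ssize_t demo_pin_user_buf(void __user *ubuf, size_t len, int dir)
{
	struct page *pages[16];
	struct iovec iov;
	struct iov_iter iter;
	size_t off;
	ssize_t pinned;
	unsigned int i;
	int ret;

	/* build an iov_iter over one user range */
	ret = import_single_range(dir, ubuf, len, &iov, &iter);
	if (ret)
		return ret;

	/* pin up to 16 pages; returns bytes covered, 'off' is the byte
	 * offset into the first pinned page */
	pinned = iov_iter_get_pages2(&iter, pages, len,
				     ARRAY_SIZE(pages), &off);
	if (pinned <= 0)
		return pinned;

	/* ... copy to/from the pages here, as ublk_copy_io_pages() does,
	 * and call set_page_dirty() on pages that were written to ... */

	for (i = 0; i < DIV_ROUND_UP(pinned + off, PAGE_SIZE); i++)
		put_page(pages[i]);

	return pinned;
}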
@ -648,13 +723,19 @@ static inline bool ubq_daemon_is_dying(struct ublk_queue *ubq)
}
/* todo: handle partial completion */
static void ublk_complete_rq(struct request *req)
static inline void __ublk_complete_rq(struct request *req)
{
struct ublk_queue *ubq = req->mq_hctx->driver_data;
struct ublk_io *io = &ubq->ios[req->tag];
unsigned int unmapped_bytes;
blk_status_t res = BLK_STS_OK;
/* called from ublk_abort_queue() code path */
if (io->flags & UBLK_IO_FLAG_ABORTED) {
res = BLK_STS_IOERR;
goto exit;
}
/* failed read IO if nothing is read */
if (!io->res && req_op(req) == REQ_OP_READ)
io->res = -EIO;
@ -694,6 +775,15 @@ static void ublk_complete_rq(struct request *req)
blk_mq_end_request(req, res);
}
static void ublk_complete_rq(struct kref *ref)
{
struct ublk_rq_data *data = container_of(ref, struct ublk_rq_data,
ref);
struct request *req = blk_mq_rq_from_pdu(data);
__ublk_complete_rq(req);
}
/*
* Since __ublk_rq_task_work always fails requests immediately during
* exiting, __ublk_fail_req() is only called from abort context during
@ -712,7 +802,7 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
if (ublk_queue_can_use_recovery_reissue(ubq))
blk_mq_requeue_request(req, false);
else
blk_mq_end_request(req, BLK_STS_IOERR);
ublk_put_req_ref(ubq, req);
}
}
@ -821,6 +911,7 @@ static inline void __ublk_rq_task_work(struct request *req,
mapped_bytes >> 9;
}
ublk_init_req_ref(ubq, req);
ubq_complete_io_cmd(io, UBLK_IO_RES_OK, issue_flags);
}
@ -852,17 +943,6 @@ static void ublk_rq_task_work_cb(struct io_uring_cmd *cmd, unsigned issue_flags)
ublk_forward_io_cmds(ubq, issue_flags);
}
static void ublk_rq_task_work_fn(struct callback_head *work)
{
struct ublk_rq_data *data = container_of(work,
struct ublk_rq_data, work);
struct request *req = blk_mq_rq_from_pdu(data);
struct ublk_queue *ubq = req->mq_hctx->driver_data;
unsigned issue_flags = IO_URING_F_UNLOCKED;
ublk_forward_io_cmds(ubq, issue_flags);
}
static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
{
struct ublk_rq_data *data = blk_mq_rq_to_pdu(rq);
@ -886,10 +966,6 @@ static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
*/
if (unlikely(io->flags & UBLK_IO_FLAG_ABORTED)) {
ublk_abort_io_cmds(ubq);
} else if (ublk_can_use_task_work(ubq)) {
if (task_work_add(ubq->ubq_daemon, &data->work,
TWA_SIGNAL_NO_IPI))
ublk_abort_io_cmds(ubq);
} else {
struct io_uring_cmd *cmd = io->cmd;
struct ublk_uring_cmd_pdu *pdu = ublk_get_uring_cmd_pdu(cmd);
@ -961,19 +1037,9 @@ static int ublk_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
return 0;
}
static int ublk_init_rq(struct blk_mq_tag_set *set, struct request *req,
unsigned int hctx_idx, unsigned int numa_node)
{
struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
init_task_work(&data->work, ublk_rq_task_work_fn);
return 0;
}
static const struct blk_mq_ops ublk_mq_ops = {
.queue_rq = ublk_queue_rq,
.init_hctx = ublk_init_hctx,
.init_request = ublk_init_rq,
.timeout = ublk_timeout,
};
@ -1050,7 +1116,7 @@ static void ublk_commit_completion(struct ublk_device *ub,
req = blk_mq_tag_to_rq(ub->tag_set.tags[qid], tag);
if (req && likely(!blk_should_fake_timeout(req->q)))
ublk_complete_rq(req);
ublk_put_req_ref(ubq, req);
}
/*
@ -1295,6 +1361,14 @@ static inline int ublk_check_cmd_op(u32 cmd_op)
return 0;
}
static inline void ublk_fill_io_cmd(struct ublk_io *io,
struct io_uring_cmd *cmd, unsigned long buf_addr)
{
io->cmd = cmd;
io->flags |= UBLK_IO_FLAG_ACTIVE;
io->addr = buf_addr;
}
static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
unsigned int issue_flags,
const struct ublksrv_io_cmd *ub_cmd)
@ -1340,6 +1414,11 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
^ (_IOC_NR(cmd_op) == UBLK_IO_NEED_GET_DATA))
goto out;
if (ublk_support_user_copy(ubq) && ub_cmd->addr) {
ret = -EINVAL;
goto out;
}
ret = ublk_check_cmd_op(cmd_op);
if (ret)
goto out;
@ -1358,36 +1437,41 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
*/
if (io->flags & UBLK_IO_FLAG_OWNED_BY_SRV)
goto out;
/* FETCH_RQ has to provide IO buffer if NEED GET DATA is not enabled */
if (!ub_cmd->addr && !ublk_need_get_data(ubq))
goto out;
io->cmd = cmd;
io->flags |= UBLK_IO_FLAG_ACTIVE;
io->addr = ub_cmd->addr;
if (!ublk_support_user_copy(ubq)) {
/*
* FETCH_RQ has to provide IO buffer if NEED GET
* DATA is not enabled
*/
if (!ub_cmd->addr && !ublk_need_get_data(ubq))
goto out;
}
ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
ublk_mark_io_ready(ub, ubq);
break;
case UBLK_IO_COMMIT_AND_FETCH_REQ:
req = blk_mq_tag_to_rq(ub->tag_set.tags[ub_cmd->q_id], tag);
/*
* COMMIT_AND_FETCH_REQ has to provide IO buffer if NEED GET DATA is
* not enabled or it is Read IO.
*/
if (!ub_cmd->addr && (!ublk_need_get_data(ubq) || req_op(req) == REQ_OP_READ))
goto out;
if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
goto out;
io->addr = ub_cmd->addr;
io->flags |= UBLK_IO_FLAG_ACTIVE;
io->cmd = cmd;
if (!ublk_support_user_copy(ubq)) {
/*
* COMMIT_AND_FETCH_REQ has to provide IO buffer if
* NEED GET DATA is not enabled or it is Read IO.
*/
if (!ub_cmd->addr && (!ublk_need_get_data(ubq) ||
req_op(req) == REQ_OP_READ))
goto out;
}
ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
ublk_commit_completion(ub, ub_cmd);
break;
case UBLK_IO_NEED_GET_DATA:
if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
goto out;
io->addr = ub_cmd->addr;
io->cmd = cmd;
io->flags |= UBLK_IO_FLAG_ACTIVE;
ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
ublk_handle_need_get_data(ub, ub_cmd->q_id, ub_cmd->tag);
break;
default:
@ -1402,6 +1486,36 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
return -EIOCBQUEUED;
}
static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
struct ublk_queue *ubq, int tag, size_t offset)
{
struct request *req;
if (!ublk_need_req_ref(ubq))
return NULL;
req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
if (!req)
return NULL;
if (!ublk_get_req_ref(ubq, req))
return NULL;
if (unlikely(!blk_mq_request_started(req) || req->tag != tag))
goto fail_put;
if (!ublk_rq_has_data(req))
goto fail_put;
if (offset > blk_rq_bytes(req))
goto fail_put;
return req;
fail_put:
ublk_put_req_ref(ubq, req);
return NULL;
}
static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
{
/*
@ -1419,11 +1533,112 @@ static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
return __ublk_ch_uring_cmd(cmd, issue_flags, &ub_cmd);
}
static inline bool ublk_check_ubuf_dir(const struct request *req,
int ubuf_dir)
{
/* copy ubuf to request pages */
if (req_op(req) == REQ_OP_READ && ubuf_dir == ITER_SOURCE)
return true;
/* copy request pages to ubuf */
if (req_op(req) == REQ_OP_WRITE && ubuf_dir == ITER_DEST)
return true;
return false;
}
static struct request *ublk_check_and_get_req(struct kiocb *iocb,
struct iov_iter *iter, size_t *off, int dir)
{
struct ublk_device *ub = iocb->ki_filp->private_data;
struct ublk_queue *ubq;
struct request *req;
size_t buf_off;
u16 tag, q_id;
if (!ub)
return ERR_PTR(-EACCES);
if (!user_backed_iter(iter))
return ERR_PTR(-EACCES);
if (ub->dev_info.state == UBLK_S_DEV_DEAD)
return ERR_PTR(-EACCES);
tag = ublk_pos_to_tag(iocb->ki_pos);
q_id = ublk_pos_to_hwq(iocb->ki_pos);
buf_off = ublk_pos_to_buf_off(iocb->ki_pos);
if (q_id >= ub->dev_info.nr_hw_queues)
return ERR_PTR(-EINVAL);
ubq = ublk_get_queue(ub, q_id);
if (!ubq)
return ERR_PTR(-EINVAL);
if (tag >= ubq->q_depth)
return ERR_PTR(-EINVAL);
req = __ublk_check_and_get_req(ub, ubq, tag, buf_off);
if (!req)
return ERR_PTR(-EINVAL);
if (!req->mq_hctx || !req->mq_hctx->driver_data)
goto fail;
if (!ublk_check_ubuf_dir(req, dir))
goto fail;
*off = buf_off;
return req;
fail:
ublk_put_req_ref(ubq, req);
return ERR_PTR(-EACCES);
}
static ssize_t ublk_ch_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
struct ublk_queue *ubq;
struct request *req;
size_t buf_off;
size_t ret;
req = ublk_check_and_get_req(iocb, to, &buf_off, ITER_DEST);
if (IS_ERR(req))
return PTR_ERR(req);
ret = ublk_copy_user_pages(req, buf_off, to, ITER_DEST);
ubq = req->mq_hctx->driver_data;
ublk_put_req_ref(ubq, req);
return ret;
}
static ssize_t ublk_ch_write_iter(struct kiocb *iocb, struct iov_iter *from)
{
struct ublk_queue *ubq;
struct request *req;
size_t buf_off;
size_t ret;
req = ublk_check_and_get_req(iocb, from, &buf_off, ITER_SOURCE);
if (IS_ERR(req))
return PTR_ERR(req);
ret = ublk_copy_user_pages(req, buf_off, from, ITER_SOURCE);
ubq = req->mq_hctx->driver_data;
ublk_put_req_ref(ubq, req);
return ret;
}
static const struct file_operations ublk_ch_fops = {
.owner = THIS_MODULE,
.open = ublk_ch_open,
.release = ublk_ch_release,
.llseek = no_llseek,
.read_iter = ublk_ch_read_iter,
.write_iter = ublk_ch_write_iter,
.uring_cmd = ublk_ch_uring_cmd,
.mmap = ublk_ch_mmap,
};
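With UBLK_F_USER_COPY the server needs neither the mmap'ed I/O buffer nor UBLK_IO_NEED_GET_DATA: payload moves through plain pread()/pwrite() on the ublk char device at the encoded position, and ublk_check_ubuf_dir() above only allows reading from WRITE requests and writing into READ requests. A hedged userspace sketch; ublk_pos() repeats the illustrative encoder shown earlier and the fd is assumed to be an open /dev/ublkcN:

#include <stdint.h>
#include <unistd.h>
#include <linux/ublk_cmd.h>

static inline uint64_t ublk_pos(uint16_t q_id, uint16_t tag, uint32_t off)
{
	return UBLKSRV_IO_BUF_OFFSET +
	       ((uint64_t)q_id << UBLK_QID_OFF) +
	       ((uint64_t)tag << UBLK_TAG_OFF) + off;
}

/* Handling a WRITE request: pull the payload out of the request pages. */
static ssize_t fetch_write_payload(int cdev_fd, uint16_t q_id, uint16_t tag,
				   void *buf, size_t len)
{
	return pread(cdev_fd, buf, len, ublk_pos(q_id, tag, 0));
}

/* Handling a READ request: supply the payload for the request pages. */
static ssize_t fill_read_payload(int cdev_fd, uint16_t q_id, uint16_t tag,
				 const void *buf, size_t len)
{
	return pwrite(cdev_fd, buf, len, ublk_pos(q_id, tag, 0));
}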
@ -1547,7 +1762,7 @@ static int ublk_add_chdev(struct ublk_device *ub)
dev->parent = ublk_misc.this_device;
dev->devt = MKDEV(MAJOR(ublk_chr_devt), minor);
dev->class = ublk_chr_class;
dev->class = &ublk_chr_class;
dev->release = ublk_cdev_rel;
device_initialize(dev);
@ -1818,10 +2033,12 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
*/
ub->dev_info.flags &= UBLK_F_ALL;
if (!IS_BUILTIN(CONFIG_BLK_DEV_UBLK))
ub->dev_info.flags |= UBLK_F_URING_CMD_COMP_IN_TASK;
ub->dev_info.flags |= UBLK_F_CMD_IOCTL_ENCODE |
UBLK_F_URING_CMD_COMP_IN_TASK;
ub->dev_info.flags |= UBLK_F_CMD_IOCTL_ENCODE;
/* GET_DATA isn't needed any more with USER_COPY */
if (ub->dev_info.flags & UBLK_F_USER_COPY)
ub->dev_info.flags &= ~UBLK_F_NEED_GET_DATA;
/* We are not ready to support zero copy */
ub->dev_info.flags &= ~UBLK_F_SUPPORT_ZERO_COPY;
@ -2133,6 +2350,21 @@ static int ublk_ctrl_end_recovery(struct ublk_device *ub,
return ret;
}
static int ublk_ctrl_get_features(struct io_uring_cmd *cmd)
{
const struct ublksrv_ctrl_cmd *header = io_uring_sqe_cmd(cmd->sqe);
void __user *argp = (void __user *)(unsigned long)header->addr;
u64 features = UBLK_F_ALL & ~UBLK_F_SUPPORT_ZERO_COPY;
if (header->len != UBLK_FEATURES_LEN || !header->addr)
return -EINVAL;
if (copy_to_user(argp, &features, UBLK_FEATURES_LEN))
return -EFAULT;
return 0;
}
/*
* All control commands are sent via /dev/ublk-control, so we have to check
* the destination device's permission
@ -2213,6 +2445,7 @@ static int ublk_ctrl_uring_cmd_permission(struct ublk_device *ub,
case UBLK_CMD_GET_DEV_INFO2:
case UBLK_CMD_GET_QUEUE_AFFINITY:
case UBLK_CMD_GET_PARAMS:
case (_IOC_NR(UBLK_U_CMD_GET_FEATURES)):
mask = MAY_READ;
break;
case UBLK_CMD_START_DEV:
@ -2262,6 +2495,11 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
if (ret)
goto out;
if (cmd_op == UBLK_U_CMD_GET_FEATURES) {
ret = ublk_ctrl_get_features(cmd);
goto out;
}
if (_IOC_NR(cmd_op) != UBLK_CMD_ADD_DEV) {
ret = -ENODEV;
ub = ublk_get_device_from_id(header->dev_id);
@ -2337,6 +2575,9 @@ static int __init ublk_init(void)
{
int ret;
BUILD_BUG_ON((u64)UBLKSRV_IO_BUF_OFFSET +
UBLKSRV_IO_BUF_TOTAL_SIZE < UBLKSRV_IO_BUF_OFFSET);
init_waitqueue_head(&ublk_idr_wq);
ret = misc_register(&ublk_misc);
@ -2347,11 +2588,10 @@ static int __init ublk_init(void)
if (ret)
goto unregister_mis;
ublk_chr_class = class_create("ublk-char");
if (IS_ERR(ublk_chr_class)) {
ret = PTR_ERR(ublk_chr_class);
ret = class_register(&ublk_chr_class);
if (ret)
goto free_chrdev_region;
}
return 0;
free_chrdev_region:
@ -2369,7 +2609,7 @@ static void __exit ublk_exit(void)
idr_for_each_entry(&ublk_index_idr, ub, id)
ublk_remove(ub);
class_destroy(ublk_chr_class);
class_unregister(&ublk_chr_class);
misc_deregister(&ublk_misc);
idr_destroy(&ublk_index_idr);
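The ublk_chr_class change above is one instance of a pattern repeated across this pull (bsg, aoe, rnbd): the class becomes a static const object registered with class_register() rather than allocated by class_create(). A minimal sketch with an illustrative module and class name:

#include <linux/device.h>
#include <linux/module.h>

static const struct class demo_chr_class = {
	.name = "demo-char",
};

static int __init demo_init(void)
{
	return class_register(&demo_chr_class);
}

static void __exit demo_exit(void)
{
	class_unregister(&demo_chr_class);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");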


@ -473,7 +473,7 @@ static void xenvbd_sysfs_delif(struct xenbus_device *dev)
static void xen_vbd_free(struct xen_vbd *vbd)
{
if (vbd->bdev)
blkdev_put(vbd->bdev, vbd->readonly ? FMODE_READ : FMODE_WRITE);
blkdev_put(vbd->bdev, NULL);
vbd->bdev = NULL;
}
@ -492,7 +492,7 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle,
vbd->pdevice = MKDEV(major, minor);
bdev = blkdev_get_by_dev(vbd->pdevice, vbd->readonly ?
FMODE_READ : FMODE_WRITE, NULL);
BLK_OPEN_READ : BLK_OPEN_WRITE, NULL, NULL);
if (IS_ERR(bdev)) {
pr_warn("xen_vbd_create: device %08x could not be opened\n",


@ -509,7 +509,7 @@ static int blkif_getgeo(struct block_device *bd, struct hd_geometry *hg)
return 0;
}
static int blkif_ioctl(struct block_device *bdev, fmode_t mode,
static int blkif_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned command, unsigned long argument)
{
struct blkfront_info *info = bdev->bd_disk->private_data;


@ -140,16 +140,14 @@ static void get_chipram(void)
return;
}
static int z2_open(struct block_device *bdev, fmode_t mode)
static int z2_open(struct gendisk *disk, blk_mode_t mode)
{
int device;
int device = disk->first_minor;
int max_z2_map = (Z2RAM_SIZE / Z2RAM_CHUNKSIZE) * sizeof(z2ram_map[0]);
int max_chip_map = (amiga_chip_size / Z2RAM_CHUNKSIZE) *
sizeof(z2ram_map[0]);
int rc = -ENOMEM;
device = MINOR(bdev->bd_dev);
mutex_lock(&z2ram_mutex);
if (current_device != -1 && current_device != device) {
rc = -EBUSY;
@ -290,7 +288,7 @@ static int z2_open(struct block_device *bdev, fmode_t mode)
return rc;
}
static void z2_release(struct gendisk *disk, fmode_t mode)
static void z2_release(struct gendisk *disk)
{
mutex_lock(&z2ram_mutex);
if (current_device == -1) {


@ -420,7 +420,7 @@ static void reset_bdev(struct zram *zram)
return;
bdev = zram->bdev;
blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
blkdev_put(bdev, zram);
/* hope filp_close flush all of IO */
filp_close(zram->backing_dev, NULL);
zram->backing_dev = NULL;
@ -507,8 +507,8 @@ static ssize_t backing_dev_store(struct device *dev,
goto out;
}
bdev = blkdev_get_by_dev(inode->i_rdev,
FMODE_READ | FMODE_WRITE | FMODE_EXCL, zram);
bdev = blkdev_get_by_dev(inode->i_rdev, BLK_OPEN_READ | BLK_OPEN_WRITE,
zram, NULL);
if (IS_ERR(bdev)) {
err = PTR_ERR(bdev);
bdev = NULL;
@ -539,7 +539,7 @@ static ssize_t backing_dev_store(struct device *dev,
kvfree(bitmap);
if (bdev)
blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(bdev, zram);
if (backing_dev)
filp_close(backing_dev, NULL);
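The zram hunks above show the new opener convention used throughout this series: BLK_OPEN_* flags instead of FMODE_*, exclusive access requested simply by passing a non-NULL holder, an optional blk_holder_ops pointer, and blkdev_put() taking the same holder rather than a mode. A hedged sketch with illustrative names:

#include <linux/blkdev.h>

static struct block_device *demo_open_backing_dev(dev_t devt, void *holder)
{
	/* exclusive read/write open on behalf of 'holder'; no holder ops */
	return blkdev_get_by_dev(devt, BLK_OPEN_READ | BLK_OPEN_WRITE,
				 holder, NULL);
}

static void demo_close_backing_dev(struct block_device *bdev, void *holder)
{
	/* the holder must match the one passed at open time */
	blkdev_put(bdev, holder);
}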
@ -700,7 +700,7 @@ static ssize_t writeback_store(struct device *dev,
bio_init(&bio, zram->bdev, &bio_vec, 1,
REQ_OP_WRITE | REQ_SYNC);
bio.bi_iter.bi_sector = blk_idx * (PAGE_SIZE >> 9);
bio_add_page(&bio, page, PAGE_SIZE, 0);
__bio_add_page(&bio, page, PAGE_SIZE, 0);
/*
* XXX: A single page IO would be inefficient for write
@ -2097,19 +2097,16 @@ static ssize_t reset_store(struct device *dev,
return len;
}
static int zram_open(struct block_device *bdev, fmode_t mode)
static int zram_open(struct gendisk *disk, blk_mode_t mode)
{
int ret = 0;
struct zram *zram;
struct zram *zram = disk->private_data;
WARN_ON(!mutex_is_locked(&bdev->bd_disk->open_mutex));
WARN_ON(!mutex_is_locked(&disk->open_mutex));
zram = bdev->bd_disk->private_data;
/* zram was claimed to reset so open request fails */
if (zram->claim)
ret = -EBUSY;
return ret;
return -EBUSY;
return 0;
}
static const struct block_device_operations zram_devops = {


@ -264,6 +264,7 @@
#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/nospec.h>
#include <linux/slab.h>
#include <linux/cdrom.h>
#include <linux/sysctl.h>
@ -978,15 +979,6 @@ static void cdrom_dvd_rw_close_write(struct cdrom_device_info *cdi)
cdi->media_written = 0;
}
static int cdrom_close_write(struct cdrom_device_info *cdi)
{
#if 0
return cdrom_flush_cache(cdi);
#else
return 0;
#endif
}
/* badly broken, I know. Is due for a fixup anytime. */
static void cdrom_count_tracks(struct cdrom_device_info *cdi, tracktype *tracks)
{
@ -1155,8 +1147,7 @@ int open_for_data(struct cdrom_device_info *cdi)
* is in their own interest: device control becomes a lot easier
* this way.
*/
int cdrom_open(struct cdrom_device_info *cdi, struct block_device *bdev,
fmode_t mode)
int cdrom_open(struct cdrom_device_info *cdi, blk_mode_t mode)
{
int ret;
@ -1165,7 +1156,7 @@ int cdrom_open(struct cdrom_device_info *cdi, struct block_device *bdev,
/* if this was a O_NONBLOCK open and we should honor the flags,
* do a quick open without drive/disc integrity checks. */
cdi->use_count++;
if ((mode & FMODE_NDELAY) && (cdi->options & CDO_USE_FFLAGS)) {
if ((mode & BLK_OPEN_NDELAY) && (cdi->options & CDO_USE_FFLAGS)) {
ret = cdi->ops->open(cdi, 1);
} else {
ret = open_for_data(cdi);
@ -1173,7 +1164,7 @@ int cdrom_open(struct cdrom_device_info *cdi, struct block_device *bdev,
goto err;
if (CDROM_CAN(CDC_GENERIC_PACKET))
cdrom_mmc3_profile(cdi);
if (mode & FMODE_WRITE) {
if (mode & BLK_OPEN_WRITE) {
ret = -EROFS;
if (cdrom_open_write(cdi))
goto err_release;
@ -1182,6 +1173,7 @@ int cdrom_open(struct cdrom_device_info *cdi, struct block_device *bdev,
ret = 0;
cdi->media_written = 0;
}
cdi->opened_for_data = true;
}
if (ret)
@ -1259,10 +1251,9 @@ static int check_for_audio_disc(struct cdrom_device_info *cdi,
return 0;
}
void cdrom_release(struct cdrom_device_info *cdi, fmode_t mode)
void cdrom_release(struct cdrom_device_info *cdi)
{
const struct cdrom_device_ops *cdo = cdi->ops;
int opened_for_data;
cd_dbg(CD_CLOSE, "entering cdrom_release\n");
@ -1280,20 +1271,12 @@ void cdrom_release(struct cdrom_device_info *cdi, fmode_t mode)
}
}
opened_for_data = !(cdi->options & CDO_USE_FFLAGS) ||
!(mode & FMODE_NDELAY);
/*
* flush cache on last write release
*/
if (CDROM_CAN(CDC_RAM) && !cdi->use_count && cdi->for_data)
cdrom_close_write(cdi);
cdo->release(cdi);
if (cdi->use_count == 0) { /* last process that closes dev*/
if (opened_for_data &&
cdi->options & CDO_AUTO_EJECT && CDROM_CAN(CDC_OPEN_TRAY))
if (cdi->use_count == 0 && cdi->opened_for_data) {
if (cdi->options & CDO_AUTO_EJECT && CDROM_CAN(CDC_OPEN_TRAY))
cdo->tray_move(cdi, 1);
cdi->opened_for_data = false;
}
}
EXPORT_SYMBOL(cdrom_release);
@ -2329,6 +2312,9 @@ static int cdrom_ioctl_media_changed(struct cdrom_device_info *cdi,
if (arg >= cdi->capacity)
return -EINVAL;
/* Prevent arg from speculatively bypassing the length check */
barrier_nospec();
info = kmalloc(sizeof(*info), GFP_KERNEL);
if (!info)
return -ENOMEM;
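The barrier_nospec() inserted above closes a Spectre-v1 gadget: arg is user-controlled and, once it passes the bounds check, is used to address per-slot state, so speculation past the check has to be fenced. A hedged sketch of the general hardening pattern, here using the array_index_nospec() clamp on an illustrative table:

#include <linux/errno.h>
#include <linux/nospec.h>

static int demo_lookup(const int *table, size_t nr_entries, unsigned long idx)
{
	if (idx >= nr_entries)
		return -EINVAL;
	/* clamp so the CPU cannot speculatively index out of bounds */
	idx = array_index_nospec(idx, nr_entries);
	return table[idx];
}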
@ -3337,7 +3323,7 @@ static int mmc_ioctl(struct cdrom_device_info *cdi, unsigned int cmd,
* ATAPI / SCSI specific code now mainly resides in mmc_ioctl().
*/
int cdrom_ioctl(struct cdrom_device_info *cdi, struct block_device *bdev,
fmode_t mode, unsigned int cmd, unsigned long arg)
unsigned int cmd, unsigned long arg)
{
void __user *argp = (void __user *)arg;
int ret;


@ -474,19 +474,19 @@ static const struct cdrom_device_ops gdrom_ops = {
CDC_RESET | CDC_DRIVE_STATUS | CDC_CD_R,
};
static int gdrom_bdops_open(struct block_device *bdev, fmode_t mode)
static int gdrom_bdops_open(struct gendisk *disk, blk_mode_t mode)
{
int ret;
bdev_check_media_change(bdev);
disk_check_media_change(disk);
mutex_lock(&gdrom_mutex);
ret = cdrom_open(gd.cd_info, bdev, mode);
ret = cdrom_open(gd.cd_info, mode);
mutex_unlock(&gdrom_mutex);
return ret;
}
static void gdrom_bdops_release(struct gendisk *disk, fmode_t mode)
static void gdrom_bdops_release(struct gendisk *disk)
{
mutex_lock(&gdrom_mutex);
cdrom_release(gd.cd_info, mode);
@ -499,13 +499,13 @@ static unsigned int gdrom_bdops_check_events(struct gendisk *disk,
return cdrom_check_events(gd.cd_info, clearing);
}
static int gdrom_bdops_ioctl(struct block_device *bdev, fmode_t mode,
static int gdrom_bdops_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned cmd, unsigned long arg)
{
int ret;
mutex_lock(&gdrom_mutex);
ret = cdrom_ioctl(gd.cd_info, bdev, mode, cmd, arg);
ret = cdrom_ioctl(gd.cd_info, bdev, cmd, arg);
mutex_unlock(&gdrom_mutex);
return ret;


@ -275,7 +275,7 @@ struct bcache_device {
int (*cache_miss)(struct btree *b, struct search *s,
struct bio *bio, unsigned int sectors);
int (*ioctl)(struct bcache_device *d, fmode_t mode,
int (*ioctl)(struct bcache_device *d, blk_mode_t mode,
unsigned int cmd, unsigned long arg);
};
@ -1004,11 +1004,11 @@ extern struct workqueue_struct *bch_flush_wq;
extern struct mutex bch_register_lock;
extern struct list_head bch_cache_sets;
extern struct kobj_type bch_cached_dev_ktype;
extern struct kobj_type bch_flash_dev_ktype;
extern struct kobj_type bch_cache_set_ktype;
extern struct kobj_type bch_cache_set_internal_ktype;
extern struct kobj_type bch_cache_ktype;
extern const struct kobj_type bch_cached_dev_ktype;
extern const struct kobj_type bch_flash_dev_ktype;
extern const struct kobj_type bch_cache_set_ktype;
extern const struct kobj_type bch_cache_set_internal_ktype;
extern const struct kobj_type bch_cache_ktype;
void bch_cached_dev_release(struct kobject *kobj);
void bch_flash_dev_release(struct kobject *kobj);


@ -885,7 +885,7 @@ static struct btree *mca_cannibalize(struct cache_set *c, struct btree_op *op,
* cannibalize_bucket() will take. This means every time we unlock the root of
* the btree, we need to release this lock if we have it held.
*/
static void bch_cannibalize_unlock(struct cache_set *c)
void bch_cannibalize_unlock(struct cache_set *c)
{
spin_lock(&c->btree_cannibalize_lock);
if (c->btree_cache_alloc_lock == current) {
@ -1090,10 +1090,12 @@ struct btree *__bch_btree_node_alloc(struct cache_set *c, struct btree_op *op,
struct btree *parent)
{
BKEY_PADDED(key) k;
struct btree *b = ERR_PTR(-EAGAIN);
struct btree *b;
mutex_lock(&c->bucket_lock);
retry:
/* return ERR_PTR(-EAGAIN) when it fails */
b = ERR_PTR(-EAGAIN);
if (__bch_bucket_alloc_set(c, RESERVE_BTREE, &k.key, wait))
goto err;
@ -1138,7 +1140,7 @@ static struct btree *btree_node_alloc_replacement(struct btree *b,
{
struct btree *n = bch_btree_node_alloc(b->c, op, b->level, b->parent);
if (!IS_ERR_OR_NULL(n)) {
if (!IS_ERR(n)) {
mutex_lock(&n->write_lock);
bch_btree_sort_into(&b->keys, &n->keys, &b->c->sort);
bkey_copy_key(&n->key, &b->key);
@ -1340,7 +1342,7 @@ static int btree_gc_coalesce(struct btree *b, struct btree_op *op,
memset(new_nodes, 0, sizeof(new_nodes));
closure_init_stack(&cl);
while (nodes < GC_MERGE_NODES && !IS_ERR_OR_NULL(r[nodes].b))
while (nodes < GC_MERGE_NODES && !IS_ERR(r[nodes].b))
keys += r[nodes++].keys;
blocks = btree_default_blocks(b->c) * 2 / 3;
@ -1352,7 +1354,7 @@ static int btree_gc_coalesce(struct btree *b, struct btree_op *op,
for (i = 0; i < nodes; i++) {
new_nodes[i] = btree_node_alloc_replacement(r[i].b, NULL);
if (IS_ERR_OR_NULL(new_nodes[i]))
if (IS_ERR(new_nodes[i]))
goto out_nocoalesce;
}
@ -1487,7 +1489,7 @@ static int btree_gc_coalesce(struct btree *b, struct btree_op *op,
bch_keylist_free(&keylist);
for (i = 0; i < nodes; i++)
if (!IS_ERR_OR_NULL(new_nodes[i])) {
if (!IS_ERR(new_nodes[i])) {
btree_node_free(new_nodes[i]);
rw_unlock(true, new_nodes[i]);
}
@ -1669,7 +1671,7 @@ static int bch_btree_gc_root(struct btree *b, struct btree_op *op,
if (should_rewrite) {
n = btree_node_alloc_replacement(b, NULL);
if (!IS_ERR_OR_NULL(n)) {
if (!IS_ERR(n)) {
bch_btree_node_write_sync(n);
bch_btree_set_root(n);
@ -1968,6 +1970,15 @@ static int bch_btree_check_thread(void *arg)
c->gc_stats.nodes++;
bch_btree_op_init(&op, 0);
ret = bcache_btree(check_recurse, p, c->root, &op);
/*
* The op may be added to cache_set's btree_cache_wait
* in mca_cannibalize(), must ensure it is removed from
* the list and release btree_cache_alloc_lock before
* free op memory.
* Otherwise, the btree_cache_wait will be damaged.
*/
bch_cannibalize_unlock(c);
finish_wait(&c->btree_cache_wait, &(&op)->wait);
if (ret)
goto out;
}


@ -282,6 +282,7 @@ void bch_initial_gc_finish(struct cache_set *c);
void bch_moving_gc(struct cache_set *c);
int bch_btree_check(struct cache_set *c);
void bch_initial_mark_key(struct cache_set *c, int level, struct bkey *k);
void bch_cannibalize_unlock(struct cache_set *c);
static inline void wake_up_gc(struct cache_set *c)
{


@ -1228,7 +1228,7 @@ void cached_dev_submit_bio(struct bio *bio)
detached_dev_do_request(d, bio, orig_bdev, start_time);
}
static int cached_dev_ioctl(struct bcache_device *d, fmode_t mode,
static int cached_dev_ioctl(struct bcache_device *d, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct cached_dev *dc = container_of(d, struct cached_dev, disk);
@ -1318,7 +1318,7 @@ void flash_dev_submit_bio(struct bio *bio)
continue_at(cl, search_free, NULL);
}
static int flash_dev_ioctl(struct bcache_device *d, fmode_t mode,
static int flash_dev_ioctl(struct bcache_device *d, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
return -ENOTTY;


@ -18,7 +18,6 @@ struct cache_stats {
unsigned long cache_misses;
unsigned long cache_bypass_hits;
unsigned long cache_bypass_misses;
unsigned long cache_readaheads;
unsigned long cache_miss_collisions;
unsigned long sectors_bypassed;


@ -732,9 +732,9 @@ static int prio_read(struct cache *ca, uint64_t bucket)
/* Bcache device */
static int open_dev(struct block_device *b, fmode_t mode)
static int open_dev(struct gendisk *disk, blk_mode_t mode)
{
struct bcache_device *d = b->bd_disk->private_data;
struct bcache_device *d = disk->private_data;
if (test_bit(BCACHE_DEV_CLOSING, &d->flags))
return -ENXIO;
@ -743,14 +743,14 @@ static int open_dev(struct block_device *b, fmode_t mode)
return 0;
}
static void release_dev(struct gendisk *b, fmode_t mode)
static void release_dev(struct gendisk *b)
{
struct bcache_device *d = b->private_data;
closure_put(&d->cl);
}
static int ioctl_dev(struct block_device *b, fmode_t mode,
static int ioctl_dev(struct block_device *b, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct bcache_device *d = b->bd_disk->private_data;
@ -1369,7 +1369,7 @@ static void cached_dev_free(struct closure *cl)
put_page(virt_to_page(dc->sb_disk));
if (!IS_ERR_OR_NULL(dc->bdev))
blkdev_put(dc->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
blkdev_put(dc->bdev, bcache_kobj);
wake_up(&unregister_wait);
@ -1723,7 +1723,7 @@ static void cache_set_flush(struct closure *cl)
if (!IS_ERR_OR_NULL(c->gc_thread))
kthread_stop(c->gc_thread);
if (!IS_ERR_OR_NULL(c->root))
if (!IS_ERR(c->root))
list_add(&c->root->list, &c->btree_cache);
/*
@ -2087,7 +2087,7 @@ static int run_cache_set(struct cache_set *c)
err = "cannot allocate new btree root";
c->root = __bch_btree_node_alloc(c, NULL, 0, true, NULL);
if (IS_ERR_OR_NULL(c->root))
if (IS_ERR(c->root))
goto err;
mutex_lock(&c->root->write_lock);
@ -2218,7 +2218,7 @@ void bch_cache_release(struct kobject *kobj)
put_page(virt_to_page(ca->sb_disk));
if (!IS_ERR_OR_NULL(ca->bdev))
blkdev_put(ca->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
blkdev_put(ca->bdev, bcache_kobj);
kfree(ca);
module_put(THIS_MODULE);
@ -2359,7 +2359,7 @@ static int register_cache(struct cache_sb *sb, struct cache_sb_disk *sb_disk,
* call blkdev_put() to bdev in bch_cache_release(). So we
* explicitly call blkdev_put() here.
*/
blkdev_put(bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
blkdev_put(bdev, bcache_kobj);
if (ret == -ENOMEM)
err = "cache_alloc(): -ENOMEM";
else if (ret == -EPERM)
@ -2461,7 +2461,7 @@ static void register_bdev_worker(struct work_struct *work)
if (!dc) {
fail = true;
put_page(virt_to_page(args->sb_disk));
blkdev_put(args->bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(args->bdev, bcache_kobj);
goto out;
}
@ -2491,7 +2491,7 @@ static void register_cache_worker(struct work_struct *work)
if (!ca) {
fail = true;
put_page(virt_to_page(args->sb_disk));
blkdev_put(args->bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(args->bdev, bcache_kobj);
goto out;
}
@ -2558,9 +2558,8 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
ret = -EINVAL;
err = "failed to open device";
bdev = blkdev_get_by_path(strim(path),
FMODE_READ|FMODE_WRITE|FMODE_EXCL,
sb);
bdev = blkdev_get_by_path(strim(path), BLK_OPEN_READ | BLK_OPEN_WRITE,
bcache_kobj, NULL);
if (IS_ERR(bdev)) {
if (bdev == ERR_PTR(-EBUSY)) {
dev_t dev;
@ -2648,7 +2647,7 @@ static ssize_t register_bcache(struct kobject *k, struct kobj_attribute *attr,
out_put_sb_page:
put_page(virt_to_page(sb_disk));
out_blkdev_put:
blkdev_put(bdev, FMODE_READ | FMODE_WRITE | FMODE_EXCL);
blkdev_put(bdev, register_bcache);
out_free_sb:
kfree(sb);
out_free_path:


@ -1111,26 +1111,25 @@ SHOW(__bch_cache)
vfree(p);
ret = scnprintf(buf, PAGE_SIZE,
"Unused: %zu%%\n"
"Clean: %zu%%\n"
"Dirty: %zu%%\n"
"Metadata: %zu%%\n"
"Average: %llu\n"
"Sectors per Q: %zu\n"
"Quantiles: [",
unused * 100 / (size_t) ca->sb.nbuckets,
available * 100 / (size_t) ca->sb.nbuckets,
dirty * 100 / (size_t) ca->sb.nbuckets,
meta * 100 / (size_t) ca->sb.nbuckets, sum,
n * ca->sb.bucket_size / (ARRAY_SIZE(q) + 1));
ret = sysfs_emit(buf,
"Unused: %zu%%\n"
"Clean: %zu%%\n"
"Dirty: %zu%%\n"
"Metadata: %zu%%\n"
"Average: %llu\n"
"Sectors per Q: %zu\n"
"Quantiles: [",
unused * 100 / (size_t) ca->sb.nbuckets,
available * 100 / (size_t) ca->sb.nbuckets,
dirty * 100 / (size_t) ca->sb.nbuckets,
meta * 100 / (size_t) ca->sb.nbuckets, sum,
n * ca->sb.bucket_size / (ARRAY_SIZE(q) + 1));
for (i = 0; i < ARRAY_SIZE(q); i++)
ret += scnprintf(buf + ret, PAGE_SIZE - ret,
"%u ", q[i]);
ret += sysfs_emit_at(buf, ret, "%u ", q[i]);
ret--;
ret += scnprintf(buf + ret, PAGE_SIZE - ret, "]\n");
ret += sysfs_emit_at(buf, ret, "]\n");
return ret;
}
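The rewrite above swaps open-coded scnprintf(buf + ret, PAGE_SIZE - ret, ...) bookkeeping for sysfs_emit()/sysfs_emit_at(), which bound the output to PAGE_SIZE and warn on misuse. A minimal sketch of the same pattern in an illustrative show() callback:

#include <linux/kernel.h>
#include <linux/kobject.h>
#include <linux/sysfs.h>

static ssize_t demo_quantiles_show(struct kobject *kobj,
				   struct kobj_attribute *attr, char *buf)
{
	static const unsigned int q[] = { 1, 2, 4, 8 };
	ssize_t ret;
	size_t i;

	ret = sysfs_emit(buf, "Quantiles: [");
	for (i = 0; i < ARRAY_SIZE(q); i++)
		ret += sysfs_emit_at(buf, ret, "%u ", q[i]);
	ret--;				/* drop the trailing space */
	ret += sysfs_emit_at(buf, ret, "]\n");
	return ret;
}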


@ -3,7 +3,7 @@
#define _BCACHE_SYSFS_H_
#define KTYPE(type) \
struct kobj_type type ## _ktype = { \
const struct kobj_type type ## _ktype = { \
.release = type ## _release, \
.sysfs_ops = &((const struct sysfs_ops) { \
.show = type ## _show, \


@ -890,6 +890,16 @@ static int bch_root_node_dirty_init(struct cache_set *c,
if (ret < 0)
pr_warn("sectors dirty init failed, ret=%d!\n", ret);
/*
* The op may be added to cache_set's btree_cache_wait
* in mca_cannibalize(), must ensure it is removed from
* the list and release btree_cache_alloc_lock before
* free op memory.
* Otherwise, the btree_cache_wait will be damaged.
*/
bch_cannibalize_unlock(c);
finish_wait(&c->btree_cache_wait, &(&op.op)->wait);
return ret;
}


@ -2051,8 +2051,8 @@ static int parse_metadata_dev(struct cache_args *ca, struct dm_arg_set *as,
if (!at_least_one_arg(as, error))
return -EINVAL;
r = dm_get_device(ca->ti, dm_shift_arg(as), FMODE_READ | FMODE_WRITE,
&ca->metadata_dev);
r = dm_get_device(ca->ti, dm_shift_arg(as),
BLK_OPEN_READ | BLK_OPEN_WRITE, &ca->metadata_dev);
if (r) {
*error = "Error opening metadata device";
return r;
@ -2074,8 +2074,8 @@ static int parse_cache_dev(struct cache_args *ca, struct dm_arg_set *as,
if (!at_least_one_arg(as, error))
return -EINVAL;
r = dm_get_device(ca->ti, dm_shift_arg(as), FMODE_READ | FMODE_WRITE,
&ca->cache_dev);
r = dm_get_device(ca->ti, dm_shift_arg(as),
BLK_OPEN_READ | BLK_OPEN_WRITE, &ca->cache_dev);
if (r) {
*error = "Error opening cache device";
return r;
@ -2093,8 +2093,8 @@ static int parse_origin_dev(struct cache_args *ca, struct dm_arg_set *as,
if (!at_least_one_arg(as, error))
return -EINVAL;
r = dm_get_device(ca->ti, dm_shift_arg(as), FMODE_READ | FMODE_WRITE,
&ca->origin_dev);
r = dm_get_device(ca->ti, dm_shift_arg(as),
BLK_OPEN_READ | BLK_OPEN_WRITE, &ca->origin_dev);
if (r) {
*error = "Error opening origin device";
return r;


@ -1683,8 +1683,8 @@ static int parse_metadata_dev(struct clone *clone, struct dm_arg_set *as, char *
int r;
sector_t metadata_dev_size;
r = dm_get_device(clone->ti, dm_shift_arg(as), FMODE_READ | FMODE_WRITE,
&clone->metadata_dev);
r = dm_get_device(clone->ti, dm_shift_arg(as),
BLK_OPEN_READ | BLK_OPEN_WRITE, &clone->metadata_dev);
if (r) {
*error = "Error opening metadata device";
return r;
@ -1703,8 +1703,8 @@ static int parse_dest_dev(struct clone *clone, struct dm_arg_set *as, char **err
int r;
sector_t dest_dev_size;
r = dm_get_device(clone->ti, dm_shift_arg(as), FMODE_READ | FMODE_WRITE,
&clone->dest_dev);
r = dm_get_device(clone->ti, dm_shift_arg(as),
BLK_OPEN_READ | BLK_OPEN_WRITE, &clone->dest_dev);
if (r) {
*error = "Error opening destination device";
return r;
@ -1725,7 +1725,7 @@ static int parse_source_dev(struct clone *clone, struct dm_arg_set *as, char **e
int r;
sector_t source_dev_size;
r = dm_get_device(clone->ti, dm_shift_arg(as), FMODE_READ,
r = dm_get_device(clone->ti, dm_shift_arg(as), BLK_OPEN_READ,
&clone->source_dev);
if (r) {
*error = "Error opening source device";


@ -207,11 +207,10 @@ struct dm_table {
unsigned integrity_added:1;
/*
* Indicates the rw permissions for the new logical
* device. This should be a combination of FMODE_READ
* and FMODE_WRITE.
* Indicates the rw permissions for the new logical device. This
* should be a combination of BLK_OPEN_READ and BLK_OPEN_WRITE.
*/
fmode_t mode;
blk_mode_t mode;
/* a list of devices used by this table */
struct list_head devices;


@ -1693,8 +1693,7 @@ static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned int size)
len = (remaining_size > PAGE_SIZE) ? PAGE_SIZE : remaining_size;
bio_add_page(clone, page, len, 0);
__bio_add_page(clone, page, len, 0);
remaining_size -= len;
}


@ -1482,14 +1482,16 @@ static int era_ctr(struct dm_target *ti, unsigned int argc, char **argv)
era->ti = ti;
r = dm_get_device(ti, argv[0], FMODE_READ | FMODE_WRITE, &era->metadata_dev);
r = dm_get_device(ti, argv[0], BLK_OPEN_READ | BLK_OPEN_WRITE,
&era->metadata_dev);
if (r) {
ti->error = "Error opening metadata device";
era_destroy(era);
return -EINVAL;
}
r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &era->origin_dev);
r = dm_get_device(ti, argv[1], BLK_OPEN_READ | BLK_OPEN_WRITE,
&era->origin_dev);
if (r) {
ti->error = "Error opening data device";
era_destroy(era);


@ -293,8 +293,10 @@ static int __init dm_init_init(void)
for (i = 0; i < ARRAY_SIZE(waitfor); i++) {
if (waitfor[i]) {
dev_t dev;
DMINFO("waiting for device %s ...", waitfor[i]);
while (!dm_get_dev_t(waitfor[i]))
while (early_lookup_bdev(waitfor[i], &dev))
fsleep(5000);
}
}


@ -861,7 +861,7 @@ static void __dev_status(struct mapped_device *md, struct dm_ioctl *param)
table = dm_get_inactive_table(md, &srcu_idx);
if (table) {
if (!(dm_table_get_mode(table) & FMODE_WRITE))
if (!(dm_table_get_mode(table) & BLK_OPEN_WRITE))
param->flags |= DM_READONLY_FLAG;
param->target_count = table->num_targets;
}
@ -1189,7 +1189,7 @@ static int do_resume(struct dm_ioctl *param)
if (old_size && new_size && old_size != new_size)
need_resize_uevent = true;
if (dm_table_get_mode(new_map) & FMODE_WRITE)
if (dm_table_get_mode(new_map) & BLK_OPEN_WRITE)
set_disk_ro(dm_disk(md), 0);
else
set_disk_ro(dm_disk(md), 1);
@ -1378,12 +1378,12 @@ static int dev_arm_poll(struct file *filp, struct dm_ioctl *param, size_t param_
return 0;
}
static inline fmode_t get_mode(struct dm_ioctl *param)
static inline blk_mode_t get_mode(struct dm_ioctl *param)
{
fmode_t mode = FMODE_READ | FMODE_WRITE;
blk_mode_t mode = BLK_OPEN_READ | BLK_OPEN_WRITE;
if (param->flags & DM_READONLY_FLAG)
mode = FMODE_READ;
mode = BLK_OPEN_READ;
return mode;
}


@ -3750,11 +3750,11 @@ static int raid_message(struct dm_target *ti, unsigned int argc, char **argv,
* canceling read-auto mode
*/
mddev->ro = 0;
if (!mddev->suspended && mddev->sync_thread)
if (!mddev->suspended)
md_wakeup_thread(mddev->sync_thread);
}
set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
if (!mddev->suspended && mddev->thread)
if (!mddev->suspended)
md_wakeup_thread(mddev->thread);
return 0;


@ -1241,9 +1241,8 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
int i;
int r = -EINVAL;
char *origin_path, *cow_path;
dev_t origin_dev, cow_dev;
unsigned int args_used, num_flush_bios = 1;
fmode_t origin_mode = FMODE_READ;
blk_mode_t origin_mode = BLK_OPEN_READ;
if (argc < 4) {
ti->error = "requires 4 or more arguments";
@ -1253,7 +1252,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
if (dm_target_is_snapshot_merge(ti)) {
num_flush_bios = 2;
origin_mode = FMODE_WRITE;
origin_mode = BLK_OPEN_WRITE;
}
s = kzalloc(sizeof(*s), GFP_KERNEL);
@ -1279,24 +1278,21 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
ti->error = "Cannot get origin device";
goto bad_origin;
}
origin_dev = s->origin->bdev->bd_dev;
cow_path = argv[0];
argv++;
argc--;
cow_dev = dm_get_dev_t(cow_path);
if (cow_dev && cow_dev == origin_dev) {
ti->error = "COW device cannot be the same as origin device";
r = -EINVAL;
goto bad_cow;
}
r = dm_get_device(ti, cow_path, dm_table_get_mode(ti->table), &s->cow);
if (r) {
ti->error = "Cannot get COW device";
goto bad_cow;
}
if (s->cow->bdev && s->cow->bdev == s->origin->bdev) {
ti->error = "COW device cannot be the same as origin device";
r = -EINVAL;
goto bad_store;
}
r = dm_exception_store_create(ti, argc, argv, s, &args_used, &s->store);
if (r) {


@ -126,7 +126,7 @@ static int alloc_targets(struct dm_table *t, unsigned int num)
return 0;
}
int dm_table_create(struct dm_table **result, fmode_t mode,
int dm_table_create(struct dm_table **result, blk_mode_t mode,
unsigned int num_targets, struct mapped_device *md)
{
struct dm_table *t = kzalloc(sizeof(*t), GFP_KERNEL);
@ -304,7 +304,7 @@ static int device_area_is_invalid(struct dm_target *ti, struct dm_dev *dev,
* device and not to touch the existing bdev field in case
* it is accessed concurrently.
*/
static int upgrade_mode(struct dm_dev_internal *dd, fmode_t new_mode,
static int upgrade_mode(struct dm_dev_internal *dd, blk_mode_t new_mode,
struct mapped_device *md)
{
int r;
@ -323,24 +323,14 @@ static int upgrade_mode(struct dm_dev_internal *dd, fmode_t new_mode,
return 0;
}
/*
* Convert the path to a device
*/
dev_t dm_get_dev_t(const char *path)
{
dev_t dev;
if (lookup_bdev(path, &dev))
dev = name_to_dev_t(path);
return dev;
}
EXPORT_SYMBOL_GPL(dm_get_dev_t);
/*
* Add a device to the list, or just increment the usage count if
* it's already present.
*
* Note: the __ref annotation is because this function can call the __init
* marked early_lookup_bdev when called during early boot code from dm-init.c.
*/
int dm_get_device(struct dm_target *ti, const char *path, fmode_t mode,
int __ref dm_get_device(struct dm_target *ti, const char *path, blk_mode_t mode,
struct dm_dev **result)
{
int r;
@ -358,9 +348,13 @@ int dm_get_device(struct dm_target *ti, const char *path, fmode_t mode,
if (MAJOR(dev) != major || MINOR(dev) != minor)
return -EOVERFLOW;
} else {
dev = dm_get_dev_t(path);
if (!dev)
return -ENODEV;
r = lookup_bdev(path, &dev);
#ifndef MODULE
if (r && system_state < SYSTEM_RUNNING)
r = early_lookup_bdev(path, &dev);
#endif
if (r)
return r;
}
if (dev == disk_devt(t->md->disk))
return -EINVAL;
@ -668,7 +662,8 @@ int dm_table_add_target(struct dm_table *t, const char *type,
t->singleton = true;
}
if (dm_target_always_writeable(ti->type) && !(t->mode & FMODE_WRITE)) {
if (dm_target_always_writeable(ti->type) &&
!(t->mode & BLK_OPEN_WRITE)) {
ti->error = "target type may not be included in a read-only table";
goto bad;
}
@ -2039,7 +2034,7 @@ struct list_head *dm_table_get_devices(struct dm_table *t)
return &t->devices;
}
fmode_t dm_table_get_mode(struct dm_table *t)
blk_mode_t dm_table_get_mode(struct dm_table *t)
{
return t->mode;
}
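dm_table_get_mode() and dm_get_device() now traffic in blk_mode_t, so target constructors request their devices with BLK_OPEN_* flags. A hedged sketch of a constructor/destructor pair using the converted API; the target name and fields are illustrative:

#include <linux/device-mapper.h>
#include <linux/slab.h>

struct demo_target {
	struct dm_dev *dev;
};

static int demo_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	struct demo_target *dt;
	int r;

	if (argc != 1) {
		ti->error = "Invalid argument count";
		return -EINVAL;
	}

	dt = kzalloc(sizeof(*dt), GFP_KERNEL);
	if (!dt)
		return -ENOMEM;

	r = dm_get_device(ti, argv[0], BLK_OPEN_READ | BLK_OPEN_WRITE,
			  &dt->dev);
	if (r) {
		ti->error = "Device lookup failed";
		kfree(dt);
		return r;
	}

	ti->private = dt;
	return 0;
}

static void demo_dtr(struct dm_target *ti)
{
	struct demo_target *dt = ti->private;

	dm_put_device(ti, dt->dev);
	kfree(dt);
}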


@ -3300,7 +3300,7 @@ static int pool_ctr(struct dm_target *ti, unsigned int argc, char **argv)
unsigned long block_size;
dm_block_t low_water_blocks;
struct dm_dev *metadata_dev;
fmode_t metadata_mode;
blk_mode_t metadata_mode;
/*
* FIXME Remove validation from scope of lock.
@ -3333,7 +3333,8 @@ static int pool_ctr(struct dm_target *ti, unsigned int argc, char **argv)
if (r)
goto out_unlock;
metadata_mode = FMODE_READ | ((pf.mode == PM_READ_ONLY) ? 0 : FMODE_WRITE);
metadata_mode = BLK_OPEN_READ |
((pf.mode == PM_READ_ONLY) ? 0 : BLK_OPEN_WRITE);
r = dm_get_device(ti, argv[0], metadata_mode, &metadata_dev);
if (r) {
ti->error = "Error opening metadata block device";
@ -3341,7 +3342,7 @@ static int pool_ctr(struct dm_target *ti, unsigned int argc, char **argv)
}
warn_if_metadata_device_too_big(metadata_dev->bdev);
r = dm_get_device(ti, argv[1], FMODE_READ | FMODE_WRITE, &data_dev);
r = dm_get_device(ti, argv[1], BLK_OPEN_READ | BLK_OPEN_WRITE, &data_dev);
if (r) {
ti->error = "Error getting data device";
goto out_metadata;
@ -4222,7 +4223,7 @@ static int thin_ctr(struct dm_target *ti, unsigned int argc, char **argv)
goto bad_origin_dev;
}
r = dm_get_device(ti, argv[2], FMODE_READ, &origin_dev);
r = dm_get_device(ti, argv[2], BLK_OPEN_READ, &origin_dev);
if (r) {
ti->error = "Error opening origin device";
goto bad_origin_dev;


@ -607,7 +607,7 @@ int verity_fec_parse_opt_args(struct dm_arg_set *as, struct dm_verity *v,
(*argc)--;
if (!strcasecmp(arg_name, DM_VERITY_OPT_FEC_DEV)) {
r = dm_get_device(ti, arg_value, FMODE_READ, &v->fec->dev);
r = dm_get_device(ti, arg_value, BLK_OPEN_READ, &v->fec->dev);
if (r) {
ti->error = "FEC device lookup failed";
return r;


@ -1196,7 +1196,7 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
if (r)
goto bad;
if ((dm_table_get_mode(ti->table) & ~FMODE_READ)) {
if ((dm_table_get_mode(ti->table) & ~BLK_OPEN_READ)) {
ti->error = "Device must be readonly";
r = -EINVAL;
goto bad;
@ -1225,13 +1225,13 @@ static int verity_ctr(struct dm_target *ti, unsigned int argc, char **argv)
}
v->version = num;
r = dm_get_device(ti, argv[1], FMODE_READ, &v->data_dev);
r = dm_get_device(ti, argv[1], BLK_OPEN_READ, &v->data_dev);
if (r) {
ti->error = "Data device lookup failed";
goto bad;
}
r = dm_get_device(ti, argv[2], FMODE_READ, &v->hash_dev);
r = dm_get_device(ti, argv[2], BLK_OPEN_READ, &v->hash_dev);
if (r) {
ti->error = "Hash device lookup failed";
goto bad;


@ -577,7 +577,7 @@ static struct dmz_mblock *dmz_get_mblock_slow(struct dmz_metadata *zmd,
bio->bi_iter.bi_sector = dmz_blk2sect(block);
bio->bi_private = mblk;
bio->bi_end_io = dmz_mblock_bio_end_io;
bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
__bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
submit_bio(bio);
return mblk;
@ -728,7 +728,7 @@ static int dmz_write_mblock(struct dmz_metadata *zmd, struct dmz_mblock *mblk,
bio->bi_iter.bi_sector = dmz_blk2sect(block);
bio->bi_private = mblk;
bio->bi_end_io = dmz_mblock_bio_end_io;
bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
__bio_add_page(bio, mblk->page, DMZ_BLOCK_SIZE, 0);
submit_bio(bio);
return 0;
@ -752,7 +752,7 @@ static int dmz_rdwr_block(struct dmz_dev *dev, enum req_op op,
bio = bio_alloc(dev->bdev, 1, op | REQ_SYNC | REQ_META | REQ_PRIO,
GFP_NOIO);
bio->bi_iter.bi_sector = dmz_blk2sect(block);
bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
__bio_add_page(bio, page, DMZ_BLOCK_SIZE, 0);
ret = submit_bio_wait(bio);
bio_put(bio);
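dm-zoned allocates these bios with a single vec (see the bio_alloc() call above), so the unchecked __bio_add_page() can replace bio_add_page() and its ignored return value, matching the tree-wide conversion in this pull. A minimal sketch of the pattern with illustrative names:

#include <linux/bio.h>

static int demo_write_one_page_sync(struct block_device *bdev,
				    struct page *page, sector_t sector)
{
	struct bio *bio;
	int ret;

	bio = bio_alloc(bdev, 1, REQ_OP_WRITE | REQ_SYNC, GFP_NOIO);
	bio->bi_iter.bi_sector = sector;
	/* one vec was reserved in bio_alloc(), so this cannot fail */
	__bio_add_page(bio, page, PAGE_SIZE, 0);
	ret = submit_bio_wait(bio);
	bio_put(bio);
	return ret;
}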


@ -310,13 +310,13 @@ int dm_deleting_md(struct mapped_device *md)
return test_bit(DMF_DELETING, &md->flags);
}
static int dm_blk_open(struct block_device *bdev, fmode_t mode)
static int dm_blk_open(struct gendisk *disk, blk_mode_t mode)
{
struct mapped_device *md;
spin_lock(&_minor_lock);
md = bdev->bd_disk->private_data;
md = disk->private_data;
if (!md)
goto out;
@ -334,7 +334,7 @@ static int dm_blk_open(struct block_device *bdev, fmode_t mode)
return md ? 0 : -ENXIO;
}
static void dm_blk_close(struct gendisk *disk, fmode_t mode)
static void dm_blk_close(struct gendisk *disk)
{
struct mapped_device *md;
@ -448,7 +448,7 @@ static void dm_unprepare_ioctl(struct mapped_device *md, int srcu_idx)
dm_put_live_table(md, srcu_idx);
}
static int dm_blk_ioctl(struct block_device *bdev, fmode_t mode,
static int dm_blk_ioctl(struct block_device *bdev, blk_mode_t mode,
unsigned int cmd, unsigned long arg)
{
struct mapped_device *md = bdev->bd_disk->private_data;
@ -734,7 +734,7 @@ static char *_dm_claim_ptr = "I belong to device-mapper";
* Open a table device so we can use it as a map destination.
*/
static struct table_device *open_table_device(struct mapped_device *md,
dev_t dev, fmode_t mode)
dev_t dev, blk_mode_t mode)
{
struct table_device *td;
struct block_device *bdev;
@ -746,7 +746,7 @@ static struct table_device *open_table_device(struct mapped_device *md,
return ERR_PTR(-ENOMEM);
refcount_set(&td->count, 1);
bdev = blkdev_get_by_dev(dev, mode | FMODE_EXCL, _dm_claim_ptr);
bdev = blkdev_get_by_dev(dev, mode, _dm_claim_ptr, NULL);
if (IS_ERR(bdev)) {
r = PTR_ERR(bdev);
goto out_free_td;
@ -771,7 +771,7 @@ static struct table_device *open_table_device(struct mapped_device *md,
return td;
out_blkdev_put:
blkdev_put(bdev, mode | FMODE_EXCL);
blkdev_put(bdev, _dm_claim_ptr);
out_free_td:
kfree(td);
return ERR_PTR(r);
@ -784,14 +784,14 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
{
if (md->disk->slave_dir)
bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
blkdev_put(td->dm_dev.bdev, td->dm_dev.mode | FMODE_EXCL);
blkdev_put(td->dm_dev.bdev, _dm_claim_ptr);
put_dax(td->dm_dev.dax_dev);
list_del(&td->list);
kfree(td);
}
static struct table_device *find_table_device(struct list_head *l, dev_t dev,
fmode_t mode)
blk_mode_t mode)
{
struct table_device *td;
@ -802,7 +802,7 @@ static struct table_device *find_table_device(struct list_head *l, dev_t dev,
return NULL;
}
int dm_get_table_device(struct mapped_device *md, dev_t dev, fmode_t mode,
int dm_get_table_device(struct mapped_device *md, dev_t dev, blk_mode_t mode,
struct dm_dev **result)
{
struct table_device *td;


@ -203,7 +203,7 @@ int dm_open_count(struct mapped_device *md);
int dm_lock_for_deletion(struct mapped_device *md, bool mark_deferred, bool only_deferred);
int dm_cancel_deferred_remove(struct mapped_device *md);
int dm_request_based(struct mapped_device *md);
int dm_get_table_device(struct mapped_device *md, dev_t dev, fmode_t mode,
int dm_get_table_device(struct mapped_device *md, dev_t dev, blk_mode_t mode,
struct dm_dev **result);
void dm_put_table_device(struct mapped_device *md, struct dm_dev *d);

Some files were not shown because too many files have changed in this diff.