linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-12 08:09:56 +00:00

Author	SHA1	Message	Date
Alex Williamson	6a2a235aa6	Merge branches 'v5.13/vfio/embed-vfio_device', 'v5.13/vfio/misc' and 'v5.13/vfio/nvlink' into v5.13/vfio/next Spelling fixes merged with file deletion. Conflicts: drivers/vfio/pci/vfio_pci_nvlink2.c Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 12:01:51 -06:00
Jason Gunthorpe	1e04ec1420	vfio: Remove device_data from the vfio bus driver API There are no longer any users, so it can go away. Everything is using container_of now. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <14-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:11 -06:00
Jason Gunthorpe	07d47b4222	vfio/pci: Replace uses of vfio_device_data() with container_of This tidies a few confused places that think they can have a refcount on the vfio_device but the device_data could be NULL, that isn't possible by design. Most of the change falls out when struct vfio_devices is updated to just store the struct vfio_pci_device itself. This wasn't possible before because there was no easy way to get from the 'struct vfio_pci_device' to the 'struct vfio_device' to put back the refcount. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <13-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:11 -06:00
Jason Gunthorpe	6df62c5b05	vfio: Make vfio_device_ops pass a 'struct vfio_device *' instead of 'void *' This is the standard kernel pattern, the ops associated with a struct get the struct pointer in for typesafety. The expected design is to use container_of to cleanly go from the subsystem level type to the driver level type without having any type erasure in a void *. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <12-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:11 -06:00
Jason Gunthorpe	66873b5fa7	vfio/mdev: Make to_mdev_device() into a static inline The macro wrongly uses 'dev' as both the macro argument and the member name, which means it fails compilation if any caller uses a word other than 'dev' as the single argument. Fix this defect by making it into proper static inline, which is more clear and typesafe anyhow. Fixes: 99e3123e3d72 ("vfio-mdev: Make mdev_device private and abstract interfaces") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <11-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:11 -06:00
Jason Gunthorpe	1ae1b20f6f	vfio/mdev: Use vfio_init/register/unregister_group_dev mdev gets little benefit because it doesn't actually do anything, however it is the last user, so move the vfio_init/register/unregister_group_dev() code here for now. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <10-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:11 -06:00
Jason Gunthorpe	6b018e203d	vfio/pci: Use vfio_init/register/unregister_group_dev pci already allocates a struct vfio_pci_device with exactly the same lifetime as vfio_device, switch to the new API and embed vfio_device in vfio_pci_device. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <9-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	4aeec3984d	vfio/pci: Re-order vfio_pci_probe() vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_pci_reflck_attach() sets vdev->reflck and vfio_pci_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. Fixes: cc20d7999000 ("vfio/pci: Introduce VF token") Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release") Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state") Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <8-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	61e9081748	vfio/pci: Move VGA and VF initialization to functions vfio_pci_probe() is quite complicated, with optional VF and VGA sub components. Move these into clear init/uninit functions and have a linear flow in probe/remove. This fixes a few little buglets: - vfio_pci_remove() is in the wrong order, vga_client_register() removes a notifier and is after kfree(vdev), but the notifier refers to vdev, so it can use after free in a race. - vga_client_register() can fail but was ignored Organize things so destruction order is the reverse of creation order. Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <7-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	0ca78666fa	vfio/fsl-mc: Use vfio_init/register/unregister_group_dev fsl-mc already allocates a struct vfio_fsl_mc_device with exactly the same lifetime as vfio_device, switch to the new API and embed vfio_device in vfio_fsl_mc_device. While here remove the devm usage for the vdev, this code is clean and doesn't need devm. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <6-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	2b1fe162e5	vfio/fsl-mc: Re-order vfio_fsl_mc_probe() vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. This driver started life with the right sequence, but two commits added stuff after vfio_add_group_dev(). Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices") Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling") Co-developed-by: Diana Craciun OSS <diana.craciun@oss.nxp.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <5-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	cb61645868	vfio/platform: Use vfio_init/register/unregister_group_dev platform already allocates a struct vfio_platform_device with exactly the same lifetime as vfio_device, switch to the new API and embed vfio_device in vfio_platform_device. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <4-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	0bfc6a4ea6	vfio: Split creation of a vfio_device into init and register ops This makes the struct vfio_device part of the public interface so it can be used with container_of and so forth, as is typical for a Linux subsystem. This is the first step to bring some type-safety to the vfio interface by allowing the replacement of 'void *' and 'struct device *' inputs with a simple and clear 'struct vfio_device *' For now the self-allocating vfio_add_group_dev() interface is kept so each user can be updated as a separate patch. The expected usage pattern is driver core probe() function: my_device = kzalloc(sizeof(mydevice)); vfio_init_group_dev(&my_device->vdev, dev, ops, mydevice); /* other driver specific prep */ vfio_register_group_dev(&my_device->vdev); dev_set_drvdata(dev, my_device); driver core remove() function: my_device = dev_get_drvdata(dev); vfio_unregister_group_dev(&my_device->vdev); /* other driver specific tear down */ kfree(my_device); Allowing the driver to be able to use the drvdata and vfio_device to go to/from its own data. The pattern also makes it clear that vfio_register_group_dev() must be last in the sequence, as once it is called the core code can immediately start calling ops. The init/register gap is provided to allow for the driver to do setup before ops can be called and thus avoid races. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <3-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	5e42c99944	vfio: Simplify the lifetime logic for vfio_device The vfio_device is using a 'sleep until all refs go to zero' pattern for its lifetime, but it is indirectly coded by repeatedly scanning the group list waiting for the device to be removed on its own. Switch this around to be a direct representation, use a refcount to count the number of places that are blocking destruction and sleep directly on a completion until that counter goes to zero. kfree the device after other accesses have been excluded in vfio_del_group_dev(). This is a fairly common Linux idiom. Due to this we can now remove kref_put_mutex(), which is very rarely used in the kernel. Here it is being used to prevent a zero ref device from being seen in the group list. Instead allow the zero ref device to continue to exist in the device_list and use refcount_inc_not_zero() to exclude it once refs go to zero. This patch is organized so the next patch will be able to alter the API to allow drivers to provide the kfree. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:10 -06:00
Jason Gunthorpe	e572bfb2b6	vfio: Remove extra put/gets around vfio_device->group The vfio_device->group value has a get obtained during vfio_add_group_dev() which gets moved from the stack to vfio_device->group in vfio_group_create_device(). The reference remains until we reach the end of vfio_del_group_dev() when it is put back. Thus anything that already has a kref on the vfio_device is guaranteed a valid group pointer. Remove all the extra reference traffic. It is tricky to see, but the get at the start of vfio_del_group_dev() is actually pairing with the put hidden inside vfio_device_put() a few lines below. A later patch merges vfio_group_create_device() into vfio_add_group_dev() which makes the ownership and error flow on the create side easier to follow. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:55:09 -06:00
Christoph Hellwig	b392a19891	vfio/pci: remove vfio_pci_nvlink2 This driver never had any open userspace (which for VFIO would include VM kernel drivers) that use it, and thus should never have been added by our normal userspace ABI rules. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Message-Id: <20210326061311.1497642-2-hch@lst.de> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:54:13 -06:00
Shenming Lu	a536019d3e	vfio/type1: Remove the almost unused check in vfio_iommu_type1_unpin_pages The check i > npage at the end of vfio_iommu_type1_unpin_pages is unused unless npage < 0, but if npage < 0, this function will return npage, which should return -EINVAL instead. So let's just check the parameter npage at the start of the function. By the way, replace unpin_exit with break. Signed-off-by: Shenming Lu <lushenming@huawei.com> Message-Id: <20210406135009.1707-1-lushenming@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Zhen Lei	f5c858ec2b	vfio/platform: Fix spelling mistake "registe" -> "register" There is a spelling mistake in a comment, fix it. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210326083528.1329-5-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Zhen Lei	d0915b3291	vfio/pci: fix a couple of spelling mistakes There are several spelling mistakes, as follows: thru ==> through presense ==> presence Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210326083528.1329-4-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Zhen Lei	d0a7541dd9	vfio/mdev: Fix spelling mistake "interal" -> "internal" There is a spelling mistake in a comment, fix it. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210326083528.1329-3-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Zhen Lei	06d738c8ab	vfio/type1: fix a couple of spelling mistakes There are several spelling mistakes, as follows: userpsace ==> userspace Accouting ==> Accounting exlude ==> exclude Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Message-Id: <20210326083528.1329-2-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Fred Gao	bab2c1990b	vfio/pci: Add support for opregion v2.1+ Before opregion version 2.0 VBT data is stored in opregion mailbox #4, but when VBT data exceeds 6KB size and cannot be within mailbox #4 then from opregion v2.0+, Extended VBT region, next to opregion is used to hold the VBT data, so the total size will be opregion size plus extended VBT region size. Since opregion v2.0 with physical host VBT address would not be practically available for end user and guest can not directly access host physical address, so it is not supported. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Swee Yee Fonn <swee.yee.fonn@intel.com> Signed-off-by: Fred Gao <fred.gao@intel.com> Message-Id: <20210325170953.24549-1-fred.gao@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Zhou Wang	36f0be5a30	vfio/pci: Remove an unnecessary blank line in vfio_pci_enable This blank line is unnecessary, so remove it. Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com> Message-Id: <1615808073-178604-1-git-send-email-wangzhou1@hisilicon.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:50 -06:00
Bhaskar Chowdhury	fbc9d37161	vfio: pci: Spello fix in the file vfio_pci.c s/permision/permission/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Message-Id: <20210314052925.3560-1-unixbhaskar@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-04-06 11:53:49 -06:00
Jason Gunthorpe	e0146a108c	vfio/nvlink: Add missing SPAPR_TCE_IOMMU depends Compiling the nvlink stuff relies on the SPAPR_TCE_IOMMU otherwise there are compile errors: drivers/vfio/pci/vfio_pci_nvlink2.c:101:10: error: implicit declaration of function 'mm_iommu_put' [-Werror,-Wimplicit-function-declaration] ret = mm_iommu_put(data->mm, data->mem); As PPC only defines these functions when the config is set. Previously this wasn't a problem by chance as SPAPR_TCE_IOMMU was the only IOMMU that could have satisfied IOMMU_API on POWERNV. Fixes: 179209fa1270 ("vfio: IOMMU_API should be selected") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-83dba9768fc3+419-vfio_nvlink2_kconfig_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-29 14:48:00 -06:00
Daniel Jordan	60c988bc15	vfio/type1: Empty batch for pfnmap pages When vfio_pin_pages_remote() returns with a partial batch consisting of a single VM_PFNMAP pfn, a subsequent call will unfortunately try restoring it from batch->pages, resulting in vfio mapping the wrong page and unbalancing the page refcount. Prevent the function from returning with this kind of partial batch to avoid the issue. There's no explicit check for a VM_PFNMAP pfn because it's awkward to do so, so infer it from characteristics of the batch instead. This may result in occasional false positives but keeps the code simpler. Fixes: 4d83de6da265 ("vfio/type1: Batch page pinning") Link: https://lkml.kernel.org/r/20210323133254.33ed9161@omen.home.shazbot.org/ Reported-by: Alex Williamson <alex.williamson@redhat.com> Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Message-Id: <20210325010552.185481-1-daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-25 12:48:38 -06:00
Daniel Jordan	4ab4fcfce5	vfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external() vaddr_get_pfns() now returns the positive number of pfns successfully gotten instead of zero. vfio_pin_page_external() might return 1 to vfio_iommu_type1_pin_pages(), which will treat it as an error, if vaddr_get_pfns() is successful but vfio_pin_page_external() doesn't reach vfio_lock_acct(). Fix it up in vfio_pin_page_external(). Found by inspection. Fixes: be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Message-Id: <20210308172452.38864-1-daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:29 -06:00
Jason Gunthorpe	b2b12db535	vfio: Depend on MMU VFIO_IOMMU_TYPE1 does not compile with !MMU: ../drivers/vfio/vfio_iommu_type1.c: In function 'follow_fault_pfn': ../drivers/vfio/vfio_iommu_type1.c:536:22: error: implicit declaration of function 'pte_write'; did you mean 'vfs_write'? [-Werror=implicit-function-declaration] So require it. Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-02cb5500df6e+78-vfio_no_mmu_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:28 -06:00
Jason Gunthorpe	3b49dfb08c	ARM: amba: Allow some ARM_AMBA users to compile with COMPILE_TEST CONFIG_VFIO_AMBA has a light use of AMBA, adding some inline fallbacks when AMBA is disabled will allow it to be compiled under COMPILE_TEST and make VFIO easier to maintain. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <3-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:28 -06:00
Jason Gunthorpe	d3d72a6dff	vfio-platform: Add COMPILE_TEST to VFIO_PLATFORM x86 can build platform bus code too, so vfio-platform and all the platform reset implementations compile successfully on x86. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:28 -06:00
Jason Gunthorpe	179209fa12	vfio: IOMMU_API should be selected As IOMMU_API is a kconfig without a description (eg does not show in the menu) the correct operator is select not 'depends on'. Using 'depends on' for this kind of symbol means VFIO is not selectable unless some other random kconfig has already enabled IOMMU_API for it. Fixes: cba3345cc494 ("vfio: VFIO core") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:27 -06:00
Steve Sistare	7dc4b2fdb2	vfio/type1: fix unmap all on ILP32 Some ILP32 architectures support mapping a 32-bit vaddr within a 64-bit iova space. The unmap-all code uses 32-bit SIZE_MAX as an upper bound on the extent of the mappings within iova space, so mappings above 4G cannot be found and unmapped. Use U64_MAX instead, and use u64 for size variables. This also fixes a static analysis bug found by the kernel test robot running smatch for ILP32. Fixes: 0f53afa12bae ("vfio/type1: unmap cleanup") Fixes: c19650995374 ("vfio/type1: implement unmap all") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Message-Id: <1614281102-230747-1-git-send-email-steven.sistare@oracle.com> Link: https://lore.kernel.org/linux-mm/20210222141043.GW2222@kadam Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-03-16 10:39:27 -06:00
Linus Torvalds	719bbd4a50	VFIO updates for v5.12-rc1 - Virtual address update handling (Steve Sistare) - s390/zpci fixes and cleanups (Max Gurtovoy) - Fixes for dirty bitmap handling, non-mdev page pinning, and improved pinned dirty scope tracking (Keqian Zhu) - Batched page pinning enhancement (Daniel Jordan) - Page access permission fix (Alex Williamson) Merge tag 'vfio-v5.12-rc1' of git://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - Virtual address update handling (Steve Sistare) - s390/zpci fixes and cleanups (Max Gurtovoy) - Fixes for dirty bitmap handling, non-mdev page pinning, and improved pinned dirty scope tracking (Keqian Zhu) - Batched page pinning enhancement (Daniel Jordan) - Page access permission fix (Alex Williamson) * tag 'vfio-v5.12-rc1' of git://github.com/awilliam/linux-vfio: (21 commits) vfio/type1: Batch page pinning vfio/type1: Prepare for batched pinning with struct vfio_batch vfio/type1: Change success value of vaddr_get_pfn() vfio/type1: Use follow_pte() vfio/pci: remove CONFIG_VFIO_PCI_ZDEV from Kconfig vfio/iommu_type1: Fix duplicate included kthread.h vfio-pci/zdev: fix possible segmentation fault issue vfio-pci/zdev: remove unused vdev argument vfio/pci: Fix handling of pci use accessor return codes vfio/iommu_type1: Mantain a counter for non_pinned_groups vfio/iommu_type1: Fix some sanity checks in detach group vfio/iommu_type1: Populate full dirty when detach non-pinned group vfio/type1: block on invalid vaddr vfio/type1: implement notify callback vfio: iommu driver notify callback vfio/type1: implement interfaces to update vaddr vfio/type1: massage unmap iteration vfio: interfaces to update vaddr vfio/type1: implement unmap all vfio/type1: unmap cleanup ...	2021-02-24 10:43:40 -08:00
Daniel Jordan	4d83de6da2	vfio/type1: Batch page pinning Pinning one 4K page at a time is inefficient, so do it in batches of 512 instead. This is just an optimization with no functional change intended, and in particular the driver still calls iommu_map() with the largest physically contiguous range possible. Add two fields in vfio_batch to remember where to start between calls to vfio_pin_pages_remote(), and use vfio_batch_unpin() to handle remaining pages in the batch in case of error. qemu pins pages for guests around 8% faster on my test system, a two-node Broadwell server with 128G memory per node. The qemu process was bound to one node with its allocations constrained there as well. base test guest ---------------- ---------------- mem (GB) speedup avg sec (std) avg sec (std) 1 7.4% 0.61 (0.00) 0.56 (0.00) 2 8.3% 0.93 (0.00) 0.85 (0.00) 4 8.4% 1.46 (0.00) 1.34 (0.00) 8 8.6% 2.54 (0.01) 2.32 (0.00) 16 8.3% 4.66 (0.00) 4.27 (0.01) 32 8.3% 8.94 (0.01) 8.20 (0.01) 64 8.2% 17.47 (0.01) 16.04 (0.03) 120 8.5% 32.45 (0.13) 29.69 (0.01) perf diff confirms less time spent in pup. Here are the top ten functions: Baseline Delta Abs Symbol 78.63% +6.64% clear_page_erms 1.50% -1.50% __gup_longterm_locked 1.27% -0.78% __get_user_pages +0.76% kvm_zap_rmapp.constprop.0 0.54% -0.53% vmacache_find 0.55% -0.51% get_pfnblock_flags_mask 0.48% -0.48% __get_user_pages_remote +0.39% slot_rmap_walk_next +0.32% vfio_pin_map_dma +0.26% kvm_handle_hva_range ... Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-22 16:30:47 -07:00
Daniel Jordan	4b6c33b322	vfio/type1: Prepare for batched pinning with struct vfio_batch Get ready to pin more pages at once with struct vfio_batch, which represents a batch of pinned pages. The struct has a fallback page pointer to avoid two unlikely scenarios: pointlessly allocating a page if disable_hugepages is enabled or failing the whole pinning operation if the kernel can't allocate memory. vaddr_get_pfn() becomes vaddr_get_pfns() to prepare for handling multiple pages, though for now only one page is stored in the pages array. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-22 16:30:45 -07:00
Daniel Jordan	be16c1fd99	vfio/type1: Change success value of vaddr_get_pfn() vaddr_get_pfn() simply returns 0 on success. Have it report the number of pfns successfully gotten instead, whether from page pinning or follow_fault_pfn(), which will be used later when batching pinning. Change the last check in vfio_pin_pages_remote() for consistency with the other two. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-22 16:30:44 -07:00
Alex Williamson	07956b6269	vfio/type1: Use follow_pte() follow_pfn() doesn't make sure that we're using the correct page protections, get the pte with follow_pte() so that we can test protections and get the pfn from the pte. Fixes: 5cbf3264bc71 ("vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()") Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-22 10:17:13 -07:00
Max Gurtovoy	b9abef43a0	vfio/pci: remove CONFIG_VFIO_PCI_ZDEV from Kconfig In case we're running on s390 system always expose the capabilities for configuration of zPCI devices. In case we're running on different platform, continue as usual. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-19 10:29:56 -07:00
Tian Tao	35ac5991cd	vfio/iommu_type1: Fix duplicate included kthread.h linux/kthread.h is included more than once, remove the one that isn't necessary. Fixes: 898b9eaeb3fe ("vfio/type1: block on invalid vaddr") Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-18 12:25:37 -07:00
Alex Williamson	76adb20f92	Merge branch 'v5.12/vfio/next-vaddr' into v5.12/vfio/next	2021-02-02 09:17:48 -07:00
Max Gurtovoy	7e31d6dc2c	vfio-pci/zdev: fix possible segmentation fault issue In case allocation fails, we must behave correctly and exit with error. Fixes: e6b817d4b821 ("vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-02 09:06:02 -07:00
Uwe Kleine-König	3fd269e74f	amba: Make the remove callback return void All amba drivers return 0 in their remove callback. Together with the driver core ignoring the return value anyhow, it doesn't make sense to return a value here. Change the remove prototype to return void, which makes it explicit that returning an error value doesn't work as expected. This simplifies changing the core remove callback to return void, too. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> # for drivers/memory Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> # for hwtracing/coresight Acked-By: Vinod Koul <vkoul@kernel.org> # for dmaengine Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Takashi Iwai <tiwai@suse.de> # for sound Acked-by: Vladimir Zapolskiy <vz@mleia.com> # for memory/pl172 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210126165835.687514-5-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>	2021-02-02 14:25:50 +01:00
Uwe Kleine-König	5b495ac8fe	vfio: platform: simplify device removal vfio_platform_remove_common() cannot return non-NULL in vfio_amba_remove() as the latter is only called if vfio_amba_probe() returned success. Diagnosed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210126165835.687514-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>	2021-02-02 14:24:23 +01:00
Max Gurtovoy	46c4746660	vfio-pci/zdev: remove unused vdev argument Zdev static functions do not use vdev argument. Remove it. Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:43:06 -07:00
Heiner Kallweit	37a682ffbe	vfio/pci: Fix handling of pci use accessor return codes The pci user accessors return negative errno's on error. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> [aw: drop Fixes tag, pcibios_err_to_errno() behaves correctly for -errno] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:40:52 -07:00
Keqian Zhu	010321565a	vfio/iommu_type1: Mantain a counter for non_pinned_groups With this counter, we never need to traverse all groups to update pinned_scope of vfio_iommu. Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:40:52 -07:00
Keqian Zhu	4a19f37a3d	vfio/iommu_type1: Fix some sanity checks in detach group vfio_sanity_check_pfn_list() is used to check whether pfn_list and notifier are empty when remove the external domain, so it makes a wrong assumption that only external domain will use the pinning interface. Now we apply the pfn_list check when a vfio_dma is removed and apply the notifier check when all domains are removed. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:40:52 -07:00
Keqian Zhu	d0a78f9176	vfio/iommu_type1: Populate full dirty when detach non-pinned group If a group with non-pinned-page dirty scope is detached with dirty logging enabled, we should fully populate the dirty bitmaps at the time it's removed since we don't know the extent of its previous DMA, nor will the group be present to trigger the full bitmap when the user retrieves the dirty bitmap. Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking") Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:40:52 -07:00
Steve Sistare	898b9eaeb3	vfio/type1: block on invalid vaddr Block translation of host virtual address while an iova range has an invalid vaddr. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:20:07 -07:00
Steve Sistare	487ace1340	vfio/type1: implement notify callback Implement a notify callback that remembers if the container's file descriptor has been closed. Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2021-02-01 13:20:07 -07:00
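
The init/register split and the container_of pattern described in commits 0bfc6a4ea6 and 6df62c5b05 above can be illustrated with a small user-space sketch. This is not the kernel API: the `container_of` macro here is a simplified stand-in, the `struct vfio_device` layout is invented for illustration, and `sketch_init()`, `sketch_register()`, `sketch_unregister()`, `my_probe()` and `my_remove()` are hypothetical names. The sketch only shows the ordering (init, driver setup, register last) and how an op recovers the driver structure from the embedded member without a `void *`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct vfio_device;

/* Ops receive the subsystem-level struct pointer, not a void pointer. */
struct vfio_device_ops {
	int (*open)(struct vfio_device *vdev);
};

/* Hypothetical, heavily reduced layout; the real struct has more fields. */
struct vfio_device {
	const struct vfio_device_ops *ops;
	int registered;
};

/* Driver-level type with the subsystem struct embedded in it. */
struct my_device {
	int private_state;
	struct vfio_device vdev;
};

static int my_open(struct vfio_device *vdev)
{
	/* Go from the embedded member back to the containing driver struct. */
	struct my_device *my = container_of(vdev, struct my_device, vdev);

	return my->private_state;
}

static const struct vfio_device_ops my_ops = { .open = my_open };

/* Hypothetical stand-ins for the init/register/unregister calls. */
static void sketch_init(struct vfio_device *vdev,
			const struct vfio_device_ops *ops)
{
	vdev->ops = ops;
	vdev->registered = 0;
}

static void sketch_register(struct vfio_device *vdev)
{
	assert(vdev->ops != NULL);	/* init must come before register */
	vdev->registered = 1;		/* from here on, ops may be called */
}

static void sketch_unregister(struct vfio_device *vdev)
{
	vdev->registered = 0;
}

static struct my_device *my_probe(int state)
{
	struct my_device *my = calloc(1, sizeof(*my));

	if (!my)
		return NULL;
	sketch_init(&my->vdev, &my_ops);
	my->private_state = state;	/* driver specific prep */
	sketch_register(&my->vdev);	/* last step of probe */
	return my;
}

static void my_remove(struct my_device *my)
{
	sketch_unregister(&my->vdev);	/* first step of remove */
	free(my);			/* driver owns the allocation */
}
```

Calling `my_probe(42)` and then `my->vdev.ops->open(&my->vdev)` returns the private state through the embedded pointer; `my_remove()` tears down in the reverse order of creation.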


737 Commits
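
The macro defect described in commit 66873b5fa7 (the argument name doubling as the member name) can be reproduced outside the kernel. The sketch below assumes reduced stand-in definitions of `struct device` and `struct mdev_device` and a simplified `container_of`; the commented-out macro shows the general shape of the problem rather than the exact original source line.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Reduced stand-ins; the real kernel structs have many more fields. */
struct device {
	int id;
};

struct mdev_device {
	int index;
	struct device dev;
};

/*
 * Problematic shape: the parameter is named 'dev', so the preprocessor
 * also substitutes it into the member-name position. A call such as
 * to_mdev_device_macro(parent) would expand to
 * container_of(parent, struct mdev_device, parent), which does not
 * compile because there is no member called 'parent'.
 *
 * #define to_mdev_device_macro(dev) \
 *	container_of(dev, struct mdev_device, dev)
 */

/* The static inline form: the member name is fixed, the argument is typed. */
static inline struct mdev_device *to_mdev_device(struct device *dev)
{
	return container_of(dev, struct mdev_device, dev);
}
```

With the inline function, a caller can pass a `struct device *` held in a variable of any name, and passing a pointer of another type is diagnosed by the compiler.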