Commit Graph

686 Commits

Author SHA1 Message Date
Tony Krowiak
7847a19b5b s390/vfio-ap: check for TAPQ response codes 0x35 and 0x36
Check for response codes 0x35 and 0x36 which are asynchronous return codes
indicating a failure of the guest to associate a secret with a queue. Since
there can be no interaction with this queue from the guest (i.e., the vcpus
are out of SIE for hot unplug, the guest is being shut down or an emulated
subsystem reset of the guest is taking place), let's go ahead and re-issue
the ZAPQ to reset and zeroize the queue.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-10-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:29 +02:00
Tony Krowiak
e1f17f8ea9 s390/vfio-ap: handle queue state change in progress on reset
A new APQSW response code (0xA) indicating the designated queue is in the
process of being bound or associated to a configuration may be returned
from the PQAP(ZAPQ) command. This patch introduces code that will verify
when the PQAP(ZAPQ) command can be re-issued after receiving response code
0xA.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-9-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:29 +02:00
Tony Krowiak
9261f04388 s390/vfio-ap: use work struct to verify queue reset
Instead of waiting to verify that a queue is reset in the
vfio_ap_mdev_reset_queue function, let's use a wait queue to check the
the state of the reset. This way, when resetting all of the queues assigned
to a matrix mdev, we don't have to wait for each queue to be reset before
initiating a reset on the next queue to be reset.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-8-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:29 +02:00
Tony Krowiak
62aab082e9 s390/vfio-ap: store entire AP queue status word with the queue object
Store the entire AP queue status word returned from the ZAPQ command with
the struct vfio_ap_queue object instead of just the response code field.
The other information contained in the status word is need by the
apq_reset_check function to display a proper message to indicate that the
vfio_ap driver is waiting for the ZAPQ to complete because the queue is
not empty or IRQs are still enabled.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-7-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:29 +02:00
Tony Krowiak
dd174833e4 s390/vfio-ap: remove upper limit on wait for queue reset to complete
The architecture does not define an upper limit on how long a queue reset
(RAPQ/ZAPQ) can take to complete. In order to ensure both the security
requirements and prevent resource leakage and corruption in the hypervisor,
it is necessary to remove the upper limit (200ms) the vfio_ap driver
currently waits for a reset to complete. This, of course, may result in a
hang which is a less than desirable user experience, but until a firmware
solution is provided, this is a necessary evil.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-6-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:28 +02:00
Tony Krowiak
c51f8c6bb5 s390/vfio-ap: allow deconfigured queue to be passed through to a guest
When a queue is reset, the status response code returned from the reset
operation is stored in the reset_rc field of the vfio_ap_queue structure
representing the queue being reset. This field is later used to decide
whether the queue should be passed through to a guest. If the reset_rc
field is a non-zero value, the queue will be filtered from the list of
queues passed through.

When an adapter is deconfigured, all queues associated with that adapter
are reset. That being the case, it is not necessary to filter those queues;
so, if the status response code returned from a reset operation indicates
the queue is deconfigured, the reset_rc field of the vfio_ap_queue
structure will be set to zero so it will be passed through (i.e., not
filtered).

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-5-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:28 +02:00
Tony Krowiak
411b0109da s390/vfio-ap: wait for response code 05 to clear on queue reset
Response code 05, AP busy, is a valid response code for a ZAPQ or TAPQ.
Instead of returning error -EIO when a ZAPQ fails with response code 05,
let's wait until the queue is no longer busy and try the ZAPQ again.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-4-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:28 +02:00
Tony Krowiak
7aa7b2a80c s390/vfio-ap: clean up irq resources if possible
The architecture does not specify whether interrupts are disabled as part
of the asynchronous reset or upon return from the PQAP/ZAPQ instruction.
If, however, PQAP/ZAPQ completes with APQSW response code 0 and the
interrupt bit in the status word is also 0, we know the interrupts are
disabled and we can go ahead and clean up the corresponding resources;
otherwise, we must wait until the asynchronous reset has completed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-3-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:28 +02:00
Tony Krowiak
680b7ddd7e s390/vfio-ap: no need to check the 'E' and 'I' bits in APQSW after TAPQ
After a ZAPQ is executed to reset a queue, if the queue is not empty or
interrupts are still enabled, the vfio_ap driver will wait for the reset
operation to complete by repeatedly executing the TAPQ instruction and
checking the 'E' and 'I' bits in the APQSW to verify that the queue is
empty and interrupts are disabled. This is unnecessary because it is
sufficient to check only the response code in the APQSW. If the reset is
still in progress, the response code will be 02; however, if the reset has
completed successfully, the response code will be 00.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com>
Link: https://lore.kernel.org/r/20230815184333.6554-2-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:09:28 +02:00
Holger Dengler
386cb81e4b s390/zcrypt_ep11misc: support API ordinal 6 with empty pin-blob
Secure execution guest environments require an empty pinblob in all
key generation and unwrap requests. Empty pinblobs are only available
in EP11 API ordinal 6 or higher.

Add an empty pinblob to key generation and unwrap requests, if the AP
secure binding facility is available. In all other cases, stay with
the empty pin tag (no pinblob) and the current API ordinals.

The EP11 API ordinal also needs to be considered when the pkey module
tries to figure out the list of eligible cards for key operations
with protected keys in secure execution environment.

These changes are transparent to userspace but required for running
an secure execution guest with handling key generate and key derive
(e.g. secure key to protected key) correct. Especially using EP11
secure keys with the kernel dm-crypt layer requires this patch.

Co-developed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18 15:07:57 +02:00
Holger Dengler
b9352e4b9b s390/pkey: fix PKEY_TYPE_EP11_AES handling for sysfs attributes
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced a new PKEY_TYPE_EP11_AES securekey type as
a supplement to the existing PKEY_TYPE_EP11 (which won't work in
environments with session-bound keys). The pkey EP11 securekey
attributes use PKEY_TYPE_EP11_AES (instead of PKEY_TYPE_EP11)
keyblobs, to make the generated keyblobs usable also in environments,
where session-bound keys are required.

There should be no negative impacts to userspace because the internal
structure of the keyblobs is opaque. The increased size of the
generated keyblobs is reflected by the changed size of the attributes.

Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:53 +02:00
Holger Dengler
745742dbca s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_VERIFYKEY2 IOCTL
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced a new PKEY_TYPE_EP11_AES type for the
PKEY_VERIFYKEY2 IOCTL to verify keyblobs of this type. Unfortunately,
all PKEY_VERIFYKEY2 IOCTL requests with keyblobs of this type return
with an error (-EINVAL). Fix PKEY_TYPE_EP11_AES handling in
PKEY_VERIFYKEY2 IOCTL, so that userspace can verify keyblobs of this
type.

Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:53 +02:00
Holger Dengler
d1fdfb0b2f s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_KBLOB2PROTK[23]
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced a new PKEY_TYPE_EP11_AES type for the
PKEY_KBLOB2PROTK2 and a new IOCTL, PKEY_KBLOB2PROTK3, which both
allows userspace to convert opaque securekey blobs of this type into
protectedkey blobs. Unfortunately, all PKEY_KBLOB2PROTK2 and
PKEY_KBLOB2PROTK3 IOCTL requests with this keyblobs of this type
return with an error (-EINVAL). Fix PKEY_TYPE_EP11_AES handling in
PKEY_KBLOB2PROTK2 and PKEY_KBLOB2PROTK3 IOCTLs, so that userspace can
convert PKEY_TYPE_EP11_AES keyblobs into protectedkey blobs.

Add a helper function to decode the start and size of the internal
header as well as start and size of the keyblob payload of an existing
keyblob. Also validate the length of header and keyblob, as well as
the keyblob magic.

Introduce another helper function, which handles a raw key wrapping
request and do the keyblob decoding in the calling function. Remove
all other header-related calculations.

Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:53 +02:00
Holger Dengler
da2863f159 s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_CLR2SECK2 IOCTL
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced PKEY_TYPE_EP11_AES for the PKEY_CLR2SECK2
IOCTL to convert an AES clearkey into a securekey of this type.
Unfortunately, all PKEY_CLR2SECK2 IOCTL requests with type
PKEY_TYPE_EP11_AES return with an error (-EINVAL). Fix the handling
for PKEY_TYPE_EP11_AES in PKEY_CLR2SECK2 IOCTL, so that userspace can
convert clearkey blobs into PKEY_TYPE_EP11_AES securekey blobs.

Cc: stable@vger.kernel.org # v5.10+
Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:53 +02:00
Holger Dengler
fb249ce7f7 s390/pkey: fix PKEY_TYPE_EP11_AES handling in PKEY_GENSECK2 IOCTL
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced PKEY_TYPE_EP11_AES for the PKEY_GENSECK2
IOCTL, to enable userspace to generate securekey blobs of this
type. Unfortunately, all PKEY_GENSECK2 IOCTL requests for
PKEY_TYPE_EP11_AES return with an error (-EINVAL). Fix the handling
for PKEY_TYPE_EP11_AES in PKEY_GENSECK2 IOCTL, so that userspace can
generate securekey blobs of this type.

The start of the header and the keyblob, as well as the length need
special handling, depending on the internal keyversion. Add a helper
function that splits an uninitialized buffer into start and size of
the header as well as start and size of the payload, depending on the
requested keyversion.

Do the header-related calculations and the raw genkey request handling
in separate functions. Use the raw genkey request function for
internal purposes.

Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:53 +02:00
Holger Dengler
37a08f010b s390/pkey: fix/harmonize internal keyblob headers
Commit 'fa6999e326fe ("s390/pkey: support CCA and EP11 secure ECC
private keys")' introduced PKEY_TYPE_EP11_AES as a supplement to
PKEY_TYPE_EP11. All pkeys have an internal header/payload structure,
which is opaque to the userspace. The header structures for
PKEY_TYPE_EP11 and PKEY_TYPE_EP11_AES are nearly identical and there
is no reason, why different structures are used. In preparation to fix
the keyversion handling in the broken PKEY IOCTLs, the same header
structure is used for PKEY_TYPE_EP11 and PKEY_TYPE_EP11_AES. This
reduces the number of different code paths and increases the
readability.

Fixes: fa6999e326 ("s390/pkey: support CCA and EP11 secure ECC private keys")
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-17 15:18:52 +02:00
Yi Liu
8cfa718602 vfio-iommufd: Add detach_ioas support for emulated VFIO devices
This prepares for adding DETACH ioctl for emulated VFIO devices.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yanting Jiang <yanting.jiang@intel.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20230718135551.6592-16-yi.l.liu@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-07-25 10:19:18 -06:00
Harald Freudenberger
5ac8c72462 s390/zcrypt: remove CEX2 and CEX3 device drivers
Remove the legacy device driver code for CEX2 and CEX3 cards.

The last machines which are able to handle CEX2 crypto cards
are z10 EC first available 2008 and z10 BC first available 2009.
The last machines able to handle a CEX3 crypto card are
z196 first available 2010 and z114 first available 2011.

Please note that this does not imply to drop CEX2 and CEX3
support in general. With older kernels on hardware up to the
aforementioned machine models these crypto cards will get
support by IBM.

The removal of the CEX2 and CEX3 device drivers code opens up
some simplifications, for example support for crypto cards
without rng support can be removed also.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-24 12:12:22 +02:00
Harald Freudenberger
4cfca532dd s390/zcrypt: fix reply buffer calculations for CCA replies
The length information for available buffer space for CCA
replies is covered with two fields in the T6 header prepended
on each CCA reply: fromcardlen1 and fromcardlen2. The sum of
these both values must not exceed the AP bus limit for this
card (24KB for CEX8, 12KB CEX7 and older) minus the always
present headers.

The current code adjusted the fromcardlen2 value in case
of exceeding the AP bus limit when there was a non-zero
value given from userspace. Some tests now showed that this
was the wrong assumption. Instead the userspace value given for
this field should always be trusted and if the sum of the
two fields exceeds the AP bus limit for this card the first
field fromcardlen1 should be adjusted instead.

So now the calculation is done with this new insight in mind.
Also some additional checks for overflow have been introduced
and some comments to provide some documentation for future
maintainers of this complicated calculation code.

Furthermore the 128 bytes of fix overhead which is used
in the current code is not correct. Investigations showed
that for a reply always the same two header structs are
prepended before a possible payload. So this is also fixed
with this patch.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-20 16:48:56 +02:00
Heiko Carstens
cada938a01 s390: fix various typos
Fix various typos found with codespell.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:42 +02:00
Harald Freudenberger
2b70a11955 s390/zcrypt: remove ZCRYPT_MULTIDEVNODES kernel config option
Remove ZCRYPT_MULTIDEVNODES kernel config option and make
the dependent code always build.

The last years showed, that this option is enabled on all distros
and exploited by some features (for example CEX plugin for kubernetes).
So remove this choice as it was never used to switch off the multiple
devices support for the zcrypt device driver.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:41 +02:00
Harald Freudenberger
af40322e90 s390/zcrypt: do not retry administrative requests
All kind of administrative requests should not been retried. Some card
firmware detects this and assumes a replay attack. This patch checks
on failure if the low level functions indicate a retry (EAGAIN) and
checks for the ADMIN flag set on the request message.  If this both
are true, the response code for this message is changed to EIO to make
sure the zcrypt API layer does not attempt to retry the request. As of
now the ADMIN flag is set for a request message when
- for EP11 the field 'flags' of the EP11 CPRB struct has the leftmost
  bit set.
- for CCA when the CPRB minor version is 'T3', 'T5', 'T6' or 'T7'.

Please note that the do-not-retry only applies to a request
which has been sent to the card (= has been successfully enqueued) but
the reply indicates some kind of failure and by default it would be
replied. It is totally fine to retry a request if a previous attempt
to enqueue the msg into the firmware queue had some kind of failure
and thus the card has never seen this request.

Reported-by: Frank Uhlig <Frank.Uhlig1@ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:41 +02:00
Harald Freudenberger
0fdcc88bb9 s390/zcrypt: cleanup some debug code
This patch removes most of the debug code which
is build in when CONFIG_ZCRYPT_DEBUG is enabled.
There is no real exploiter for this code any more and
at least one ioctl fails with this code enabled.

The CONFIG_ZCRYPT_DEBUG kernel config option still
makes sense as some debug sysfs entries can get
enabled with this and maybe long term a new better
designed debug and error injection way will get
introduced.

This patch only removes code surrounded by the named
kernel config option. This option should by default
always be off anyway. The structs and defines removed
by the patch have been used only by code surrounded
by a CONFIG_ZCRYPT_DEBUG ifdef and thus can be removed
also.

In the end this patch removes all the failure-injection
possibilities which had been available when the kernel
had been build with CONFIG_ZCRYPT_DEBUG. It has never
been used that much and was too unflexible anyway.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-07-03 11:19:41 +02:00
Heiko Carstens
13cf06d57f s390/zcrypt: use kvmalloc_array() instead of kzalloc()
zcrypt_unlocked_ioctl() allocates 256k with kzalloc() which is likely to
fail if memory is fragmented. To avoid that use kvmalloc_array() instead,
like it is done at several other places for the same reason.

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-28 13:57:09 +02:00
Tony Krowiak
2e3d8d71e2 s390/vfio-ap: wire in the vfio_device_ops request callback
The mdev device is being removed, so pass the request to userspace to
ask for a graceful cleanup. This should free up the thread that
would otherwise loop waiting for the device to be fully released.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Link: https://lore.kernel.org/r/20230530223538.279198-4-akrowiak@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-06 13:42:07 +02:00
Tony Krowiak
bf48961f6f s390/vfio-ap: realize the VFIO_DEVICE_SET_IRQS ioctl
Realize the VFIO_DEVICE_SET_IRQS ioctl to set an eventfd file descriptor
to be used by the vfio_ap device driver to signal a device request to
userspace.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20230530223538.279198-3-akrowiak@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-06 13:42:07 +02:00
Tony Krowiak
6afc770048 s390/vfio-ap: realize the VFIO_DEVICE_GET_IRQ_INFO ioctl
Realize the VFIO_DEVICE_GET_IRQ_INFO ioctl to retrieve the information for
the VFIO device request IRQ.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20230530223538.279198-2-akrowiak@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-06 13:42:06 +02:00
Harald Freudenberger
9e436c195e s390/pkey: add support for ecc clear key
Add support for a new 'non CCA clear key token' with these
ECC clear keys supported:

- ECC P256
- ECC P384
- ECC P521
- ECC ED25519
- ECC ED448

This makes it possible to derive a protected key from this
ECC clear key input via PKEY_KBLOB2PROTK3 ioctl. As of now
the only way to derive protected keys from these clear key
tokens is via PCKMO instruction. For AES keys an alternate
path via creating a secure key from the clear key and then
derive a protected key from the secure key exists. This
alternate path is not implemented for ECC keys as it would
require to rearrange and maybe recalculate the clear key
material for input to derive an CCA or EP11 ECC secure key.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-01 17:10:21 +02:00
Harald Freudenberger
f370f45c64 s390/pkey: do not use struct pkey_protkey
This is an internal rework of the pkey code to not use the
struct pkey_protkey internal any more. This struct has a hard
coded protected key buffer with MAXPROTKEYSIZE = 64 bytes.
However, with support for ECC protected key, this limit is
too short and thus this patch reworks all the internal code
to use the triple u8 *protkey, u32 protkeylen, u32 protkeytype
instead. So the ioctl which still has to deal with this struct
coming from userspace and/or provided to userspace invoke all
the internal functions now with the triple instead of passing
a pointer to struct pkey_protkey.

Also the struct pkey_clrkey has been internally replaced in
a similar way. This struct also has a hard coded clear key
buffer of MAXCLRKEYSIZE = 32 bytes and thus is not usable with
e.g. ECC clear key material.

This is a transparent rework for userspace applications using
the pkey API. The internal kernel API used by the PAES crypto
ciphers has been adapted to this change to make it possible
to provide ECC protected keys via this interface in the future.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-01 17:10:21 +02:00
Harald Freudenberger
46a29b039e s390/pkey: introduce reverse x-mas trees
This patch introduces reverse x-mas trees for all
local variables on all the functions in pkey.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-01 17:10:20 +02:00
Holger Dengler
844cf829e5 s390/pkey: zeroize key blobs
Key blobs for the IOCTLs PKEY_KBLOB2PROTK[23] may contain clear key
material. Zeroize the copies of these keys in kernel memory after
creating the protected key.

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-05-15 14:20:13 +02:00
Linus Torvalds
10de638d8e s390 updates for the 6.4 merge window
- Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%
 
 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS
 
 - Improve KASLR to also randomize module and special amode31 code
   base load addresses
 
 - Rework decompressor memory tracking to support memory holes and improve
   error handling
 
 - Add support for protected virtualization AP binding
 
 - Add support for set_direct_map() calls
 
 - Implement set_memory_rox() and noexec module_alloc()
 
 - Remove obsolete overriding of mem*() functions for KASAN
 
 - Rework kexec/kdump to avoid using nodat_stack to call purgatory
 
 - Convert the rest of the s390 code to use flexible-array member instead
   of a zero-length array
 
 - Clean up uaccess inline asm
 
 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE
 
 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B
 
 - Resolve last_break in userspace fault reports
 
 - Simplify one-level sysctl registration
 
 - Clean up branch prediction handling
 
 - Rework CPU counter facility to retrieve available counter sets just
   once
 
 - Other various small fixes and improvements all over the code
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmRM8pwACgkQjYWKoQLX
 FBjV1AgAlvAhu1XkwOdwqdT4GqE8pcN4XXzydog1MYihrSO2PdgWAxpEW7o2QURN
 W+3xa6RIqt7nX2YBiwTanMZ12TYaFY7noGl3eUpD/NhueprweVirVl7VZUEuRoW/
 j0mbx77xsVzLfuDFxkpVwE6/j+tTO78kLyjUHwcN9rFVUaL7/orJneDJf+V8fZG0
 sHLOv0aljF7Jr2IIkw82lCmW/vdk7k0dACWMXK2kj1H3dIK34B9X4AdKDDf/WKXk
 /OSElBeZ93tSGEfNDRIda6iR52xocROaRnQAaDtargKFl9VO0/dN9ADxO+SLNHjN
 pFE/9VD6xT/xo4IuZZh/Z3TcYfiLvA==
 =Geqx
 -----END PGP SIGNATURE-----

Merge tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for stackleak feature. Also allow specifying
   architecture-specific stackleak poison function to enable faster
   implementation. On s390, the mvc-based implementation helps decrease
   typical overhead from a factor of 3 to just 25%

 - Convert all assembler files to use SYM* style macros, deprecating the
   ENTRY() macro and other annotations. Select ARCH_USE_SYM_ANNOTATIONS

 - Improve KASLR to also randomize module and special amode31 code base
   load addresses

 - Rework decompressor memory tracking to support memory holes and
   improve error handling

 - Add support for protected virtualization AP binding

 - Add support for set_direct_map() calls

 - Implement set_memory_rox() and noexec module_alloc()

 - Remove obsolete overriding of mem*() functions for KASAN

 - Rework kexec/kdump to avoid using nodat_stack to call purgatory

 - Convert the rest of the s390 code to use flexible-array member
   instead of a zero-length array

 - Clean up uaccess inline asm

 - Enable ARCH_HAS_MEMBARRIER_SYNC_CORE

 - Convert to using CONFIG_FUNCTION_ALIGNMENT and enable
   DEBUG_FORCE_FUNCTION_ALIGN_64B

 - Resolve last_break in userspace fault reports

 - Simplify one-level sysctl registration

 - Clean up branch prediction handling

 - Rework CPU counter facility to retrieve available counter sets just
   once

 - Other various small fixes and improvements all over the code

* tag 's390-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (118 commits)
  s390/stackleak: provide fast __stackleak_poison() implementation
  stackleak: allow to specify arch specific stackleak poison function
  s390: select ARCH_USE_SYM_ANNOTATIONS
  s390/mm: use VM_FLUSH_RESET_PERMS in module_alloc()
  s390: wire up memfd_secret system call
  s390/mm: enable ARCH_HAS_SET_DIRECT_MAP
  s390/mm: use BIT macro to generate SET_MEMORY bit masks
  s390/relocate_kernel: adjust indentation
  s390/relocate_kernel: use SYM* macros instead of ENTRY(), etc.
  s390/entry: use SYM* macros instead of ENTRY(), etc.
  s390/purgatory: use SYM* macros instead of ENTRY(), etc.
  s390/kprobes: use SYM* macros instead of ENTRY(), etc.
  s390/reipl: use SYM* macros instead of ENTRY(), etc.
  s390/head64: use SYM* macros instead of ENTRY(), etc.
  s390/earlypgm: use SYM* macros instead of ENTRY(), etc.
  s390/mcount: use SYM* macros instead of ENTRY(), etc.
  s390/crc32le: use SYM* macros instead of ENTRY(), etc.
  s390/crc32be: use SYM* macros instead of ENTRY(), etc.
  s390/crypto,chacha: use SYM* macros instead of ENTRY(), etc.
  s390/amode31: use SYM* macros instead of ENTRY(), etc.
  ...
2023-04-30 11:43:31 -07:00
Harald Freudenberger
3b42877cd5 s390/zcrypt: rework arrays with length zero occurrences
Review and rework all the zero length array occurrences
within structs to flexible array fields or comment if
not used at all. However, some struct fields are there
for documentation purpose or to have correct sizeof()
evaluation of a struct and thus should not get deleted.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-19 16:47:31 +02:00
Harald Freudenberger
0f2d4fee91 s390/zcrypt: simplify prep of CCA key token
The preparation of the key data struct for a CCA RSA ME
operation had some improvement to skip leading zeros
in the key's exponent. However, all supported CCA cards
nowadays support leading zeros in key tokens.

So for simplifying the CCA key preparing code, this
patch simply removes this optimization code.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:34:55 +02:00
Harald Freudenberger
bd922f33d4 s390/zcrypt: remove unused ancient padding code
There was some ancient code which padded the results of
a clear key ME or CRT operation with some PKCS 1.2 header.
According to the comment this was only needed by crypto
cards older than the CEX2. These cards are not supported
any more and so this patch removes this obscure result
padding code.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-04-04 18:34:55 +02:00
Greg Kroah-Hartman
cd8fe5b6db Merge 6.3-rc5 into driver-core-next
We need the fixes in here for testing, as well as the driver core
changes for documentation updates to build on.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-03 09:33:30 +02:00
Greg Kroah-Hartman
75a2d4226b driver core: class: mark the struct class for sysfs callbacks as constant
struct class should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
class to be moved to read-only memory.

While we are touching all class sysfs callbacks also mark the attribute
as constant as it can not be modified.  The bonding code still uses this
structure so it can not be removed from the function callbacks.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Bartosz Golaszewski <brgl@bgdev.pl>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steve French <sfrench@samba.org>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: linux-cifs@vger.kernel.org
Cc: linux-gpio@vger.kernel.org
Cc: linux-mtd@lists.infradead.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: netdev@vger.kernel.org
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Link: https://lore.kernel.org/r/20230325084537.3622280-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-29 07:54:58 +02:00
Tony Krowiak
8f8cf76758 s390/vfio-ap: fix memory leak in vfio_ap device driver
The device release callback function invoked to release the matrix device
uses the dev_get_drvdata(device *dev) function to retrieve the
pointer to the vfio_matrix_dev object in order to free its storage. The
problem is, this object is not stored as drvdata with the device; since the
kfree function will accept a NULL pointer, the memory for the
vfio_matrix_dev object is never freed.

Since the device being released is contained within the vfio_matrix_dev
object, the container_of macro will be used to retrieve its pointer.

Fixes: 1fde573413 ("s390: vfio-ap: base implementation of VFIO AP device driver")
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230320150447.34557-1-akrowiak@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-27 17:23:08 +02:00
Lizhe
85206bf953 s390/vfio-ap: remove redundant driver match function
If there is no driver match function, the driver core assumes that each
candidate pair (driver, device) matches, see driver_match_device().

Drop the matrix bus's match function that always returned 1 and so
implements the same behaviour as when there is no match function

Signed-off-by: Lizhe <sensor1010@163.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/20230319041941.259830-1-sensor1010@163.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-27 17:19:52 +02:00
Greg Kroah-Hartman
75cff725d9 driver core: bus: mark the struct bus_type for sysfs callbacks as constant
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.

Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>	# pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-23 13:20:40 +01:00
Harald Freudenberger
038c5bedbc s390/ap: add ap status asynch error support
Review and extend the low level AP code to be able to
deal with asynchronous reported errors on APQNs.

The hypervisor and the SE guest may be confronted with
an asynchronously reported error at return of an AP
instruction. So all places where AP instructions are
called need review and may eventually need extensions.
However, not all places need rework. As together with
the AP status and the enabled asynch bit there is always
a response code set. The asynch error reporting comes
with new response codes which may be simple handled in
the default case of a switch statement.

The idea behind this patch is to report asynch errors
as -EPERM (read this as "Operation not permitted") which
reflects the fact that only a rapq (with F bit enabled)
is a valid AP instruction when an asynch error is flagged.

The AP queue state machine functions return
AP_SM_WAIT_NONE when a asynch error is detected to reflect
the fact, that the state machine can't do anything with
such an error as long as the queue is reset.

Unfortunately the ap bus scan function needed some
update as the ap_queue_info() now needs to return
3 states: 1 if an APQN exists and info is available,
-1 if it is assumed an APQN does not exist and the new
return value 0 without any info values filled. This 0
returncode is handled as "there is an APQN but we currently
don't know any more hw info about this, so please use
your previous info and try again later".

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
2d72eaf036 s390/ap: implement SE AP bind, unbind and associate
Implementation of the new functions for SE AP support:
bind, unbind and associate. There are two new sysfs
attributes for this:

/sys/devices/ap/cardxx/xx.yyyy/se_bind
/sys/devices/ap/cardxx/xx.yyyy/se_associate

Writing a 1 into the se_bind attribute triggers the
SE AP bind for this AP queue, writing a 0 into does
an unbind - that's a reset (RAPQ) with the F bit enabled.

The se_associate attribute needs an integer value in
range 0...2^16-1 written in. This is the index into a
secrets table feed into the ultravisor. For more details
please see the Architecture documents.

These both new ap queue attributes are only visible
inside a SE guest with SB (Secure Binding) available.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
263c8454db s390/ap: introduce low frequency polling possibility
For some events the ap bus needs to poll. For example
when an AP queue is reset until the reset is through.
Also when no interrupt support is available (e.g. zVM)
there is a need to poll until all requests have been
processed and all replies have been delivered.

Polling is done with a high resolution timer by default
run with a rate of 4kHz (LPAR) or 666Hz (zVM guest).

For some events (wait for reset complete, wait for irq
enabled complete) this is a much too high poll rate
which triggers a lot of TAPQ invocations.

This patch introduces the possibility for the state
machine functions to return a new wait enum
AP_SM_WAIT_LOW_TIMEOUT which gives a hint to the
ap_wait() function to eventually set up the timer
with a more relaxed timeout value of 25Hz.

This patch also includes a slight rework of the sysfs
functions parsing the timer related stuff: Use of
kstrtobool and kstrtoul instead of sscanf.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
4bdf3c3956 s390/ap: provide F bit parameter for ap_rapq() and ap_zapq()
Extent the ap inline functions ap_rapq() (calls PQAP(RAPQ))
and ap_zapq() (calls PQAP(ZAPQ)) with a new parameter to
enable the new architectured F bit which forces an
unassociate and/or unbind on a secure execution associated
and/or bound queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:49 +01:00
Harald Freudenberger
088174960e s390/ap: filter ap card functions, new queue functions attribute
With SE SB (Secure Binding) some currently unused and thus always
zero bits in the TAPQ GR2 result are now used to show the binding
state of a queue. So to check if a card has changed the comparing
base is exactly this GR2 value shown as 'ap_function' in sysfs
(/sys/devices/ap/cardxx/ap_functions). Now there is some queue
specific info in this info and so a new mask TAPQ_CARD_FUNC_CMP_MASK
is used to filter out only the relevant bits for card compare.

For the same reason now the function bits (including exactly this
bind/associate information) need to be exposed to user space now.
So tools like lszcrypt can evaluate binding/association state on a
queue base. So here comes a new sysfs attribute

  /sys/devices/ap/cardxx/xx.yyyy/ap_functions

This sysfs attribute is similar to the already existing
ap_functions attribute at ap card level. It shows the
upper 32 bits of GR2 from an invocation of TAPQ for this
AP queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
211c06d845 s390/ap: make tapq gr2 response a struct
This patch introduces a new struct ap_tapq_gr2 which covers
the response in GR2 on TAPQ invocation. This makes it much
easier and less error-prone for the calling functions to
access the right field without shifting and masking.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
d7b1813af6 s390/ap: introduce new AP bus sysfs attribute features
Introduce a new AP bus sysfs attribute /sys/bus/ap/features
which shows the features from the QCI information.
Currently these feature bits are evaluated:

- QCI S bit is shown as 'APSC'
- QCI N bit is shown as 'APXA'
- QCI C bit is shown as 'QACT'
- QCI R bit is shown as 'RC8A'
- QCI B bit is shown as 'APSB'

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
f604704021 s390/ap: exploit new B bit from QCI config info
This patch introduces an update to the ap_config_info
struct which is filled with the QCI subfunction. There
is a new bit apsb (short 'B') showing if the AP secure
bind facility is available. The patch also includes a
simple function ap_sb_available() wrapping this bit test.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
964d581daf s390/zcrypt: replace scnprintf with sysfs_emit
Replace scnprintf() with sysfs_emit() and friends
where possible.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:48 +01:00
Harald Freudenberger
8794c59613 s390/zcrypt: rework length information for dqap
The inline ap_dqap function does not return the number of
bytes actually written into the message buffer. The calling
code inspects the AP message header to figure out what kind
of AP message has been received and pulls the length
information from this header. This processing may not work
correctly in cases where only a fragment of the reply is
received.

With this patch the ap_dqap inline function now returns
the number of actually written bytes in the *length parameter.
So the calling function has a chance to compare the number of
received bytes against what the AP message header length
field states. This is especially useful in cases where a
message could only get partially received.

The low level reply processing functions needed some rework
to be able to catch this new length information and compare
it the right way. The rework also deals with some situations
where until now the reply length was not correctly calculated
and/or set.

All this has been heavily tested as the modifications on
the reply length information may affect crypto load.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:47 +01:00
Harald Freudenberger
003d248fee s390/zcrypt: make psmid unsigned long instead of long long
Since s390 kernel build does not support 32 bit build any
more there is no difference between long and long long.
So this patch reworks all occurrences of psmid (a 64 bit
value) to use unsigned long now.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20 11:12:47 +01:00
Greg Kroah-Hartman
1aaba11da9 driver core: class: remove module * from class_create()
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something.  So just remove it and fix up all callers of the function in
the kernel tree at the same time.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17 15:16:33 +01:00
Yu Zhe
72c2112ce9 s390/zcrypt: remove unnecessary (void *) conversions
Pointer variables of void * type do not require type cast.

Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20230303052155.21072-1-yuzhe@nfschina.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-13 09:16:42 +01:00
Linus Torvalds
0bdf4a8bf0 s390 updates for 6.3 merge window part 2
- Add empty command line parameter handling stubs to kernel for all command
   line parameters which are handled in the decompressor. This avoids
   invalid "Unknown kernel command line parameters" messages from the
   kernel, and also avoids that these will be incorrectly passed to user
   space. This caused already confusion, therefore add the empty stubs
 
 - Add missing phys_to_virt() handling to machine check handler
 
 - Introduce and use a union to be used for zcrypt inline assemblies. This
   makes sure that only a register wide member of the union is passed as
   input and output parameter to inline assemblies, while usual C code uses
   other members of the union to access bit fields of it
 
 - Add and use a READ_ONCE_ALIGNED_128() macro, which can be used to
   atomically read a 128-bit value from memory. This replaces the (mis-)use
   of the 128-bit cmpxchg operation to do the same in cpum_sf code.
   Currently gcc does not generate the used lpq instruction if __READ_ONCE()
   is used for aligned 128-bit accesses, therefore use this s390 specific
   helper
 
 - Simplify machine check handler code if a task needs to be killed because
   of e.g. register corruption due to a machine malfunction
 
 - Perform CPU reset to clear pending interrupts and TLB entries on an
   already stopped target CPU before delegating work to it
 
 - Generate arch/s390/boot/vmlinux.map link map for the decompressor, when
   CONFIG_VMLINUX_MAP is enabled for debugging purposes
 
 - Fix segment type handling for dcssblk devices. It incorrectly always
   returned type "READ/WRITE" even for read-only segements, which can result
   in a kernel panic if somebody tries to write to a read-only device
 
 - Sort config S390 select list again
 
 - Fix two kprobe reenter bugs revealed by a recently added kprobe kunit
   test
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmQB5LgACgkQIg7DeRsp
 bsLvPRAAojiU7uZTDKL58Ta67D7gyWWZa6ClI7MHrtEwEM70cOZHmOYToFuyapbT
 Af+PBg8YcqDDHxGf+HNuxnYRI4y2hGJUyjURZmGuCzPPJ5IGvy+8/X0d0wsvclqC
 fWeTtVvZIxdu5NdBQTs1iMCCxPXg9OfGDVCU+P/pnrlKskHorA6a2ZWGdkpae3fj
 kd1pgLLcZq0Jf8+Q6bSLBZjm45R/Q+qEpd8eIOncP7xrQTpLPkLxRSrnZLPtYdLu
 7uUDtzEC63x7/0Weri71ZQqbc22CzHNiF6FZj5sm78KsWmparmUUsns/U5CrEO4J
 e85/kPXjgfvwMZdQCXSPHZJ8CGTRzN6Zl/Lym9W5+9X6cgb1WNVNCATGZIG3HueA
 MYnXmLkmNmaEgsckcYbCVPR9SjIwVibIL+/32hH63oUnc8WnrbVlai/G57j0dRUg
 +kxVvwbxaDyesRbF7XKe4PssYZJxpO+QFIhnvo6EEghUuc32o+WVNqjoU04/VBaa
 i2FtbARGDWWs5+EwS9td/xHiFPQHpFCJHrMbxacEQmtDbOTOhoLbxNlP17YfQ2oD
 ch0QCZ1w0VwE9ehMRLmZQCem51n7aPt11tKOEFKsck4Mb1dXVLW9LWwNpjIBXrIo
 2g4U4Ytj2I9HbxWMJmKmLsP4NHn2oG3A+uGUkSWf/wdDTfylB30=
 =zZId
 -----END PGP SIGNATURE-----

Merge tag 's390-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:

 - Add empty command line parameter handling stubs to kernel for all
   command line parameters which are handled in the decompressor. This
   avoids invalid "Unknown kernel command line parameters" messages from
   the kernel, and also avoids that these will be incorrectly passed to
   user space. This caused already confusion, therefore add the empty
   stubs

 - Add missing phys_to_virt() handling to machine check handler

 - Introduce and use a union to be used for zcrypt inline assemblies.
   This makes sure that only a register wide member of the union is
   passed as input and output parameter to inline assemblies, while
   usual C code uses other members of the union to access bit fields of
   it

 - Add and use a READ_ONCE_ALIGNED_128() macro, which can be used to
   atomically read a 128-bit value from memory. This replaces the
   (mis-)use of the 128-bit cmpxchg operation to do the same in cpum_sf
   code. Currently gcc does not generate the used lpq instruction if
   __READ_ONCE() is used for aligned 128-bit accesses, therefore use
   this s390 specific helper

 - Simplify machine check handler code if a task needs to be killed
   because of e.g. register corruption due to a machine malfunction

 - Perform CPU reset to clear pending interrupts and TLB entries on an
   already stopped target CPU before delegating work to it

 - Generate arch/s390/boot/vmlinux.map link map for the decompressor,
   when CONFIG_VMLINUX_MAP is enabled for debugging purposes

 - Fix segment type handling for dcssblk devices. It incorrectly always
   returned type "READ/WRITE" even for read-only segements, which can
   result in a kernel panic if somebody tries to write to a read-only
   device

 - Sort config S390 select list again

 - Fix two kprobe reenter bugs revealed by a recently added kprobe kunit
   test

* tag 's390-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/kprobes: fix current_kprobe never cleared after kprobes reenter
  s390/kprobes: fix irq mask clobbering on kprobe reenter from post_handler
  s390/Kconfig: sort config S390 select list again
  s390/extmem: return correct segment type in __segment_load()
  s390/decompressor: add link map saving
  s390/smp: perform cpu reset before delegating work to target cpu
  s390/mcck: cleanup user process termination path
  s390/cpum_sf: use READ_ONCE_ALIGNED_128() instead of 128-bit cmpxchg
  s390/rwonce: add READ_ONCE_ALIGNED_128() macro
  s390/ap,zcrypt,vfio: introduce and use ap_queue_status_reg union
  s390/nmi: fix virtual-physical address confusion
  s390/setup: do not complain about parameters handled in decompressor
2023-03-03 09:38:01 -08:00
Harald Freudenberger
ebf95e8846 s390/ap,zcrypt,vfio: introduce and use ap_queue_status_reg union
Introduce a new ap queue status register wrapper union to access register
wide values. So the inline assembler only sees register wide values but the
surrounding code may use a more structured view of the same value and a
reader of the code (and the compiler) gets a clear understanding about the
mapping between fields and register values.

All the changes to access the ap queue status are local to the inline
functions within ap.h. However, the struct ap_qirq_ctrl has been replaces
by a union for same reason and this needed slight adaptions in the calling
code.

Suggested-by: Halil Pasic <pasic@linux.ibm.com>
Suggested-by: Andreas Arnez <arnez@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Holger Dengler <dengler@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-27 15:29:36 +01:00
Linus Torvalds
a93e884edf Driver core changes for 6.3-rc1
Here is the large set of driver core changes for 6.3-rc1.
 
 There's a lot of changes this development cycle, most of the work falls
 into two different categories:
   - fw_devlink fixes and updates.  This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.
   - driver core changes to work to make struct bus_type able to be moved
     into read-only memory (i.e. const)  The recent work with Rust has
     pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only making
     things safer overall.  This is the contuation of that work (started
     last release with kobject changes) in moving struct bus_type to be
     constant.  We didn't quite make it for this release, but the
     remaining patches will be finished up for the release after this
     one, but the groundwork has been laid for this effort.
 
 Other than that we have in here:
   - debugfs memory leak fixes in some subsystems
   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.
   - cacheinfo rework and fixes
   - Other tiny fixes, full details are in the shortlog
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
 6oeFOjD3JDju3cQsfGgd
 =Su6W
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the large set of driver core changes for 6.3-rc1.

  There's a lot of changes this development cycle, most of the work
  falls into two different categories:

   - fw_devlink fixes and updates. This has gone through numerous review
     cycles and lots of review and testing by lots of different devices.
     Hopefully all should be good now, and Saravana will be keeping a
     watch for any potential regression on odd embedded systems.

   - driver core changes to work to make struct bus_type able to be
     moved into read-only memory (i.e. const) The recent work with Rust
     has pointed out a number of areas in the driver core where we are
     passing around and working with structures that really do not have
     to be dynamic at all, and they should be able to be read-only
     making things safer overall. This is the contuation of that work
     (started last release with kobject changes) in moving struct
     bus_type to be constant. We didn't quite make it for this release,
     but the remaining patches will be finished up for the release after
     this one, but the groundwork has been laid for this effort.

  Other than that we have in here:

   - debugfs memory leak fixes in some subsystems

   - error path cleanups and fixes for some never-able-to-be-hit
     codepaths.

   - cacheinfo rework and fixes

   - Other tiny fixes, full details are in the shortlog

  All of these have been in linux-next for a while with no reported
  problems"

[ Geert Uytterhoeven points out that that last sentence isn't true, and
  that there's a pending report that has a fix that is queued up - Linus ]

* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
  debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
  OPP: fix error checking in opp_migrate_dentry()
  debugfs: update comment of debugfs_rename()
  i3c: fix device.h kernel-doc warnings
  dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
  driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
  Revert "driver core: add error handling for devtmpfs_create_node()"
  Revert "devtmpfs: add debug info to handle()"
  Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
  driver core: cpu: don't hand-override the uevent bus_type callback.
  devtmpfs: remove return value of devtmpfs_delete_node()
  devtmpfs: add debug info to handle()
  driver core: add error handling for devtmpfs_create_node()
  driver core: bus: update my copyright notice
  driver core: bus: add bus_get_dev_root() function
  driver core: bus: constify bus_unregister()
  driver core: bus: constify some internal functions
  driver core: bus: constify bus_get_kset()
  driver core: bus: constify bus_register/unregister_notifier()
  driver core: remove private pointer from struct bus_type
  ...
2023-02-24 12:58:55 -08:00
Halil Pasic
a64a6d2387 s390: vfio-ap: tighten the NIB validity check
The NIB is architecturally invalid if the address designates a
storage location that is not installed or if it is zero.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: ec89b55e3b ("s390: ap: implement PAPQ AQIC interception in kernel")
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-10 10:55:29 +01:00
Greg Kroah-Hartman
2a81ada32f driver core: make struct bus_type.uevent() take a const *
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.

Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-27 13:45:52 +01:00
Tony Krowiak
7cb7636a1a s390/vfio_ap: increase max wait time for reset verification
Increase the maximum time to wait for verification of a queue reset
operation to 200ms.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-7-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:36 +01:00
Tony Krowiak
51d4d98770 s390/vfio_ap: fix handling of error response codes
Some response codes returned from the queue reset function are not being
handled correctly; this patch fixes them:

1. Response code 3, AP queue deconfigured: Deconfiguring an AP adapter
   resets all of its queues, so this is handled by indicating the reset
   verification completed successfully.

2. For all response codes other than 0 (normal reset completion), 2
   (queue reset in progress) and 3 (AP deconfigured), the -EIO error will
   be returned from the vfio_ap_mdev_reset_queue() function. In all cases,
   all fields of the status word other than the response code will be
   set to zero, so it makes no sense to check status bits.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-6-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:36 +01:00
Tony Krowiak
5a42b348ad s390/vfio_ap: verify ZAPQ completion after return of response code zero
Verification that the asynchronous ZAPQ function has completed only needs
to be done when the response code indicates the function was successfully
initiated; so, let's call the apq_reset_check function immediately after
the response code zero is returned from the ZAPQ.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-5-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:36 +01:00
Tony Krowiak
3ba4176810 s390/vfio_ap: use TAPQ to verify reset in progress completes
To eliminate the repeated calls to the PQAP(ZAPQ) function to verify that
a reset in progress completed successfully and ensure that error response
codes get appropriately logged, let's call the apq_reset_check() function
when the ZAPQ response code indicates that a reset that is already in
progress.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-4-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:36 +01:00
Tony Krowiak
0daf9878a7 s390/vfio_ap: check TAPQ response code when waiting for queue reset
The vfio_ap_mdev_reset_queue() function does not check the status
response code returned form the PQAP(TAPQ) function when verifying the
queue's status; consequently, there is no way of knowing whether
verification failed because the wait time was exceeded, or because the
PQAP(TAPQ) failed.

This patch adds a function to check the status response code from the
PQAP(TAPQ) instruction and logs an appropriate message if it fails.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-3-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:36 +01:00
Tony Krowiak
62414d901c s390/vfio-ap: verify reset complete in separate function
The vfio_ap_mdev_reset_queue() function contains a loop to verify that the
reset successfully completes within 40ms. This patch moves that loop into
a separate function.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Link: https://lore.kernel.org/r/20230118203111.529766-2-akrowiak@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-22 18:42:35 +01:00
Christophe JAILLET
08866d34c7 s390/vfio-ap: fix an error handling path in vfio_ap_mdev_probe_queue()
The commit in Fixes: has switch the order of a sysfs_create_group() and a
kzalloc().

It correctly removed the now useless kfree() but forgot to add a
sysfs_remove_group() in case of (unlikely) memory allocation failure.

Add it now.

Fixes: 260f3ea141 ("s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/r/d0c0a35eec4fa87cb7f3910d8ac4dc0f7dc9008a.1659283738.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-13 14:15:07 +01:00
Xu Panda
a43e3115fb s390/zcrypt: use strscpy() to instead of strncpy()
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL-terminated strings.

Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Link: https://lore.kernel.org/r/202301052024349365834@zte.com.cn
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09 14:34:09 +01:00
Linus Torvalds
785d21ba2f VFIO updates for v6.2-rc1
- Replace deprecated git://github.com link in MAINTAINERS. (Palmer Dabbelt)
 
  - Simplify vfio/mlx5 with module_pci_driver() helper. (Shang XiaoJing)
 
  - Drop unnecessary buffer from ACPI call. (Rafael Mendonca)
 
  - Correct latent missing include issue in iova-bitmap and fix support
    for unaligned bitmaps.  Follow-up with better fix through refactor.
    (Joao Martins)
 
  - Rework ccw mdev driver to split private data from parent structure,
    better aligning with the mdev lifecycle and allowing us to remove
    a temporary workaround. (Eric Farman)
 
  - Add an interface to get an estimated migration data size for a device,
    allowing userspace to make informed decisions, ex. more accurately
    predicting VM downtime. (Yishai Hadas)
 
  - Fix minor typo in vfio/mlx5 array declaration. (Yishai Hadas)
 
  - Simplify module and Kconfig through consolidating SPAPR/EEH code and
    config options and folding virqfd module into main vfio module.
    (Jason Gunthorpe)
 
  - Fix error path from device_register() across all vfio mdev and sample
    drivers. (Alex Williamson)
 
  - Define migration pre-copy interface and implement for vfio/mlx5
    devices, allowing portions of the device state to be saved while the
    device continues operation, towards reducing the stop-copy state
    size. (Jason Gunthorpe, Yishai Hadas, Shay Drory)
 
  - Implement pre-copy for hisi_acc devices. (Shameer Kolothum)
 
  - Fixes to mdpy mdev driver remove path and error path on probe.
    (Shang XiaoJing)
 
  - vfio/mlx5 fixes for incorrect return after copy_to_user() fault and
    incorrect buffer freeing. (Dan Carpenter)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmObfPgbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiDogP/i9GuBKposvZpnfxXWwo
 oNpKBZSOVMW8wgavNEuryMb+9WoouIghce8XU49MmONoP26kIh5TA14Zpi3XWkLK
 K+NlpwicESvLeZVHU7f3R8meVqmPtlxIi59jE+CfEHB8BW2HIAsEdwdhkxMwus9C
 nuiiK/2YYyQWOXYc4LAIkspMzjtGPy6Im5P6AED+dI+TFCEqJAM5qgOLJZFlk4a/
 WwZY2xjVKOl6xf5VZXGw+v7fDgz2Ju+j4Bm3X5lx1HgiDrEH83MjXY5h67neAIVb
 bXrfNLN++MiuO5niGTFMbUjGVUIFxsfmJzBnL9QrLsuj0JrGEKsu/1JEO78g0Km0
 ZCChoJ6UyUOgxt6evEymUAZAAkbcKaaht2gdbAXW71tv9p1TripAbBKwVeah1bQp
 SiHPqy9InKJlhaf+GbXL9eux1WVMfQ6FZccU16bNt7VaV2I8js85z/2gqVD0a5Mw
 +gnwp5XMUFWNKlJrnc7uVCD0bDExwQhr75OP4rWjMNvvLi9hPXJ2cI2Sg+9OLzQw
 vm/I+Df+FfXCuGAgX4Lxq76pqWlYGJH0Qxc14Ds6YoXqygBPz9yvTtuBv8mTHJzE
 KdAl/6DmZZxZ/JFD9lPF80KRiAsJ6iNf6tPTWES7hfDBfIdgQ/DZbXridLWJPNoi
 xLfaW19yrLTXWKSmR7G2Lsz4
 =q9xs
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Replace deprecated git://github.com link in MAINTAINERS (Palmer
   Dabbelt)

 - Simplify vfio/mlx5 with module_pci_driver() helper (Shang XiaoJing)

 - Drop unnecessary buffer from ACPI call (Rafael Mendonca)

 - Correct latent missing include issue in iova-bitmap and fix support
   for unaligned bitmaps. Follow-up with better fix through refactor
   (Joao Martins)

 - Rework ccw mdev driver to split private data from parent structure,
   better aligning with the mdev lifecycle and allowing us to remove a
   temporary workaround (Eric Farman)

 - Add an interface to get an estimated migration data size for a
   device, allowing userspace to make informed decisions, ex. more
   accurately predicting VM downtime (Yishai Hadas)

 - Fix minor typo in vfio/mlx5 array declaration (Yishai Hadas)

 - Simplify module and Kconfig through consolidating SPAPR/EEH code and
   config options and folding virqfd module into main vfio module (Jason
   Gunthorpe)

 - Fix error path from device_register() across all vfio mdev and sample
   drivers (Alex Williamson)

 - Define migration pre-copy interface and implement for vfio/mlx5
   devices, allowing portions of the device state to be saved while the
   device continues operation, towards reducing the stop-copy state size
   (Jason Gunthorpe, Yishai Hadas, Shay Drory)

 - Implement pre-copy for hisi_acc devices (Shameer Kolothum)

 - Fixes to mdpy mdev driver remove path and error path on probe (Shang
   XiaoJing)

 - vfio/mlx5 fixes for incorrect return after copy_to_user() fault and
   incorrect buffer freeing (Dan Carpenter)

* tag 'vfio-v6.2-rc1' of https://github.com/awilliam/linux-vfio: (42 commits)
  vfio/mlx5: error pointer dereference in error handling
  vfio/mlx5: fix error code in mlx5vf_precopy_ioctl()
  samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe()
  hisi_acc_vfio_pci: Enable PRE_COPY flag
  hisi_acc_vfio_pci: Move the dev compatibility tests for early check
  hisi_acc_vfio_pci: Introduce support for PRE_COPY state transitions
  hisi_acc_vfio_pci: Add support for precopy IOCTL
  vfio/mlx5: Enable MIGRATION_PRE_COPY flag
  vfio/mlx5: Fallback to STOP_COPY upon specific PRE_COPY error
  vfio/mlx5: Introduce multiple loads
  vfio/mlx5: Consider temporary end of stream as part of PRE_COPY
  vfio/mlx5: Introduce vfio precopy ioctl implementation
  vfio/mlx5: Introduce SW headers for migration states
  vfio/mlx5: Introduce device transitions of PRE_COPY
  vfio/mlx5: Refactor to use queue based data chunks
  vfio/mlx5: Refactor migration file state
  vfio/mlx5: Refactor MKEY usage
  vfio/mlx5: Refactor PD usage
  vfio/mlx5: Enforce a single SAVE command at a time
  vfio: Extend the device migration protocol with PRE_COPY
  ...
2022-12-15 13:12:15 -08:00
Linus Torvalds
8fa590bf34 ARM64:
* Enable the per-vcpu dirty-ring tracking mechanism, together with an
   option to keep the good old dirty log around for pages that are
   dirtied by something other than a vcpu.
 
 * Switch to the relaxed parallel fault handling, using RCU to delay
   page table reclaim and giving better performance under load.
 
 * Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping option,
   which multi-process VMMs such as crosvm rely on (see merge commit 382b5b87a9:
   "Fix a number of issues with MTE, such as races on the tags being
   initialised vs the PG_mte_tagged flag as well as the lack of support
   for VM_SHARED when KVM is involved.  Patches from Catalin Marinas and
   Peter Collingbourne").
 
 * Merge the pKVM shadow vcpu state tracking that allows the hypervisor
   to have its own view of a vcpu, keeping that state private.
 
 * Add support for the PMUv3p5 architecture revision, bringing support
   for 64bit counters on systems that support it, and fix the
   no-quite-compliant CHAIN-ed counter support for the machines that
   actually exist out there.
 
 * Fix a handful of minor issues around 52bit VA/PA support (64kB pages
   only) as a prefix of the oncoming support for 4kB and 16kB pages.
 
 * Pick a small set of documentation and spelling fixes, because no
   good merge window would be complete without those.
 
 s390:
 
 * Second batch of the lazy destroy patches
 
 * First batch of KVM changes for kernel virtual != physical address support
 
 * Removal of a unused function
 
 x86:
 
 * Allow compiling out SMM support
 
 * Cleanup and documentation of SMM state save area format
 
 * Preserve interrupt shadow in SMM state save area
 
 * Respond to generic signals during slow page faults
 
 * Fixes and optimizations for the non-executable huge page errata fix.
 
 * Reprogram all performance counters on PMU filter change
 
 * Cleanups to Hyper-V emulation and tests
 
 * Process Hyper-V TLB flushes from a nested guest (i.e. from a L2 guest
   running on top of a L1 Hyper-V hypervisor)
 
 * Advertise several new Intel features
 
 * x86 Xen-for-KVM:
 
 ** Allow the Xen runstate information to cross a page boundary
 
 ** Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured
 
 ** Add support for 32-bit guests in SCHEDOP_poll
 
 * Notable x86 fixes and cleanups:
 
 ** One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).
 
 ** Reinstate IBPB on emulated VM-Exit that was incorrectly dropped a few
    years back when eliminating unnecessary barriers when switching between
    vmcs01 and vmcs02.
 
 ** Clean up vmread_error_trampoline() to make it more obvious that params
    must be passed on the stack, even for x86-64.
 
 ** Let userspace set all supported bits in MSR_IA32_FEAT_CTL irrespective
    of the current guest CPUID.
 
 ** Fudge around a race with TSC refinement that results in KVM incorrectly
    thinking a guest needs TSC scaling when running on a CPU with a
    constant TSC, but no hardware-enumerated TSC frequency.
 
 ** Advertise (on AMD) that the SMM_CTL MSR is not supported
 
 ** Remove unnecessary exports
 
 Generic:
 
 * Support for responding to signals during page faults; introduces
   new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks
 
 Selftests:
 
 * Fix an inverted check in the access tracking perf test, and restore
   support for asserting that there aren't too many idle pages when
   running on bare metal.
 
 * Fix build errors that occur in certain setups (unsure exactly what is
   unique about the problematic setup) due to glibc overriding
   static_assert() to a variant that requires a custom message.
 
 * Introduce actual atomics for clear/set_bit() in selftests
 
 * Add support for pinning vCPUs in dirty_log_perf_test.
 
 * Rename the so called "perf_util" framework to "memstress".
 
 * Add a lightweight psuedo RNG for guest use, and use it to randomize
   the access pattern and write vs. read percentage in the memstress tests.
 
 * Add a common ucall implementation; code dedup and pre-work for running
   SEV (and beyond) guests in selftests.
 
 * Provide a common constructor and arch hook, which will eventually be
   used by x86 to automatically select the right hypercall (AMD vs. Intel).
 
 * A bunch of added/enabled/fixed selftests for ARM64, covering memslots,
   breakpoints, stage-2 faults and access tracking.
 
 * x86-specific selftest changes:
 
 ** Clean up x86's page table management.
 
 ** Clean up and enhance the "smaller maxphyaddr" test, and add a related
    test to cover generic emulation failure.
 
 ** Clean up the nEPT support checks.
 
 ** Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.
 
 ** Fix an ordering issue in the AMX test introduced by recent conversions
    to use kvm_cpu_has(), and harden the code to guard against similar bugs
    in the future.  Anything that tiggers caching of KVM's supported CPUID,
    kvm_cpu_has() in this case, effectively hides opt-in XSAVE features if
    the caching occurs before the test opts in via prctl().
 
 Documentation:
 
 * Remove deleted ioctls from documentation
 
 * Clean up the docs for the x86 MSR filter.
 
 * Various fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmOaFrcUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPemQgAq49excg2Cc+EsHnZw3vu/QWdA0Rt
 KhL3OgKxuHNjCbD2O9n2t5di7eJOTQ7F7T0eDm3xPTr4FS8LQ2327/mQePU/H2CF
 mWOpq9RBWLzFsSTeVA2Mz9TUTkYSnDHYuRsBvHyw/n9cL76BWVzjImldFtjYjjex
 yAwl8c5itKH6bc7KO+5ydswbvBzODkeYKUSBNdbn6m0JGQST7XppNwIAJvpiHsii
 Qgpk0e4Xx9q4PXG/r5DedI6BlufBsLhv0aE9SHPzyKH3JbbUFhJYI8ZD5OhBQuYW
 MwxK2KlM5Jm5ud2NZDDlsMmmvd1lnYCFDyqNozaKEWC1Y5rq1AbMa51fXA==
 =QAYX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM64:

   - Enable the per-vcpu dirty-ring tracking mechanism, together with an
     option to keep the good old dirty log around for pages that are
     dirtied by something other than a vcpu.

   - Switch to the relaxed parallel fault handling, using RCU to delay
     page table reclaim and giving better performance under load.

   - Relax the MTE ABI, allowing a VMM to use the MAP_SHARED mapping
     option, which multi-process VMMs such as crosvm rely on (see merge
     commit 382b5b87a9: "Fix a number of issues with MTE, such as
     races on the tags being initialised vs the PG_mte_tagged flag as
     well as the lack of support for VM_SHARED when KVM is involved.
     Patches from Catalin Marinas and Peter Collingbourne").

   - Merge the pKVM shadow vcpu state tracking that allows the
     hypervisor to have its own view of a vcpu, keeping that state
     private.

   - Add support for the PMUv3p5 architecture revision, bringing support
     for 64bit counters on systems that support it, and fix the
     no-quite-compliant CHAIN-ed counter support for the machines that
     actually exist out there.

   - Fix a handful of minor issues around 52bit VA/PA support (64kB
     pages only) as a prefix of the oncoming support for 4kB and 16kB
     pages.

   - Pick a small set of documentation and spelling fixes, because no
     good merge window would be complete without those.

  s390:

   - Second batch of the lazy destroy patches

   - First batch of KVM changes for kernel virtual != physical address
     support

   - Removal of a unused function

  x86:

   - Allow compiling out SMM support

   - Cleanup and documentation of SMM state save area format

   - Preserve interrupt shadow in SMM state save area

   - Respond to generic signals during slow page faults

   - Fixes and optimizations for the non-executable huge page errata
     fix.

   - Reprogram all performance counters on PMU filter change

   - Cleanups to Hyper-V emulation and tests

   - Process Hyper-V TLB flushes from a nested guest (i.e. from a L2
     guest running on top of a L1 Hyper-V hypervisor)

   - Advertise several new Intel features

   - x86 Xen-for-KVM:

      - Allow the Xen runstate information to cross a page boundary

      - Allow XEN_RUNSTATE_UPDATE flag behaviour to be configured

      - Add support for 32-bit guests in SCHEDOP_poll

   - Notable x86 fixes and cleanups:

      - One-off fixes for various emulation flows (SGX, VMXON, NRIPS=0).

      - Reinstate IBPB on emulated VM-Exit that was incorrectly dropped
        a few years back when eliminating unnecessary barriers when
        switching between vmcs01 and vmcs02.

      - Clean up vmread_error_trampoline() to make it more obvious that
        params must be passed on the stack, even for x86-64.

      - Let userspace set all supported bits in MSR_IA32_FEAT_CTL
        irrespective of the current guest CPUID.

      - Fudge around a race with TSC refinement that results in KVM
        incorrectly thinking a guest needs TSC scaling when running on a
        CPU with a constant TSC, but no hardware-enumerated TSC
        frequency.

      - Advertise (on AMD) that the SMM_CTL MSR is not supported

      - Remove unnecessary exports

  Generic:

   - Support for responding to signals during page faults; introduces
     new FOLL_INTERRUPTIBLE flag that was reviewed by mm folks

  Selftests:

   - Fix an inverted check in the access tracking perf test, and restore
     support for asserting that there aren't too many idle pages when
     running on bare metal.

   - Fix build errors that occur in certain setups (unsure exactly what
     is unique about the problematic setup) due to glibc overriding
     static_assert() to a variant that requires a custom message.

   - Introduce actual atomics for clear/set_bit() in selftests

   - Add support for pinning vCPUs in dirty_log_perf_test.

   - Rename the so called "perf_util" framework to "memstress".

   - Add a lightweight psuedo RNG for guest use, and use it to randomize
     the access pattern and write vs. read percentage in the memstress
     tests.

   - Add a common ucall implementation; code dedup and pre-work for
     running SEV (and beyond) guests in selftests.

   - Provide a common constructor and arch hook, which will eventually
     be used by x86 to automatically select the right hypercall (AMD vs.
     Intel).

   - A bunch of added/enabled/fixed selftests for ARM64, covering
     memslots, breakpoints, stage-2 faults and access tracking.

   - x86-specific selftest changes:

      - Clean up x86's page table management.

      - Clean up and enhance the "smaller maxphyaddr" test, and add a
        related test to cover generic emulation failure.

      - Clean up the nEPT support checks.

      - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.

      - Fix an ordering issue in the AMX test introduced by recent
        conversions to use kvm_cpu_has(), and harden the code to guard
        against similar bugs in the future. Anything that tiggers
        caching of KVM's supported CPUID, kvm_cpu_has() in this case,
        effectively hides opt-in XSAVE features if the caching occurs
        before the test opts in via prctl().

  Documentation:

   - Remove deleted ioctls from documentation

   - Clean up the docs for the x86 MSR filter.

   - Various fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (361 commits)
  KVM: x86: Add proper ReST tables for userspace MSR exits/flags
  KVM: selftests: Allocate ucall pool from MEM_REGION_DATA
  KVM: arm64: selftests: Align VA space allocator with TTBR0
  KVM: arm64: Fix benign bug with incorrect use of VA_BITS
  KVM: arm64: PMU: Fix period computation for 64bit counters with 32bit overflow
  KVM: x86: Advertise that the SMM_CTL MSR is not supported
  KVM: x86: remove unnecessary exports
  KVM: selftests: Fix spelling mistake "probabalistic" -> "probabilistic"
  tools: KVM: selftests: Convert clear/set_bit() to actual atomics
  tools: Drop "atomic_" prefix from atomic test_and_set_bit()
  tools: Drop conflicting non-atomic test_and_{clear,set}_bit() helpers
  KVM: selftests: Use non-atomic clear/set bit helpers in KVM tests
  perf tools: Use dedicated non-atomic clear/set bit helpers
  tools: Take @bit as an "unsigned long" in {clear,set}_bit() helpers
  KVM: arm64: selftests: Enable single-step without a "full" ucall()
  KVM: x86: fix APICv/x2AVIC disabled when vm reboot by itself
  KVM: Remove stale comment about KVM_REQ_UNHALT
  KVM: Add missing arch for KVM_CREATE_DEVICE and KVM_{SET,GET}_DEVICE_ATTR
  KVM: Reference to kvm_userspace_memory_region in doc and comments
  KVM: Delete all references to removed KVM_SET_MEMORY_ALIAS ioctl
  ...
2022-12-15 11:12:21 -08:00
Linus Torvalds
64e7003c6b This update includes the following changes:
API:
 
 - Optimise away self-test overhead when they are disabled.
 - Support symmetric encryption via keyring keys in af_alg.
 - Flip hwrng default_quality, the default is now maximum entropy.
 
 Algorithms:
 
 - Add library version of aesgcm.
 - CFI fixes for assembly code.
 - Add arm/arm64 accelerated versions of sm3/sm4.
 
 Drivers:
 
 - Remove assumption on arm64 that kmalloc is DMA-aligned.
 - Fix selftest failures in rockchip.
 - Add support for RK3328/RK3399 in rockchip.
 - Add deflate support in qat.
 - Merge ux500 into stm32.
 - Add support for TEE for PCI ID 0x14CA in ccp.
 - Add mt7986 support in mtk.
 - Add MaxLinear platform support in inside-secure.
 - Add NPCM8XX support in npcm.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmOZhNQACgkQxycdCkmx
 i6edOQ/+IHYe2Z+fLsMGs0qgTVaEV33O0crTRl/PMkfBJai57grz6x/G9QrkwGHS
 084u4RmwhVrE7Z/pxvey48m0lHMw3H/ElLTRl5LV1zE2OtGgr4VV63wtqthu1QS1
 KblVnjb52DhFhvF1O1IrK9lxyX0lByOiARFVdyZR6+Rb66Xfq8rqk5t8U8mmTUFz
 ds9S2Un4HajgtjNEyI78DOX8o4wVST8tltQs0eVii6T9AeXgSgX37ytD7Xtg/zrz
 /p61KFgKBQkRT7EEGD6xgNrND0vNAp2w98ZTTRXTZI8+Y0aTUcTYya7cXOLBt9bQ
 rA7z9sNKvmwJijTMV6O9eqRGcYfzc2G4qfMhlQqj/P2pjLnEZXdvFNHTTbclR76h
 2UFlZXPDQVQukvnNNnB6bmIvv6DsM+jmGH0pK5BnBJXnD5SOZh1RqjJxw0Kj6QCM
 VxpKDvfStux2Guh6mz1lJna/S44qKy/sVYkWUawcmE4RF2+GfNayM1GUpEUofndE
 vz1yZdgLPETSh5QzKrjFkUAnqo/AsAdc5Qxroz9DRz1BCC0GCuIxjUG8ScTWgcth
 R/reQDczBckCNpPxrWPHHYoVXnAMwEFySfcjZyuCoMO6t6qVUvcjRShCyKwO/JPl
 9YREdRmq0swwIB9cFIrEoWrzc3wjjBtsltDFlkKsa9c92LXoW+g=
 =OpWt
 -----END PGP SIGNATURE-----

Merge tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto updates from Herbert Xu:
 "API:
   - Optimise away self-test overhead when they are disabled
   - Support symmetric encryption via keyring keys in af_alg
   - Flip hwrng default_quality, the default is now maximum entropy

  Algorithms:
   - Add library version of aesgcm
   - CFI fixes for assembly code
   - Add arm/arm64 accelerated versions of sm3/sm4

  Drivers:
   - Remove assumption on arm64 that kmalloc is DMA-aligned
   - Fix selftest failures in rockchip
   - Add support for RK3328/RK3399 in rockchip
   - Add deflate support in qat
   - Merge ux500 into stm32
   - Add support for TEE for PCI ID 0x14CA in ccp
   - Add mt7986 support in mtk
   - Add MaxLinear platform support in inside-secure
   - Add NPCM8XX support in npcm"

* tag 'v6.2-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits)
  crypto: ux500/cryp - delete driver
  crypto: stm32/cryp - enable for use with Ux500
  crypto: stm32 - enable drivers to be used on Ux500
  dt-bindings: crypto: Let STM32 define Ux500 CRYP
  hwrng: geode - Fix PCI device refcount leak
  hwrng: amd - Fix PCI device refcount leak
  crypto: qce - Set DMA alignment explicitly
  crypto: octeontx2 - Set DMA alignment explicitly
  crypto: octeontx - Set DMA alignment explicitly
  crypto: keembay - Set DMA alignment explicitly
  crypto: safexcel - Set DMA alignment explicitly
  crypto: hisilicon/hpre - Set DMA alignment explicitly
  crypto: chelsio - Set DMA alignment explicitly
  crypto: ccree - Set DMA alignment explicitly
  crypto: ccp - Set DMA alignment explicitly
  crypto: cavium - Set DMA alignment explicitly
  crypto: img-hash - Fix variable dereferenced before check 'hdev->req'
  crypto: arm64/ghash-ce - use frame_push/pop macros consistently
  crypto: arm64/crct10dif - use frame_push/pop macros consistently
  crypto: arm64/aes-modes - use frame_push/pop macros consistently
  ...
2022-12-14 12:31:09 -08:00
Alex Williamson
ce3895735c vfio/ap/ccw/samples: Fix device_register() unwind path
We always need to call put_device() if device_register() fails.
All vfio drivers calling device_register() include a similar unwind
stack via gotos, therefore split device_unregister() into its
device_del() and put_device() components in the unwind path, and
add a goto target to handle only the put_device() requirement.

Reported-by: Ruan Jinjie <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/all/20221118032827.3725190-1-ruanjinjie@huawei.com
Fixes: d61fc96f47 ("sample: vfio mdev display - host device")
Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Fixes: a5e6e6505f ("sample: vfio bochs vbe display (host device for bochs-drm)")
Fixes: 9e6f07cd1e ("vfio/ccw: create a parent struct")
Fixes: 36360658eb ("s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem")
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Jason Herne <jjherne@linux.ibm.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Link: https://lore.kernel.org/r/166999942139.645727.12439756512449846442.stgit@omen
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-12-05 12:37:54 -07:00
Jason Gunthorpe
90337f526c Merge tag 'v6.1-rc7' into iommufd.git for-next
Resolve conflicts in drivers/vfio/vfio_main.c by using the iommfd version.
The rc fix was done a different way when iommufd patches reworked this
code.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-02 12:04:39 -04:00
Jason Gunthorpe
4741f2e941 vfio-iommufd: Support iommufd for emulated VFIO devices
Emulated VFIO devices are calling vfio_register_emulated_iommu_dev() and
consist of all the mdev drivers.

Like the physical drivers, support for iommufd is provided by the driver
supplying the correct standard ops. Provide ops from the core that
duplicate what vfio_register_emulated_iommu_dev() does.

Emulated drivers are where it is more likely to see variation in the
iommfd support ops. For instance IDXD will probably need to setup both a
iommfd_device context linked to a PASID and an iommufd_access context to
support all their mdev operations.

Link: https://lore.kernel.org/r/7-v4-42cd2eb0e3eb+335a-vfio_iommufd_jgg@nvidia.com
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Yi Liu <yi.l.liu@intel.com>
Tested-by: Lixiao Yang <lixiao.yang@intel.com>
Tested-by: Matthew Rosato <mjrosato@linux.ibm.com>
Tested-by: Yu He <yu.he@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-02 11:52:03 -04:00
Matthew Rosato
2a54e347d9 vfio/ap: Validate iova during dma_unmap and trigger irq disable
Currently, each mapped iova is stashed in its associated vfio_ap_queue;
when we get an unmap request, validate that it matches with one or more of
these stashed values before attempting unpins.

Each stashed iova represents IRQ that was enabled for a queue.  Therefore,
if a match is found, trigger IRQ disable for this queue to ensure that
underlying firmware will no longer try to use the associated pfn after the
page is unpinned. IRQ disable will also handle the associated unpin.

Link: https://lore.kernel.org/r/20221202135402.756470-3-yi.l.liu@intel.com
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-02 11:50:10 -04:00
Paolo Bonzini
1e79a9e3ab - Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support
 - Removal of a unused function
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmN/eYwACgkQ41TmuOI4
 ufjoxA/9Et38aXO/IhmUt8v0QhA4yec+sc5GSFfQSYehej/1Vqhw0DXx+ORUiRgg
 +rtiXJSSqkuD2dL+BDffY2xoul6nzNdVf4AbkcnrWscfWr6xwVYlPvuL0ymGI6J2
 U/IPedRoKw0bHw/wHs05yV5PubrRwDFERKhtyXWYGbPJhX0w2n3IFOoKH1oWBhLW
 Dc8jEs6t3gDbJ71Er0xoeBUoiuu+PgZG06cpOvzBZ0KclRgjADXyISqqk8/4mu8w
 R+/Wf8NcrbQYV1jfCeq5zIsKC8uvnFj25UuyTLumn5vh+dNNsvE72Khe4tz7LI0I
 ZPZ+GZuemu7Yi12dKjw4Sw3ui0ejWH/5XL1SVB0X/xYIWrBqOot+Lq6538GCng+c
 tJt+zsu64VFgXCCZ8O9qO4uE4DBL70H3ThT7VZxIghSTZtY0xh3uFc64f3/3d9dy
 K4WTJHrmMxhXaA/rqtIa8I53JvFl8CztofZATiQQesyPuc7lZ01w1Co5el4xYaxe
 YknyMTq11qf/iYqVOW7sjoWW/YRuuMZ4+FhpI3o/SllVdN98iTwkk1kP3wcoBO5P
 bvzpm+WXHbv9OxifPrqkqv34+upbjfEmSogHudQzagBX4vl3rZRfBCdQGCAha0Uc
 ZYyg68kiil5sWmHI/Ln/ZjANYfbS5sF0CreuWxnmqcwKl2NSN/E=
 =/1yt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-6.2-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

- Second batch of the lazy destroy patches
- First batch of KVM changes for kernel virtual != physical address support
- Removal of a unused function
2022-11-28 13:34:47 -05:00
Wei Yongjun
9ac74f0666 s390/ap: fix memory leak in ap_init_qci_info()
If kzalloc() for 'ap_qci_info_old' failed, 'ap_qci_info' shold be
freed before return. Otherwise it is a memory leak.

Link: https://lore.kernel.org/r/20221114110830.542246-1-weiyongjun@huaweicloud.com
Fixes: 283915850a ("s390/ap: notify drivers on config changed and scan complete callbacks")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-11-23 16:26:21 +01:00
Nico Boehr
dbec280045 s390/vfio-ap: GISA: sort out physical vs virtual pointers usage
Fix virtual vs physical address confusion (which currently are the same)
for the GISA when enabling the IRQ.

Signed-off-by: Nico Boehr <nrb@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20221118100429.70453-1-nrb@linux.ibm.com
Message-Id: <20221118100429.70453-1-nrb@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2022-11-23 09:06:50 +00:00
Jason A. Donenfeld
16bdbae394 hwrng: core - treat default_quality as a maximum and default to 1024
Most hw_random devices return entropy which is assumed to be of full
quality, but driver authors don't bother setting the quality knob. Some
hw_random devices return less than full quality entropy, and then driver
authors set the quality knob. Therefore, the entropy crediting should be
opt-out rather than opt-in per-driver, to reflect the actual reality on
the ground.

For example, the two Raspberry Pi RNG drivers produce full entropy
randomness, and both EDK2 and U-Boot's drivers for these treat them as
such. The result is that EFI then uses these numbers and passes the to
Linux, and Linux credits them as boot, thereby initializing the RNG.
Yet, in Linux, the quality knob was never set to anything, and so on the
chance that Linux is booted without EFI, nothing is ever credited.
That's annoying.

The same pattern appears to repeat itself throughout various drivers. In
fact, very very few drivers have bothered setting quality=1024.

Looking at the git history of existing drivers and corresponding mailing
list discussion, this conclusion tracks. There's been a decent amount of
discussion about drivers that set quality < 1024 -- somebody read and
interepreted a datasheet, or made some back of the envelope calculation
somehow. But there's been very little, if any, discussion about most
drivers where the quality is just set to 1024 or unset (or set to 1000
when the authors misunderstood the API and assumed it was base-10 rather
than base-2); in both cases the intent was fairly clear of, "this is a
hardware random device; it's fine."

So let's invert this logic. A hw_random struct's quality knob now
controls the maximum quality a driver can produce, or 0 to specify 1024.
Then, the module-wide switch called "default_quality" is changed to
represent the maximum quality of any driver. By default it's 1024, and
the quality of any particular driver is then given by:

    min(default_quality, rng->quality ?: 1024);

This way, the user can still turn this off for weird reasons (and we can
replace whatever driver-specific disabling hacks existed in the past),
yet we get proper crediting for relevant RNGs.

Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-11-18 16:59:34 +08:00
Eric Farman
913447d06f vfio: Remove vfio_free_device
With the "mess" sorted out, we should be able to inline the
vfio_free_device call introduced by commit cb9ff3f3b8
("vfio: Add helpers for unifying vfio_device life cycle")
and remove them from driver release callbacks.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>	# vfio-ap part
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20221104142007.1314999-8-farman@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-11-10 11:30:23 -07:00
Harald Freudenberger
b43088f30d s390/zcrypt: fix warning about field-spanning write
This patch fixes the warning

memcpy: detected field-spanning write (size 60) of single field "to" at drivers/s390/crypto/zcrypt_api.h:173 (size 2)
WARNING: CPU: 1 PID: 2114 at drivers/s390/crypto/zcrypt_api.h:173 prep_ep11_ap_msg+0x2c6/0x2e0 [zcrypt]

The code has been rewritten to use a union in combination
with a flex array to clearly state which part of the buffer
the payload is to be copied in via z_copy_from_user
function (which may call memcpy() in case of in-kernel calls).

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Suggested-by: Jürgen Christ <jchrist@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-11-02 22:15:57 +01:00
Jason J. Herne
e38de48044 s390/vfio-ap: Fix memory allocation for mdev_types array
The vfio-ap crypto driver fails to allocate memory for an array of
pointers used to pass supported mdev types to mdev_register_parent().

Since we only support a single mdev type, the fix is to allocate a
single entry in the ap_matrix_dev->mdev_types array.

Link: https://lore.kernel.org/r/20221021145905.15100-1-jjherne@linux.ibm.com
Fixes: da44c340c4 ("vfio/mdev: simplify mdev_type handling")
Cc: stable@vger.kernel.org
Cc: Tony Krowiak <akrowiak@linux.ibm.com>
Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-10-26 14:47:31 +02:00
Linus Torvalds
d3cf405133 VFIO updates for v6.1-rc1
- Prune private items from vfio_pci_core.h to a new internal header,
    fix missed function rename, and refactor vfio-pci interrupt defines.
    (Jason Gunthorpe)
 
  - Create consistent naming and handling of ioctls with a function per
    ioctl for vfio-pci and vfio group handling, use proper type args
    where available. (Jason Gunthorpe)
 
  - Implement a set of low power device feature ioctls allowing userspace
    to make use of power states such as D3cold where supported.
    (Abhishek Sahu)
 
  - Remove device counter on vfio groups, which had restricted the page
    pinning interface to singleton groups to account for limitations in
    the type1 IOMMU backend.  Document usage as limited to emulated IOMMU
    devices, ie. traditional mdev devices where this restriction is
    consistent.  (Jason Gunthorpe)
 
  - Correct function prefix in hisi_acc driver incurred during previous
    refactoring. (Shameer Kolothum)
 
  - Correct typo and remove redundant warning triggers in vfio-fsl driver.
    (Christophe JAILLET)
 
  - Introduce device level DMA dirty tracking uAPI and implementation in
    the mlx5 variant driver (Yishai Hadas & Joao Martins)
 
  - Move much of the vfio_device life cycle management into vfio core,
    simplifying and avoiding duplication across drivers.  This also
    facilitates adding a struct device to vfio_device which begins the
    introduction of device rather than group level user support and fills
    a gap allowing userspace identify devices as vfio capable without
    implicit knowledge of the driver. (Kevin Tian & Yi Liu)
 
  - Split vfio container handling to a separate file, creating a more
    well defined API between the core and container code, masking IOMMU
    backend implementation from the core, allowing for an easier future
    transition to an iommufd based implementation of the same.
    (Jason Gunthorpe)
 
  - Attempt to resolve race accessing the iommu_group for a device
    between vfio releasing DMA ownership and removal of the device from
    the IOMMU driver.  Follow-up with support to allow vfio_group to
    exist with NULL iommu_group pointer to support existing userspace
    use cases of holding the group file open.  (Jason Gunthorpe)
 
  - Fix error code and hi/lo register manipulation issues in the hisi_acc
    variant driver, along with various code cleanups. (Longfang Liu)
 
  - Fix a prior regression in GVT-g group teardown, resulting in
    unreleased resources. (Jason Gunthorpe)
 
  - A significant cleanup and simplification of the mdev interface,
    consolidating much of the open coded per driver sysfs interface
    support into the mdev core. (Christoph Hellwig)
 
  - Simplification of tracking and locking around vfio_groups that
    fall out from previous refactoring. (Jason Gunthorpe)
 
  - Replace trivial open coded f_ops tests with new helper.
    (Alex Williamson)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmNGz2AbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiatYQAI+7bFjVsTKwCnWUhp/A
 WnFmLpnh/OsBIYiXRbXGZBgIO4iPmMyFkxqjnv6e8H1WnKhLbuPy/xCaAvPrtI8b
 YKCpzdrDnfrPfB4+0cyGLJx15Jqd3sOZy097kl2lQJTscELTjJxTl0uB/Fbf/s38
 t1K2nIhBm+sGK3rTf3JjY4Jc7vDbwX7HQt6rUVEbd3NoyLJV1T/HdeSgwSMdyiED
 WwkRZ0z/vU0hEDk5wk1ZyltkiUzdCSws3C8T0J39xRObPLHR1vYgKO8aeZhfQb4p
 luD1fzGRMt3JinSXCPPm5HfADXq2Rozx7Y7a454fvCa7lpX4MNAgaQdfIzI64lZj
 cMgSYAIskVq4vxCkO4bKec4FYrzJoxBMJwiXZvOZ4mF5SL4UIDwerMqQTA3fvtQ+
 puS6x+/DF9XXHrEewEX7teg6QYPQueneSS+fWeFpMGzDXSjdQB6qV+rMWS297t+4
 1KyITxkOxcZQ4+j1OLPGtxsRLKtWApawoNTpRMlaD+hSExxHLbUmKexOLXzuAoVP
 nhbjud+jzEbpCnwps24Og/iEBdRYJcl2KwEeSRPI856YRDrNa9jPtiDlsAtKZOK2
 gJnOixSss6R+wgVVYIyMDZ8tsvO+UDQruvqQ2kFku1FOlO86pvwD6UUVuTVosdNc
 fktw6Dx90N3fdb/o8jjAjssx
 =Z8+P
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Prune private items from vfio_pci_core.h to a new internal header,
   fix missed function rename, and refactor vfio-pci interrupt defines
   (Jason Gunthorpe)

 - Create consistent naming and handling of ioctls with a function per
   ioctl for vfio-pci and vfio group handling, use proper type args
   where available (Jason Gunthorpe)

 - Implement a set of low power device feature ioctls allowing userspace
   to make use of power states such as D3cold where supported (Abhishek
   Sahu)

 - Remove device counter on vfio groups, which had restricted the page
   pinning interface to singleton groups to account for limitations in
   the type1 IOMMU backend. Document usage as limited to emulated IOMMU
   devices, ie. traditional mdev devices where this restriction is
   consistent (Jason Gunthorpe)

 - Correct function prefix in hisi_acc driver incurred during previous
   refactoring (Shameer Kolothum)

 - Correct typo and remove redundant warning triggers in vfio-fsl driver
   (Christophe JAILLET)

 - Introduce device level DMA dirty tracking uAPI and implementation in
   the mlx5 variant driver (Yishai Hadas & Joao Martins)

 - Move much of the vfio_device life cycle management into vfio core,
   simplifying and avoiding duplication across drivers. This also
   facilitates adding a struct device to vfio_device which begins the
   introduction of device rather than group level user support and fills
   a gap allowing userspace identify devices as vfio capable without
   implicit knowledge of the driver (Kevin Tian & Yi Liu)

 - Split vfio container handling to a separate file, creating a more
   well defined API between the core and container code, masking IOMMU
   backend implementation from the core, allowing for an easier future
   transition to an iommufd based implementation of the same (Jason
   Gunthorpe)

 - Attempt to resolve race accessing the iommu_group for a device
   between vfio releasing DMA ownership and removal of the device from
   the IOMMU driver. Follow-up with support to allow vfio_group to exist
   with NULL iommu_group pointer to support existing userspace use cases
   of holding the group file open (Jason Gunthorpe)

 - Fix error code and hi/lo register manipulation issues in the hisi_acc
   variant driver, along with various code cleanups (Longfang Liu)

 - Fix a prior regression in GVT-g group teardown, resulting in
   unreleased resources (Jason Gunthorpe)

 - A significant cleanup and simplification of the mdev interface,
   consolidating much of the open coded per driver sysfs interface
   support into the mdev core (Christoph Hellwig)

 - Simplification of tracking and locking around vfio_groups that fall
   out from previous refactoring (Jason Gunthorpe)

 - Replace trivial open coded f_ops tests with new helper (Alex
   Williamson)

* tag 'vfio-v6.1-rc1' of https://github.com/awilliam/linux-vfio: (77 commits)
  vfio: More vfio_file_is_group() use cases
  vfio: Make the group FD disassociate from the iommu_group
  vfio: Hold a reference to the iommu_group in kvm for SPAPR
  vfio: Add vfio_file_is_group()
  vfio: Change vfio_group->group_rwsem to a mutex
  vfio: Remove the vfio_group->users and users_comp
  vfio/mdev: add mdev available instance checking to the core
  vfio/mdev: consolidate all the description sysfs into the core code
  vfio/mdev: consolidate all the available_instance sysfs into the core code
  vfio/mdev: consolidate all the name sysfs into the core code
  vfio/mdev: consolidate all the device_api sysfs into the core code
  vfio/mdev: remove mtype_get_parent_dev
  vfio/mdev: remove mdev_parent_dev
  vfio/mdev: unexport mdev_bus_type
  vfio/mdev: remove mdev_from_dev
  vfio/mdev: simplify mdev_type handling
  vfio/mdev: embedd struct mdev_parent in the parent data structure
  vfio/mdev: make mdev.h standalone includable
  drm/i915/gvt: simplify vgpu configuration management
  drm/i915/gvt: fix a memory leak in intel_gvt_init_vgpu_types
  ...
2022-10-12 14:46:48 -07:00
Jason Gunthorpe
9c799c224d vfio/mdev: add mdev available instance checking to the core
Many of the mdev drivers use a simple counter for keeping track of the
available instances. Move this code to the core code and store the counter
in the mdev_parent. Implement it using correct locking, fixing mdpy.

Drivers just provide the value in the mdev_driver at registration time
and the core code takes care of maintaining it and exposing the value in
sysfs.

[hch: count instances per-parent instead of per-type, use an atomic_t
 to avoid taking mdev_list_lock in the show method]

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-15-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Christoph Hellwig
f2fbc72e6d vfio/mdev: consolidate all the available_instance sysfs into the core code
Every driver just print a number, simply add a method to the mdev_driver
to return it and provide a standard sysfs show function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-13-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Christoph Hellwig
0bc79069cc vfio/mdev: consolidate all the name sysfs into the core code
Every driver just emits a static string, simply add a field to the
mdev_type for the driver to fill out or fall back to the sysfs name and
provide a standard sysfs show function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-12-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Jason Gunthorpe
290aac5df8 vfio/mdev: consolidate all the device_api sysfs into the core code
Every driver just emits a static string, simply feed it through the ops
and provide a standard sysfs show function.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-11-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Christoph Hellwig
da44c340c4 vfio/mdev: simplify mdev_type handling
Instead of abusing struct attribute_group to control initialization of
struct mdev_type, just define the actual attributes in the mdev_driver,
allocate the mdev_type structures in the caller and pass them to
mdev_register_parent.

This allows the caller to use container_of to get at the containing
structure and thus significantly simplify the code.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-6-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Christoph Hellwig
89345d5177 vfio/mdev: embedd struct mdev_parent in the parent data structure
Simplify mdev_{un}register_device by requiring the caller to pass in
a structure allocate as part of the parent device structure.  This
removes the need for a list of parents and the separate mdev_parent
refcount as we can simplify rely on the reference to the parent device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Link: https://lore.kernel.org/r/20220923092652.100656-5-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Christoph Hellwig
bdef2b7896 vfio/mdev: make mdev.h standalone includable
Include <linux/device.h> and <linux/uuid.h> so that users of this headers
don't need to do that and remove those includes that aren't needed
any more.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Link: https://lore.kernel.org/r/20220923092652.100656-4-hch@lst.de
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-10-04 12:06:58 -06:00
Tony Krowiak
1918f2b20c s390/vfio-ap: bypass unnecessary processing of AP resources
It is not necessary to go through the process of validation, linking of
queues to mdev and vice versa and filtering the APQNs assigned to the
matrix mdev to build an AP configuration for a guest if an adapter or
domain being assigned is already assigned to the matrix mdev. Likewise, it
is not necessary to proceed through the process the unassignment of an
adapter, domain or control domain if it is not assigned to the matrix mdev.

Since it is not necessary to process assignment of a resource already
assigned or process unassignment of a resource that is been assigned,
this patch will bypass all assignment/unassignment operations for an
adapter, domain or control domain under these circumstances.

Not only is assignment of a duplicate adapter or domain unnecessary, it
will also cause a hang situation when removing the matrix mdev to which it is
assigned. The reason is because the same vfio_ap_queue objects with an
APQN containing the APID of the adapter or APQI of the domain being
assigned will get added multiple times to the hashtable that holds them.
This results in the pprev and next pointers of the hlist_node (mdev_qnode
field in the vfio_ap_queue object) pointing to the queue object itself
resulting in an interminable loop when the mdev is removed and the queue
table is iterated to reset the queues.

Cc: stable@vger.kernel.org
Fixes: 11cb2419fa ("s390/vfio-ap: manage link between queue struct and matrix mdev")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-09-21 22:33:16 +02:00
Yi Liu
7cb5a82eb1 vfio/ap: Use the new device life cycle helpers
and manage available_instances inside @init/@release.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220921104401.38898-10-kevin.tian@intel.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-09-21 14:15:11 -06:00
Harald Freudenberger
0fef40be5d s390/ap: fix crash on older machines based on QCI info missing
On older z series machines (z12 and older) there is no QCI info
available. The AP code took care of this and the AP bus scan then
switched to simple probing via TAPQ.

With commit
283915850a ("s390/ap: notify drivers on config changed and scan complete callbacks")
some code was introduced which silently assumed that the QCI info is
always available. However, with KVM simulating an older machine (z12)
the result was a kernel crash. Funnily the same crash does not happen
on LPAR - maybe because NULL is a valid pointer and reading some data
from address 0 also works fine.

This fix now improves the code to be aware that the QCI instruction
may not be available on older machines and thus the two pointers to
QCI info structs may simple be NULL.

However, on a machine not providing the QCI info the two callbacks to
the zcrypt device drivers on_config_changed() and on_scan_complete()
provide parameters which are pointers to a QCI info struct.
These both callbacks are NOT served if there is no QCI info available.
The only consumer of these callbacks is the vfio device driver. This
driver only supports CEX4 and higher. All physical machines which are
able to provide CEX4 cards have QCI support available. So there is
no sense in for example fill the QCI info struct by hand with looping
over cards and queues and TAPQ each APQN.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 283915850a ("s390/ap: notify drivers on config changed and scan complete callbacks")
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-08-15 17:19:52 +02:00
Linus Torvalds
24cb958695 s390 updates for 5.20 merge window
- Rework copy_oldmem_page() callback to take an iov_iter.
   This includes few prerequisite updates and fixes to the
   oldmem reading code.
 
 - Rework cpufeature implementation to allow for various CPU feature
   indications, which is not only limited to hardware capabilities,
   but also allows CPU facilities.
 
 - Use the cpufeature rework to autoload Ultravisor module when CPU
   facility 158 is available.
 
 - Add ELF note type for encrypted CPU state of a protected virtual CPU.
   The zgetdump tool from s390-tools package will decrypt the CPU state
   using a Customer Communication Key and overwrite respective notes to
   make the data accessible for crash and other debugging tools.
 
 - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto test.
 
 - Fix incorrect recovery of kretprobe modified return address in stacktrace.
 
 - Switch the NMI handler to use generic irqentry_nmi_enter() and
   irqentry_nmi_exit() helper functions.
 
 - Rework the cryptographic Adjunct Processors (AP) pass-through design
   to support dynamic changes to the AP matrix of a running guest as well
   as to implement more of the AP architecture.
 
 - Minor boot code cleanups.
 
 - Grammar and typo fixes to hmcdrv and tape drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCYu4dRBccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8DnlAP45Sk4cE35T+Z0vdHE2f0uMXE/p
 uHNjS3fDZOQVFJ2jZwEA99xPF5qPCttbR/b1VHsMSb30684IT1A4PC7y05kgfAw=
 =jCc3
 -----END PGP SIGNATURE-----

Merge tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Rework copy_oldmem_page() callback to take an iov_iter.

   This includes a few prerequisite updates and fixes to the oldmem
   reading code.

 - Rework cpufeature implementation to allow for various CPU feature
   indications, which is not only limited to hardware capabilities, but
   also allows CPU facilities.

 - Use the cpufeature rework to autoload Ultravisor module when CPU
   facility 158 is available.

 - Add ELF note type for encrypted CPU state of a protected virtual CPU.
   The zgetdump tool from s390-tools package will decrypt the CPU state
   using a Customer Communication Key and overwrite respective notes to
   make the data accessible for crash and other debugging tools.

 - Use vzalloc() instead of vmalloc() + memset() in ChaCha20 crypto
   test.

 - Fix incorrect recovery of kretprobe modified return address in
   stacktrace.

 - Switch the NMI handler to use generic irqentry_nmi_enter() and
   irqentry_nmi_exit() helper functions.

 - Rework the cryptographic Adjunct Processors (AP) pass-through design
   to support dynamic changes to the AP matrix of a running guest as
   well as to implement more of the AP architecture.

 - Minor boot code cleanups.

 - Grammar and typo fixes to hmcdrv and tape drivers.

* tag 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
  Revert "s390/smp: enforce lowcore protection on CPU restart"
  Revert "s390/smp: rework absolute lowcore access"
  Revert "s390/smp,ptdump: add absolute lowcore markers"
  s390/unwind: fix fgraph return address recovery
  s390/nmi: use irqentry_nmi_enter()/irqentry_nmi_exit()
  s390: add ELF note type for encrypted CPU state of a PV VCPU
  s390/smp,ptdump: add absolute lowcore markers
  s390/smp: rework absolute lowcore access
  s390/setup: rearrange absolute lowcore initialization
  s390/boot: cleanup adjust_to_uv_max() function
  s390/smp: enforce lowcore protection on CPU restart
  s390/tape: fix comment typo
  s390/hmcdrv: fix Kconfig "its" grammar
  s390/docs: fix warnings for vfio_ap driver doc
  s390/docs: fix warnings for vfio_ap driver lock usage doc
  s390/crash: support multi-segment iterators
  s390/crash: use static swap buffer for copy_to_user_real()
  s390/crash: move copy_to_user_real() to crash_dump.c
  s390/zcore: fix race when reading from hardware system area
  s390/crash: fix incorrect number of bytes to copy to user space
  ...
2022-08-06 17:05:21 -07:00
Linus Torvalds
a9cf69d0e7 VFIO updates for v6.0-rc1
- Cleanup use of extern in function prototypes (Alex Williamson)
 
  - Simplify bus_type usage and convert to device IOMMU interfaces
    (Robin Murphy)
 
  - Check missed return value and fix comment typos (Bo Liu)
 
  - Split migration ops from device ops and fix races in mlx5 migration
    support (Yishai Hadas)
 
  - Fix missed return value check in noiommu support (Liam Ni)
 
  - Hardening to clear buffer pointer to avoid use-after-free (Schspa Shi)
 
  - Remove requirement that only the same mm can unmap a previously
    mapped range (Li Zhe)
 
  - Adjust semaphore release vs device open counter (Yi Liu)
 
  - Remove unused arg from SPAPR support code (Deming Wang)
 
  - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman,
    Michael Kawano)
 
  - Replace DMA unmap notifier with callbacks (Jason Gunthorpe)
 
  - Clarify SPAPR support comment relative to iommu_ops (Alexey Kardashevskiy)
 
  - Revise page pinning API towards compatibility with future iommufd support
    (Nicolin Chen)
 
  - Resolve issues in vfio-ccw, including use of DMA unmap callback
    (Eric Farman)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmLqvYMbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiHM0P/1n/bszel20PRC7x+NLI
 P7b/0aonW4Qtei2HORwowmaznb4NgRE5GCm5RU+a9+AwQKnK44j3lqy0skcfgZXr
 f4viFlxOyd0H4blOhUZ+FuPNkUMAyz6HerzvJ9jQFG426pL5vr7UKWBuJPYB5RCT
 4jEy3EUTSH8/Zt8ApLysFTyR64xN3Sk7vSUcj9rEhu5T3FWq8t9+jb3tE/HW/Xaw
 pMwdC+ctYzYaBD/oA7Ns2IebNS9AUIUjKMXC25oCmc83WGgGOqgLB2mAthQ2NKB5
 5capKBYuYl7PWERvpGpsPILEWvR6m+Rxh8r4Pqjcoyfq4k7vp+A/AFKiD7AEYBdy
 BtfLWO59w6vuRQ5XXOa6Hu4ef6BcMvH4StrHxlHkKcgI4PJA0QscIXiJPQSt7Crr
 m+kCNgPPgrfZDu7lmZTiWbXOYSkJR3Mxkhf2iNHudW9SsJT9pUAVEiGVVA/kC1Y/
 fNBziRQeVF6JUW8M4pveXEWEbA8iE1HQeJA6aVRonxAkJk1KBaQgm/GKJlPXCHIR
 R6lI90NXZHz/3ndIX1znKOm0qli+8auX/FH8iWUffZxGmtINOGGMYebD6YxFdCCJ
 sWalL8vlQNCams2MZdovu/5BowXWtwOMm6KNG9RXSyWIWZEcNVbAzhTr+rrDdHZd
 AJiUNCGO9UlO9FZM+ntfQTSr
 =4BE8
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio

Pull VFIO updates from Alex Williamson:

 - Cleanup use of extern in function prototypes (Alex Williamson)

 - Simplify bus_type usage and convert to device IOMMU interfaces (Robin
   Murphy)

 - Check missed return value and fix comment typos (Bo Liu)

 - Split migration ops from device ops and fix races in mlx5 migration
   support (Yishai Hadas)

 - Fix missed return value check in noiommu support (Liam Ni)

 - Hardening to clear buffer pointer to avoid use-after-free (Schspa
   Shi)

 - Remove requirement that only the same mm can unmap a previously
   mapped range (Li Zhe)

 - Adjust semaphore release vs device open counter (Yi Liu)

 - Remove unused arg from SPAPR support code (Deming Wang)

 - Rework vfio-ccw driver to better fit new mdev framework (Eric Farman,
   Michael Kawano)

 - Replace DMA unmap notifier with callbacks (Jason Gunthorpe)

 - Clarify SPAPR support comment relative to iommu_ops (Alexey
   Kardashevskiy)

 - Revise page pinning API towards compatibility with future iommufd
   support (Nicolin Chen)

 - Resolve issues in vfio-ccw, including use of DMA unmap callback (Eric
   Farman)

* tag 'vfio-v6.0-rc1' of https://github.com/awilliam/linux-vfio: (40 commits)
  vfio/pci: fix the wrong word
  vfio/ccw: Check return code from subchannel quiesce
  vfio/ccw: Remove FSM Close from remove handlers
  vfio/ccw: Add length to DMA_UNMAP checks
  vfio: Replace phys_pfn with pages for vfio_pin_pages()
  vfio/ccw: Add kmap_local_page() for memcpy
  vfio: Rename user_iova of vfio_dma_rw()
  vfio/ccw: Change pa_pfn list to pa_iova list
  vfio/ap: Change saved_pfn to saved_iova
  vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
  vfio/ccw: Only pass in contiguous pages
  vfio/ap: Pass in physical address of ind to ap_aqic()
  drm/i915/gvt: Replace roundup with DIV_ROUND_UP
  vfio: Make vfio_unpin_pages() return void
  vfio/spapr_tce: Fix the comment
  vfio: Replace the iommu notifier with a device list
  vfio: Replace the DMA unmapping notifier with a callback
  vfio/ccw: Move FSM open/close to MDEV open/close
  vfio/ccw: Refactor vfio_ccw_mdev_reset
  vfio/ccw: Create a CLOSE FSM event
  ...
2022-08-06 08:59:35 -07:00
Paolo Bonzini
63f4b21041 Merge remote-tracking branch 'kvm/next' into kvm-next-5.20
KVM/s390, KVM/x86 and common infrastructure changes for 5.20

x86:

* Permit guests to ignore single-bit ECC errors

* Fix races in gfn->pfn cache refresh; do not pin pages tracked by the cache

* Intel IPI virtualization

* Allow getting/setting pending triple fault with KVM_GET/SET_VCPU_EVENTS

* PEBS virtualization

* Simplify PMU emulation by just using PERF_TYPE_RAW events

* More accurate event reinjection on SVM (avoid retrying instructions)

* Allow getting/setting the state of the speaker port data bit

* Refuse starting the kvm-intel module if VM-Entry/VM-Exit controls are inconsistent

* "Notify" VM exit (detect microarchitectural hangs) for Intel

* Cleanups for MCE MSR emulation

s390:

* add an interface to provide a hypervisor dump for secure guests

* improve selftests to use TAP interface

* enable interpretive execution of zPCI instructions (for PCI passthrough)

* First part of deferred teardown

* CPU Topology

* PV attestation

* Minor fixes

Generic:

* new selftests API using struct kvm_vcpu instead of a (vm, id) tuple

x86:

* Use try_cmpxchg64 instead of cmpxchg64

* Bugfixes

* Ignore benign host accesses to PMU MSRs when PMU is disabled

* Allow disabling KVM's "MONITOR/MWAIT are NOPs!" behavior

* x86/MMU: Allow NX huge pages to be disabled on a per-vm basis

* Port eager page splitting to shadow MMU as well

* Enable CMCI capability by default and handle injected UCNA errors

* Expose pid of vcpu threads in debugfs

* x2AVIC support for AMD

* cleanup PIO emulation

* Fixes for LLDT/LTR emulation

* Don't require refcounted "struct page" to create huge SPTEs

x86 cleanups:

* Use separate namespaces for guest PTEs and shadow PTEs bitmasks

* PIO emulation

* Reorganize rmap API, mostly around rmap destruction

* Do not workaround very old KVM bugs for L0 that runs with nesting enabled

* new selftests API for CPUID
2022-08-01 03:21:00 -04:00
Nicolin Chen
34a255e676 vfio: Replace phys_pfn with pages for vfio_pin_pages()
Most of the callers of vfio_pin_pages() want "struct page *" and the
low-level mm code to pin pages returns a list of "struct page *" too.
So there's no gain in converting "struct page *" to PFN in between.

Replace the output parameter "phys_pfn" list with a "pages" list, to
simplify callers. This also allows us to replace the vfio_iommu_type1
implementation with a more efficient one.

And drop the pfn_valid check in the gvt code, as there is no need to
do such a check at a page-backed struct page pointer.

For now, also update vfio_iommu_type1 to fit this new parameter too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-11-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25 13:41:22 -06:00
Nicolin Chen
3fad3a2613 vfio/ap: Change saved_pfn to saved_iova
The vfio_ap_ops code maintains both nib address and its PFN, which
is redundant, merely because vfio_pin/unpin_pages API wanted pfn.
Since vfio_pin/unpin_pages() now accept "iova", change "saved_pfn"
to "saved_iova" and remove pfn in the vfio_ap_validate_nib().

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-7-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25 13:41:22 -06:00
Nicolin Chen
44abdd1646 vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
The vfio_pin/unpin_pages() so far accepted arrays of PFNs of user IOVA.
Among all three callers, there was only one caller possibly passing in
a non-contiguous PFN list, which is now ensured to have contiguous PFN
inputs too.

Pass in the starting address with "iova" alone to simplify things, so
callers no longer need to maintain a PFN list or to pin/unpin one page
at a time. This also allows VFIO to use more efficient implementations
of pin/unpin_pages.

For now, also update vfio_iommu_type1 to fit this new parameter too,
while keeping its input intact (being user_iova) since we don't want
to spend too much effort swapping its parameters and local variables
at that level.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-6-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-25 13:41:22 -06:00
Nicolin Chen
10e19d492a vfio/ap: Pass in physical address of ind to ap_aqic()
The ap_aqic() is called by vfio_ap_irq_enable() where it passes in a
virt value that's casted from a physical address "h_nib". Inside the
ap_aqic(), it does virt_to_phys() again.

Since ap_aqic() needs a physical address, let's just pass in a pa of
ind directly. So change the "ind" to "pa_ind".

Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Link: https://lore.kernel.org/r/20220723020256.30081-4-nicolinc@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-23 07:29:10 -06:00
Jason Gunthorpe
ce4b4657ff vfio: Replace the DMA unmapping notifier with a callback
Instead of having drivers register the notifier with explicit code just
have them provide a dma_unmap callback op in their driver ops and rely on
the core code to wire it up.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-681e038e30fd+78-vfio_unmap_notif_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-07-20 11:57:59 -06:00
Heiko Carstens
0a5f9b382c s390/cpufeature: rework to allow more than only hwcap bits
Rework cpufeature implementation to allow for various cpu feature
indications, which is not only limited to hwcap bits. This is achieved
by adding a sequential list of cpu feature numbers, where each of them
is mapped to an entry which indicates what this number is about.

Each entry contains a type member, which indicates what feature
name space to look into (e.g. hwcap, or cpu facility). If wanted this
allows also to automatically load modules only in e.g. z/VM
configurations.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Steffen Eiden <seiden@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713125644.16121-2-seiden@linux.ibm.com
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:49 +02:00
Tony Krowiak
eeb386aeb5 s390/vfio-ap: handle config changed and scan complete notification
This patch implements two new AP driver callbacks:

void (*on_config_changed)(struct ap_config_info *new_config_info,
                  struct ap_config_info *old_config_info);

void (*on_scan_complete)(struct ap_config_info *new_config_info,
                 struct ap_config_info *old_config_info);

The on_config_changed callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan.

The vfio_ap device driver registers a callback function for this callback
that performs the following operations:

1. Unplugs the adapters, domains and control domains removed from the
host's AP configuration from the guests to which they are
assigned in a single operation.

2. Stores bitmaps identifying the adapters, domains and control domains
added to the host's AP configuration with the structure representing
the mediated device. When the vfio_ap device driver's probe callback is
subsequently invoked, the probe function will recognize that the
queue is being probed due to a change in the host's AP configuration
and the plugging of the queue into the guest will be bypassed.

The on_scan_complete callback is invoked after the ap bus scan is
completed if the host AP configuration data has changed. The vfio_ap
device driver registers a callback function for this callback that hot
plugs each queue and control domain added to the AP configuration for each
guest using them in a single hot plug operation.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:12 +02:00
Tony Krowiak
f7f795c54d s390/vfio-ap: sysfs attribute to display the guest's matrix
The matrix of adapters and domains configured in a guest's APCB may
differ from the matrix of adapters and domains assigned to the matrix mdev,
so this patch introduces a sysfs attribute to display the matrix of
adapters and domains that are or will be assigned to the APCB of a guest
that is or will be using the matrix mdev. For a matrix mdev denoted by
$uuid, the guest matrix can be displayed as follows:

   cat /sys/devices/vfio_ap/matrix/$uuid/guest_matrix

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:11 +02:00
Tony Krowiak
3f85d1df26 s390/vfio-ap: implement in-use callback for vfio_ap driver
Let's implement the callback to indicate when an APQN
is in use by the vfio_ap device driver. The callback is
invoked whenever a change to the apmask or aqmask would
result in one or more queue devices being removed from the driver. The
vfio_ap device driver will indicate a resource is in use
if the APQN of any of the queue devices to be removed are assigned to
any of the matrix mdevs under the driver's control.

There is potential for a deadlock condition between the
matrix_dev->guests_lock used to lock the guest during assignment of
adapters and domains and the ap_perms_mutex locked by the AP bus when
changes are made to the sysfs apmask/aqmask attributes.

The AP Perms lock controls access to the objects that store the adapter
numbers (ap_perms) and domain numbers (aq_perms) for the sysfs
/sys/bus/ap/apmask and /sys/bus/ap/aqmask attributes. These attributes
identify which queues are reserved for the zcrypt default device drivers.
Before allowing a bit to be removed from either mask, the AP bus must check
with the vfio_ap device driver to verify that none of the queues are
assigned to any of its mediated devices.

The apmask/aqmask attributes can be written or read at any time from
userspace, so care must be taken to prevent a deadlock with asynchronous
operations that might be taking place in the vfio_ap device driver. For
example, consider the following:

1. A system administrator assigns an adapter to a mediated device under the
   control of the vfio_ap device driver. The driver will need to first take
   the matrix_dev->guests_lock to potentially hot plug the adapter into
   the KVM guest.
2. At the same time, a system administrator sets a bit in the sysfs
   /sys/bus/ap/ap_mask attribute. To complete the operation, the AP bus
   must:
   a. Take the ap_perms_mutex lock to update the object storing the values
      for the /sys/bus/ap/ap_mask attribute.
   b. Call the vfio_ap device driver's in-use callback to verify that the
      queues now being reserved for the default zcrypt drivers are not
      assigned to a mediated device owned by the vfio_ap device driver. To
      do the verification, the in-use callback function takes the
      matrix_dev->guests_lock, but has to wait because it is already held
      by the operation in 1 above.
3. The vfio_ap device driver calls an AP bus function to verify that the
   new queues resulting from the assignment of the adapter in step 1 are
   not reserved for the default zcrypt device driver. This AP bus function
   tries to take the ap_perms_mutex lock but gets stuck waiting for the
   waiting for the lock due to step 2a above.

Consequently, we have the following deadlock situation:

matrix_dev->guests_lock locked (1)
ap_perms_mutex lock locked (2a)
Waiting for matrix_dev->gusts_lock (2b) which is currently held (1)
Waiting for ap_perms_mutex lock (3) which is currently held (2a)

To prevent this deadlock scenario, the function called in step 3 will no
longer take the ap_perms_mutex lock and require the caller to take the
lock. The lock will be the first taken by the adapter/domain assignment
functions in the vfio_ap device driver to maintain the proper locking
order.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:11 +02:00
Tony Krowiak
70aeefe574 s390/vfio-ap: reset queues after adapter/domain unassignment
When an adapter or domain is unassigned from an mdev attached to a KVM
guest, one or more of the guest's queues may get dynamically removed. Since
the removed queues could get re-assigned to another mdev, they need to be
reset. So, when an adapter or domain is unassigned from the mdev, the
queues that are removed from the guest's AP configuration (APCB) will be
reset.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:11 +02:00
Tony Krowiak
09d31ff787 s390/vfio-ap: hot plug/unplug of AP devices when probed/removed
When an AP queue device is probed or removed, if the mediated device is
attached to a KVM guest, the mediated device's adapter, domain and
control domain bitmaps must be filtered to update the guest's APCB and if
any changes are detected, the guest's APCB must then be hot plugged into
the guest to reflect those changes to the guest.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:11 +02:00
Tony Krowiak
51dc562af0 s390/vfio-ap: allow hot plug/unplug of AP devices when assigned/unassigned
Let's hot plug an adapter, domain or control domain into the guest when it
is assigned to a matrix mdev that is attached to a KVM guest. Likewise,
let's hot unplug an adapter, domain or control domain from the guest when
it is unassigned from a matrix_mdev that is attached to a KVM guest.

Whenever an assignment or unassignment of an adapter, domain or control
domain is performed, the APQNs and control domains assigned to the matrix
mdev will be filtered and assigned to the AP control block
(APCB) that supplies the AP configuration to the guest so that no
adapter, domain or control domain that is not in the host's AP
configuration nor any APQN that does not reference a queue device bound
to the vfio_ap device driver is assigned.

After updating the APCB, if the mdev is in use by a KVM guest, it is
hot plugged into the guest to dynamically provide access to the adapters,
domains and control domains provided via the newly refreshed APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
2c1ee8983a s390/vfio-ap: prepare for dynamic update of guest's APCB on queue probe/remove
The callback functions for probing and removing a queue device must take
and release the locks required to perform a dynamic update of a guest's
APCB in the proper order.

The proper order for taking the locks is:

        matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock

The proper order for releasing the locks is:

        matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock

A new helper function is introduced to be used by the probe callback to
acquire the required locks. Since the probe callback only has
access to a queue device when it is called, the helper function will find
the ap_matrix_mdev object to which the queue device's APQN is assigned and
return it so the KVM guest to which the mdev is attached can be dynamically
updated.

Note that in order to find the ap_matrix_mdev (matrix_mdev) object, it is
necessary to search the matrix_dev->mdev_list. This presents a
locking order dilemma because the matrix_dev->mdevs_lock can't be taken to
protect against changes to the list while searching for the matrix_mdev to
which a queue device's APQN is assigned. This is due to the fact that the
proper locking order requires that the matrix_dev->mdevs_lock be taken
after both the matrix_mdev->kvm->lock and the matrix_dev->mdevs_lock.
Consequently, the matrix_dev->guests_lock will be used to protect against
removal of a matrix_mdev object from the list while a queue device is
being probed. This necessitates changes to the mdev probe/remove
callback functions to take the matrix_dev->guests_lock prior to removing
a matrix_mdev object from the list.

A new macro is also introduced to acquire the locks required to dynamically
update the guest's APCB in the proper order when a queue device is
removed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
8ee13ad993 s390/vfio-ap: prepare for dynamic update of guest's APCB on assign/unassign
The functions backing the matrix mdev's sysfs attribute interfaces to
assign/unassign adapters, domains and control domains must take and
release the locks required to perform a dynamic update of a guest's APCB
in the proper order.

The proper order for taking the locks is:

matrix_dev->guests_lock => kvm->lock => matrix_dev->mdevs_lock

The proper order for releasing the locks is:

matrix_dev->mdevs_lock => kvm->lock => matrix_dev->guests_lock

Two new macros are introduced for this purpose: One to take the locks and
the other to release the locks. These macros will be used by the
assignment/unassignment functions to prepare for dynamic update of
the KVM guest's APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
b84eb8e050 s390/vfio-ap: use proper locking order when setting/clearing KVM pointer
The group notifier that handles the VFIO_GROUP_NOTIFY_SET_KVM event must
use the required locks in proper locking order to dynamically update the
guest's APCB. The proper locking order is:

       1. matrix_dev->guests_lock: required to use the KVM pointer to
          update a KVM guest's APCB.

       2. matrix_mdev->kvm->lock: required to update a KVM guest's APCB.

       3. matrix_dev->mdevs_lock: required to store or access the data
          stored in a struct ap_matrix_mdev instance.

Two macros are introduced to acquire and release the locks in the proper
order. These macros are now used by the group notifier functions.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
21195eb038 s390/vfio-ap: introduce new mutex to control access to the KVM pointer
The vfio_ap device driver registers for notification when the pointer to
the KVM object for a guest is set. Recall that the KVM lock (kvm->lock)
mutex must be taken outside of the matrix_dev->lock mutex to prevent the
reporting by lockdep of a circular locking dependency (a.k.a., a lockdep
splat):

* see commit 0cc00c8d40 ("Fix circular lockdep when setting/clearing
  crypto masks")

* see commit 86956e7076 ("replace open coded locks for
  VFIO_GROUP_NOTIFY_SET_KVM notification")

With the introduction of support for hot plugging/unplugging AP devices
passed through to a KVM guest, a new guests_lock mutex is introduced to
ensure the proper locking order is maintained:

struct ap_matrix_dev {
        ...
        struct mutex guests_lock;
       ...
}

The matrix_dev->guests_lock controls access to the matrix_mdev instances
that hold the state for AP devices that have been passed through to a
KVM guest. This lock must be held to control access to the KVM pointer
(matrix_mdev->kvm) while the vfio_ap device driver is using it to
plug/unplug AP devices passed through to the KVM guest.

Keep in mind, the proper locking order must be maintained whenever
dynamically updating a KVM guest's APCB to plug/unplug adapters, domains
and control domains:

    1. matrix_dev->guests_lock: required to use the KVM pointer - stored in
       a struct ap_matrix_mdev instance - to update a KVM guest's APCB

    2. matrix_mdev->kvm->lock: required to update a guest's APCB

    3. matrix_dev->mdevs_lock: required to access data stored in a
       struct ap_matrix_mdev instance.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
d0786556ca s390/vfio-ap: rename matrix_dev->lock mutex to matrix_dev->mdevs_lock
The matrix_dev->lock mutex is being renamed to matrix_dev->mdevs_lock to
better reflect its purpose, which is to control access to the state of the
mediated devices under the control of the vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:10 +02:00
Tony Krowiak
e2126a7374 s390/vfio-ap: allow assignment of unavailable AP queues to mdev device
The current implementation does not allow assignment of an AP adapter or
domain to an mdev device if each APQN resulting from the assignment
does not reference an AP queue device that is bound to the vfio_ap device
driver. This patch allows assignment of AP resources to the matrix mdev as
long as the APQNs resulting from the assignment:
   1. Are not reserved by the AP BUS for use by the zcrypt device drivers.
   2. Are not assigned to another matrix mdev.

The rationale behind this is that the AP architecture does not preclude
assignment of APQNs to an AP configuration profile that are not available
to the system.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:09 +02:00
Tony Krowiak
48cae940c3 s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev
Refresh the guest's APCB by filtering the APQNs and control domain numbers
assigned to the matrix mdev.

Filtering of APQNs:
-----------------
APQNs that do not reference an AP queue device bound to the vfio_ap device
driver must be filtered from the APQNs assigned to the matrix mdev before
they can be assigned to the guest's APCB. Given that the APQNs are
configured in the guest's APCB as a matrix of APIDs (adapters) and APQIs
(domains), it is not possible to filter an individual APQN. For example,
suppose the matrix of APQNs is structured as follows:

                   APIDs
             3      4      5
        0  (3,0)  (4,0)  (5,0)
APQIs   1  (3,1)  (4,1)  (5,1)
        2  (3,2)  (4,2)  (5,2)

Now suppose APQN (4,1) does not reference a queue device bound to the
vfio_ap device driver. If we filter APID 4, the APQNs (4,0), (4,1) and
(4,2) will be removed. Similarly, if we filter domain 1, APQNs (3,1),
(4,1) and (5,1) will be removed.

To resolve this dilemma, the choice was made to filter the APID - in this
case 4 - from the guest's APCB. The reason for this design decision is
because the APID references an AP adapter which is a real hardware device
that can be physically installed, removed, enabled or disabled; whereas, a
domain is a partition within the adapter. It therefore better reflects
reality to remove the APID from the guest's APCB.

Filtering of control domains:
----------------------------
Any control domains that are not assigned to the host's AP configuration
will be filtered from those assigned to the matrix mdev before assigning
them to the guest's APCB.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:09 +02:00
Tony Krowiak
49b0109fb3 s390/vfio-ap: introduce shadow APCB
The APCB is a field within the CRYCB that provides the AP configuration
to a KVM guest. Let's introduce a shadow copy of the KVM guest's APCB and
maintain it for the lifespan of the guest.

The shadow APCB serves the following purposes:

1. The shadow APCB can be maintained even when the mediated device is not
   currently in use by a KVM guest. Since the mediated device's AP
   configuration is filtered to ensure that no AP queues are passed through
   to the KVM guest that are not bound to the vfio_ap device driver or
   available to the host, the mediated device's AP configuration may differ
   from the guest's. Having a shadow of a guest's APCB allows us to provide
   a sysfs interface to view the guest's APCB even if the mediated device
   is not currently passed through to a KVM guest. This can aid in
   problem determination when the guest is unexpectedly missing AP
   resources.

2. If filtering was done in-place for the real APCB, the guest could pick
   up a transient state. Doing the filtering on a shadow and transferring
   the AP configuration to the real APCB after the guest is started or when
   AP resources are assigned to or unassigned from the mediated device, or
   when the host configuration changes, the guest's AP configuration will
   never be in a transient state.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:09 +02:00
Tony Krowiak
11cb2419fa s390/vfio-ap: manage link between queue struct and matrix mdev
Let's create links between each queue device bound to the vfio_ap device
driver and the matrix mdev to which the queue's APQN is assigned. The idea
is to facilitate efficient retrieval of the objects representing the queue
devices and matrix mdevs as well as to verify that a queue assigned to
a matrix mdev is bound to the driver.

The links will be created as follows:

 * When the queue device is probed, if its APQN is assigned to a matrix
   mdev, the structures representing the queue device and the matrix mdev
   will be linked.

 * When an adapter or domain is assigned to a matrix mdev, for each new
   APQN assigned that references a queue device bound to the vfio_ap
   device driver, the structures representing the queue device and the
   matrix mdev will be linked.

The links will be removed as follows:

 * When the queue device is removed, if its APQN is assigned to a matrix
   mdev, the link from the structure representing the matrix mdev to the
   structure representing the queue will be removed. Since the storage
   allocated for the vfio_ap_queue will be freed, there is no need to
   remove the link to the matrix_mdev to which the queue's APQN is
   assigned.

 * When an adapter or domain is unassigned from a matrix mdev, for each
   APQN unassigned that references a queue device bound to the vfio_ap
   device driver, the structures representing the queue device and the
   matrix mdev will be unlinked.

 * When an mdev is removed, the link from any queues assigned to the mdev
   to the mdev will be removed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:09 +02:00
Tony Krowiak
260f3ea141 s390/vfio-ap: move probe and remove callbacks to vfio_ap_ops.c
Let's move the probe and remove callbacks into the vfio_ap_ops.c
file to keep all code related to managing queues in a single file. This
way, all functions related to queue management can be removed from the
vfio_ap_private.h header file defining the public interfaces for the
vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:08 +02:00
Tony Krowiak
034921cdea s390/vfio-ap: use new AP bus interface to search for queue devices
This patch refactors the vfio_ap device driver to use the AP bus's
ap_get_qdev() function to retrieve the vfio_ap_queue struct containing
information about a queue that is bound to the vfio_ap device driver.
The bus's ap_get_qdev() function retrieves the queue device from a
hashtable keyed by APQN. This is much more efficient than looping over
the list of devices attached to the AP bus by several orders of
magnitude.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-19 16:18:08 +02:00
Tony Krowiak
2f23256c0e s390/ap: fix error handling in __verify_queue_reservations()
The AP bus's __verify_queue_reservations function increments the ref count
for the device driver passed in as a parameter, but fails to decrement it
before returning control to the caller. This will prevents any subsequent
removal of the module.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reported-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Fixes: 4f8206b882 ("s390/ap: driver callback to indicate resource in use")
Link: https://lore.kernel.org/r/20220706222619.602094-1-akrowiak@linux.ibm.com
Cc: stable@vger.kernel.org
[agordeev@linux.ibm.com fixed description, added Fixes and Link]
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2022-07-15 07:07:36 +02:00
Matthew Rosato
d2197485a1 s390/airq: pass more TPI info to airq handlers
A subsequent patch will introduce an airq handler that requires additional
TPI information beyond directed vs floating, so pass the entire tpi_info
structure via the handler.  Only pci actually uses this information today,
for the other airq handlers this is effectively a no-op.

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220606203325.110625-6-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-07-11 09:54:10 +02:00
Linus Torvalds
176882156a VFIO updates for v5.19-rc1
- Improvements to mlx5 vfio-pci variant driver, including support
    for parallel migration per PF (Yishai Hadas)
 
  - Remove redundant iommu_present() check (Robin Murphy)
 
  - Ongoing refactoring to consolidate the VFIO driver facing API
    to use vfio_device (Jason Gunthorpe)
 
  - Use drvdata to store vfio_device among all vfio-pci and variant
    drivers (Jason Gunthorpe)
 
  - Remove redundant code now that IOMMU core manages group DMA
    ownership (Jason Gunthorpe)
 
  - Remove vfio_group from external API handling struct file ownership
    (Jason Gunthorpe)
 
  - Correct typo in uapi comments (Thomas Huth)
 
  - Fix coccicheck detected deadlock (Wan Jiabing)
 
  - Use rwsem to remove races and simplify code around container and
    kvm association to groups (Jason Gunthorpe)
 
  - Harden access to devices in low power states and use runtime PM to
    enable d3cold support for unused devices (Abhishek Sahu)
 
  - Fix dma_owner handling of fake IOMMU groups (Jason Gunthorpe)
 
  - Set driver_managed_dma on vfio-pci variant drivers (Jason Gunthorpe)
 
  - Pass KVM pointer directly rather than via notifier (Matthew Rosato)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmKPvyMbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsihegP/3XamiYsS0GuA7awAq/X
 h9Jahb6kJ+sh0RXL1Gqzc9nxH5X9H/hBcL88VOV3GLwyOhNVNpVjQXGguL3aLaCE
 zUrs0+AFEJb990y9H+VgwIDom5BIpgdZ2naG42bz9wUeVGg4daJnkMwOgXwIBzfx
 IOddktN6UwuE+DyA57yqL93f+0cTrhYZx9R14sDoLR5lE4uGnbQwIknawEKVtoeR
 rEPaCFptxPxCUbqoOSR0Y3bu6rUYSH4iiMZpMviqm2ak3aNn76gru3q4QAnI4gTd
 l/w+2OJNFC0U7H5Cz7cdIn2StdJvfSkX0e753+qsFccFsViRCGdnW0Lht/xrYrFC
 i8AJxkrq2/bs00LXs7kzcruaD8pJ2UPe2x2+nupHSEsj99K4NraeHRB2CC1uwj0d
 gYliOSW5T3//wOpztK48s475VppgXeKWkXGoNY3JJlGjAPyd0vFrH8hRLhVZJ9uI
 /eLh6hQnOJuCDz1rQrVNRk6cZi9R1Wpl5dvCBRLqjK519nm569aTlVBra+iNyUCQ
 lU5/kN0ym8+X8CweE5ILPGiX2iEXBYMqv+Dm5yOimRUHRJZHYv900FX0GVEnCUCq
 23sMDaeHS1hyDCQk//bd2Ig7xjh7mbh7CrKcdJ7pL5Gc/A1zkCXd54hvxViiGwQq
 U5KIPTyJy+erpcpxjUApaoP2
 =etEI
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio

Pull vfio updates from Alex Williamson:

 - Improvements to mlx5 vfio-pci variant driver, including support for
   parallel migration per PF (Yishai Hadas)

 - Remove redundant iommu_present() check (Robin Murphy)

 - Ongoing refactoring to consolidate the VFIO driver facing API to use
   vfio_device (Jason Gunthorpe)

 - Use drvdata to store vfio_device among all vfio-pci and variant
   drivers (Jason Gunthorpe)

 - Remove redundant code now that IOMMU core manages group DMA ownership
   (Jason Gunthorpe)

 - Remove vfio_group from external API handling struct file ownership
   (Jason Gunthorpe)

 - Correct typo in uapi comments (Thomas Huth)

 - Fix coccicheck detected deadlock (Wan Jiabing)

 - Use rwsem to remove races and simplify code around container and kvm
   association to groups (Jason Gunthorpe)

 - Harden access to devices in low power states and use runtime PM to
   enable d3cold support for unused devices (Abhishek Sahu)

 - Fix dma_owner handling of fake IOMMU groups (Jason Gunthorpe)

 - Set driver_managed_dma on vfio-pci variant drivers (Jason Gunthorpe)

 - Pass KVM pointer directly rather than via notifier (Matthew Rosato)

* tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio: (38 commits)
  vfio: remove VFIO_GROUP_NOTIFY_SET_KVM
  vfio/pci: Add driver_managed_dma to the new vfio_pci drivers
  vfio: Do not manipulate iommu dma_owner for fake iommu groups
  vfio/pci: Move the unused device into low power state with runtime PM
  vfio/pci: Virtualize PME related registers bits and initialize to zero
  vfio/pci: Change the PF power state to D0 before enabling VFs
  vfio/pci: Invalidate mmaps and block the access in D3hot power state
  vfio: Change struct vfio_group::container_users to a non-atomic int
  vfio: Simplify the life cycle of the group FD
  vfio: Fully lock struct vfio_group::container
  vfio: Split up vfio_group_get_device_fd()
  vfio: Change struct vfio_group::opened from an atomic to bool
  vfio: Add missing locking for struct vfio_group::kvm
  kvm/vfio: Fix potential deadlock problem in vfio
  include/uapi/linux/vfio.h: Fix trivial typo - _IORW should be _IOWR instead
  vfio/pci: Use the struct file as the handle not the vfio_group
  kvm/vfio: Remove vfio_group from kvm
  vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm()
  vfio: Change vfio_external_check_extension() to vfio_file_enforced_coherent()
  vfio: Remove vfio_external_group_match_file()
  ...
2022-06-01 13:49:15 -07:00
Linus Torvalds
2518f226c6 drm for 5.19-rc1
dma-buf:
 - add dma_resv_replace_fences
 - add dma_resv_get_singleton
 - make dma_excl_fence private
 
 core:
 - EDID parser refactorings
 - switch drivers to drm_mode_copy/duplicate
 - DRM managed mutex initialization
 
 display-helper:
 - put HDMI, SCDC, HDCP, DSC and DP into new module
 
 gem:
 - rework fence handling
 
 ttm:
 - rework bulk move handling
 - add common debugfs for resource managers
 - convert to kvcalloc
 
 format helpers:
 - support monochrome formats
 - RGB888, RGB565 to XRGB8888 conversions
 
 fbdev:
 - cfb/sys_imageblit fixes
 - pagelist corruption fix
 - create offb platform device
 - deferred io improvements
 
 sysfb:
 - Kconfig rework
 - support for VESA mode selection
 
 bridge:
 - conversions to devm_drm_of_get_bridge
 - conversions to panel_bridge
 - analogix_dp - autosuspend support
 - it66121 - audio support
 - tc358767 - DSI to DPI support
 - icn6211 - PLL/I2C fixes, DT property
 - adv7611 - enable DRM_BRIDGE_OP_HPD
 - anx7625 - fill ELD if no monitor
 - dw_hdmi - add audio support
 - lontium LT9211 support, i.MXMP LDB
 - it6505: Kconfig fix, DPCD set power fix
 - adv7511 - CEC support for ADV7535
 
 panel:
 - ltk035c5444t, B133UAN01, NV3052C panel support
 - DataImage FG040346DSSWBG04 support
 - st7735r - DT bindings fix
 - ssd130x - fixes
 
 i915:
 - DG2 laptop PCI-IDs ("motherboard down")
 - Initial RPL-P PCI IDs
 - compute engine ABI
 - DG2 Tile4 support
 - DG2 CCS clear color compression support
 - DG2 render/media compression formats support
 - ATS-M platform info
 - RPL-S PCI IDs added
 - Bump ADL-P DMC version to v2.16
 - Support static DRRS
 - Support multiple eDP/LVDS native mode refresh rates
 - DP HDR support for HSW+
 - Lots of display refactoring + fixes
 - GuC hwconfig support and query
 - sysfs support for multi-tile
 - fdinfo per-client gpu utilisation
 - add geometry subslices query
 - fix prime mmap with LMEM
 - fix vm open count and remove vma refcounts
 - contiguous allocation fixes
 - steered register write support
 - small PCI BAR enablement
 - GuC error capture support
 - sunset igpu legacy mmap support for newer devices
 - GuC version 70.1.1 support
 
 amdgpu:
 - Initial SoC21 support
 - SMU 13.x enablement
 - SMU 13.0.4 support
 - ttm_eu cleanups
 - USB-C, GPUVM updates
 - TMZ fixes for RV
 - RAS support for VCN
 - PM sysfs code cleanup
 - DC FP rework
 - extend CG/PG flags to 64-bit
 - SI dpm lockdep fix
 - runtime PM fixes
 
 amdkfd:
 - RAS/SVM fixes
 - TLB flush fixes
 - CRIU GWS support
 - ignore bogus MEC signals more efficiently
 
 msm:
 - Fourcc modifier for tiled but not compressed layouts
 - Support for userspace allocated IOVA (GPU virtual address)
 - DPU: DSC (Display Stream Compression) support
 - DP: eDP support
 - DP: conversion to use drm_bridge and drm_bridge_connector
 - Merge DPU1 and MDP5 MDSS driver
 - DPU: writeback support
 
 nouveau:
 - make some structures static
 - make some variables static
 - switch to drm_gem_plane_helper_prepare_fb
 
 radeon:
 - misc fixes/cleanups
 
 mxsfb:
 - rework crtc mode setting
 - LCDIF CRC support
 
 etnaviv:
 - fencing improvements
 - fix address space collisions
 - cleanup MMU reference handling
 
 gma500:
 - GEM/GTT improvements
 - connector handling fixes
 
 komeda:
 - switch to plane reset helper
 
 mediatek:
 - MIPI DSI improvements
 
 omapdrm:
 - GEM improvements
 
 qxl:
 - aarch64 support
 
 vc4:
 - add a CL submission tracepoint
 - HDMI YUV support
 - HDMI/clock improvements
 - drop is_hdmi caching
 
 virtio:
 - remove restriction of non-zero blob types
 
 vmwgfx:
 - support for cursormob and cursorbypass 4
 - fence improvements
 
 tidss:
 - reset DISPC on startup
 
 solomon:
 - SPI support
 - DT improvements
 
 sun4i:
 - allwinner D1 support
 - drop is_hdmi caching
 
 imx:
 - use swap() instead of open-coding
 - use devm_platform_ioremap_resource
 - remove redunant initializations
 
 ast:
 - Displayport support
 
 rockchip:
 - Refactor IOMMU initialisation
 - make some structures static
 - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi
 - support swapped YUV formats,
 - clock improvements
 - rk3568 support
 - VOP2 support
 
 mediatek:
 - MT8186 support
 
 tegra:
 - debugabillity improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmKNxkAACgkQDHTzWXnE
 hr4hghAAqSXeMEw1w34miyM28hcOpXqkDfT1VVooqFxBT8MBqamzpZvCH94qsZwm
 3DRXlhQ4pk8wzUcWJpGprdNakxNQPpFVs2UuxYxOyrxYpdkbOwqsEcM3d8VXD9Cy
 E36z+dr85A8Te/J0Yg/FLoZMHulTlidqEZeOz6SMaNUohtrmH/oPWR+cPIy4/Zpp
 yysfbBSKTwblJFDf4+nIpks/VvJYAUO3i6KClT/Rh79Yg6de582AU0YaNcEArTi6
 JqdiYIoYLx609Ecy5NVme6wR/ai46afFLMYt3ZIP4OfHRINk+YL01BYMo2JE2M8l
 xjOH0Iwb7evzWqLK/ESwqp3P7nyppmLlfbZOFHWUfNJsjq2H3ePaAGhzOlYx1c70
 XENzY4IvpYYdR0pJuh1gw1cNZfM9JDAynGJ5jvsATLGBGQbpFsy3w/PMZT17q8an
 DpBwqQmShUdCJ2m+6zznC3VsxJpbvWKNE1I93NxAWZXmFYxoHCzRihahUxKcNDrQ
 ZLH7RSlk9SE/ZtNSLkU15YnKtoW+ThFIssUpVio6U/fZot1+efZkmkXplSuFvj6R
 i7s14hMWQjSJzpJg1DXfhDMycEOujNiQppCG2EaDlVxvUtCqYBd3EHOI7KQON//+
 iVtmEEnWh5rcCM+WsxLGf3Y7sVP3vfo1LOCxshb1XVfDmeMksoI=
 =BYQA
 -----END PGP SIGNATURE-----

Merge tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm

Pull drm updates from Dave Airlie:
 "Intel have enabled DG2 on certain SKUs for laptops, AMD has started
  some new GPU support, msm has user allocated VA controls

  dma-buf:
   - add dma_resv_replace_fences
   - add dma_resv_get_singleton
   - make dma_excl_fence private

  core:
   - EDID parser refactorings
   - switch drivers to drm_mode_copy/duplicate
   - DRM managed mutex initialization

  display-helper:
   - put HDMI, SCDC, HDCP, DSC and DP into new module

  gem:
   - rework fence handling

  ttm:
   - rework bulk move handling
   - add common debugfs for resource managers
   - convert to kvcalloc

  format helpers:
   - support monochrome formats
   - RGB888, RGB565 to XRGB8888 conversions

  fbdev:
   - cfb/sys_imageblit fixes
   - pagelist corruption fix
   - create offb platform device
   - deferred io improvements

  sysfb:
   - Kconfig rework
   - support for VESA mode selection

  bridge:
   - conversions to devm_drm_of_get_bridge
   - conversions to panel_bridge
   - analogix_dp - autosuspend support
   - it66121 - audio support
   - tc358767 - DSI to DPI support
   - icn6211 - PLL/I2C fixes, DT property
   - adv7611 - enable DRM_BRIDGE_OP_HPD
   - anx7625 - fill ELD if no monitor
   - dw_hdmi - add audio support
   - lontium LT9211 support, i.MXMP LDB
   - it6505: Kconfig fix, DPCD set power fix
   - adv7511 - CEC support for ADV7535

  panel:
   - ltk035c5444t, B133UAN01, NV3052C panel support
   - DataImage FG040346DSSWBG04 support
   - st7735r - DT bindings fix
   - ssd130x - fixes

  i915:
   - DG2 laptop PCI-IDs ("motherboard down")
   - Initial RPL-P PCI IDs
   - compute engine ABI
   - DG2 Tile4 support
   - DG2 CCS clear color compression support
   - DG2 render/media compression formats support
   - ATS-M platform info
   - RPL-S PCI IDs added
   - Bump ADL-P DMC version to v2.16
   - Support static DRRS
   - Support multiple eDP/LVDS native mode refresh rates
   - DP HDR support for HSW+
   - Lots of display refactoring + fixes
   - GuC hwconfig support and query
   - sysfs support for multi-tile
   - fdinfo per-client gpu utilisation
   - add geometry subslices query
   - fix prime mmap with LMEM
   - fix vm open count and remove vma refcounts
   - contiguous allocation fixes
   - steered register write support
   - small PCI BAR enablement
   - GuC error capture support
   - sunset igpu legacy mmap support for newer devices
   - GuC version 70.1.1 support

  amdgpu:
   - Initial SoC21 support
   - SMU 13.x enablement
   - SMU 13.0.4 support
   - ttm_eu cleanups
   - USB-C, GPUVM updates
   - TMZ fixes for RV
   - RAS support for VCN
   - PM sysfs code cleanup
   - DC FP rework
   - extend CG/PG flags to 64-bit
   - SI dpm lockdep fix
   - runtime PM fixes

  amdkfd:
   - RAS/SVM fixes
   - TLB flush fixes
   - CRIU GWS support
   - ignore bogus MEC signals more efficiently

  msm:
   - Fourcc modifier for tiled but not compressed layouts
   - Support for userspace allocated IOVA (GPU virtual address)
   - DPU: DSC (Display Stream Compression) support
   - DP: eDP support
   - DP: conversion to use drm_bridge and drm_bridge_connector
   - Merge DPU1 and MDP5 MDSS driver
   - DPU: writeback support

  nouveau:
   - make some structures static
   - make some variables static
   - switch to drm_gem_plane_helper_prepare_fb

  radeon:
   - misc fixes/cleanups

  mxsfb:
   - rework crtc mode setting
   - LCDIF CRC support

  etnaviv:
   - fencing improvements
   - fix address space collisions
   - cleanup MMU reference handling

  gma500:
   - GEM/GTT improvements
   - connector handling fixes

  komeda:
   - switch to plane reset helper

  mediatek:
   - MIPI DSI improvements

  omapdrm:
   - GEM improvements

  qxl:
   - aarch64 support

  vc4:
   - add a CL submission tracepoint
   - HDMI YUV support
   - HDMI/clock improvements
   - drop is_hdmi caching

  virtio:
   - remove restriction of non-zero blob types

  vmwgfx:
   - support for cursormob and cursorbypass 4
   - fence improvements

  tidss:
   - reset DISPC on startup

  solomon:
   - SPI support
   - DT improvements

  sun4i:
   - allwinner D1 support
   - drop is_hdmi caching

  imx:
   - use swap() instead of open-coding
   - use devm_platform_ioremap_resource
   - remove redunant initializations

  ast:
   - Displayport support

  rockchip:
   - Refactor IOMMU initialisation
   - make some structures static
   - replace drm_detect_hdmi_monitor with drm_display_info.is_hdmi
   - support swapped YUV formats,
   - clock improvements
   - rk3568 support
   - VOP2 support

  mediatek:
   - MT8186 support

  tegra:
   - debugabillity improvements"

* tag 'drm-next-2022-05-25' of git://anongit.freedesktop.org/drm/drm: (1740 commits)
  drm/i915/dsi: fix VBT send packet port selection for ICL+
  drm/i915/uc: Fix undefined behavior due to shift overflowing the constant
  drm/i915/reg: fix undefined behavior due to shift overflowing the constant
  drm/i915/gt: Fix use of static in macro mismatch
  drm/i915/audio: fix audio code enable/disable pipe logging
  drm/i915: Fix CFI violation with show_dynamic_id()
  drm/i915: Fix 'mixing different enum types' warnings in intel_display_power.c
  drm/i915/gt: Fix build error without CONFIG_PM
  drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path
  drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations
  drm/msm: don't free the IRQ if it was not requested
  drm/msm/dpu: limit writeback modes according to max_linewidth
  drm/amd: Don't reset dGPUs if the system is going to s2idle
  drm/amdgpu: Unmap legacy queue when MES is enabled
  drm: msm: fix possible memory leak in mdp5_crtc_cursor_set()
  drm/msm: Fix fb plane offset calculation
  drm/msm/a6xx: Fix refcount leak in a6xx_gpu_init
  drm/msm/dsi: don't powerup at modeset time for parade-ps8640
  drm/rockchip: Change register space names in vop2
  dt-bindings: display: rockchip: make reg-names mandatory for VOP2
  ...
2022-05-25 16:18:27 -07:00
Matthew Rosato
421cfe6596 vfio: remove VFIO_GROUP_NOTIFY_SET_KVM
Rather than relying on a notifier for associating the KVM with
the group, let's assume that the association has already been
made prior to device_open.  The first time a device is opened
associate the group KVM with the device.

This fixes a user-triggerable oops in GVT.

Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Zhi Wang <zhi.a.wang@intel.com>
Link: https://lore.kernel.org/r/20220519183311.582380-2-mjrosato@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-24 08:41:18 -06:00
Jason Gunthorpe
8e432bb015 vfio/mdev: Pass in a struct vfio_device * to vfio_pin/unpin_pages()
Every caller has a readily available vfio_device pointer, use that instead
of passing in a generic struct device. The struct vfio_device already
contains the group we need so this avoids complexity, extra refcountings,
and a confusing lifecycle model.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-11 13:12:59 -06:00
Jason Gunthorpe
09ea48efff vfio: Make vfio_(un)register_notifier accept a vfio_device
All callers have a struct vfio_device trivially available, pass it in
directly and avoid calling the expensive vfio_group_get_from_dev().

Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v4-8045e76bf00b+13d-vfio_mdev_no_group_jgg@nvidia.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2022-05-11 13:12:58 -06:00
Thomas Huth
d4b2945dc9 s390/vfio-ap: remove superfluous MODULE_DEVICE_TABLE declaration
The vfio_ap module tries to register for the vfio_ap bus - but that's
the interface that it provides itself, so this does not make much sense,
thus let's simply drop this statement now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Link: https://lore.kernel.org/r/20220413094416.412114-1-thuth@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:15 +02:00
Harald Freudenberger
2004b57cde s390/zcrypt: code cleanup
This patch tries to fix as much as possible of the
checkpatch.pl --strict findings:
  CHECK: Logical continuations should be on the previous line
  CHECK: No space is necessary after a cast
  CHECK: Alignment should match open parenthesis
  CHECK: 'useable' may be misspelled - perhaps 'usable'?
  WARNING: Possible repeated word: 'is'
  CHECK: spaces preferred around that '*' (ctx:VxV)
  CHECK: Comparison to NULL could be written "!msg"
  CHECK: Prefer kzalloc(sizeof(*zc)...) over kzalloc(sizeof(struct...)...)
  CHECK: Unnecessary parentheses around resp_type->work
  CHECK: Avoid CamelCase: <xcRB>

There is no functional change comming with this patch, only
code cleanup, renaming, whitespaces, indenting, ... but no
semantic change in any way. Also the API (zcrypt and pkey
header file) is semantically unchanged.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:14 +02:00
Harald Freudenberger
6acb086d9f s390/zcrypt: cleanup CPRB struct definitions
This patch does a little cleanup on the CPRBX struct
in zcrypt.h and the redundant CPRB struct definition in
zcrypt_msgtype6.c. Especially some of the misleading
fields from the CPRBX struct have been removed.

There is no semantic change coming with this patch.
The field names changed in the XCRB struct are only related
to reserved fields which should never been used.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Harald Freudenberger
d9b38e9d0f s390/ap: uevent on apmask/aqpmask change
This patch introduces user space notifications for changes
on the apmask or aqmask attributes. So it could be possible
to write a udev rule to load/unload the vfio_ap kernel module
based on changes of these masks.

On chance of the apmask or aqmask an AP change event will
be produced with an uevent environment variable showing
the new APMASK or AQMASK mask.

So a change on the apmask triggers an uvevent like this:

  KERNEL[490.160396] change   /devices/ap (ap)
  ACTION=change
  DEVPATH=/devices/ap
  SUBSYSTEM=ap
  APMASK=0xffffffdfffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  SEQNUM=13367

and a change on the aqmask looks like this:

  KERNEL[283.217642] change   /devices/ap (ap)
  ACTION=change
  DEVPATH=/devices/ap
  SUBSYSTEM=ap
  AQMASK=0xfbffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
  SEQNUM=13348

Only real changes to the masks are processed - the old and
new masks are compared and no action is done if the values
are equal (and thus no uevent). The emit of the uevent is
the very last action done when a mask change is processed.
However, there is no guarantee that all unbind/bind actions
caused by the apmask/aqmask changes are completed when the
apmask/aqmask change uevent is received in userspace.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Harald Freudenberger
28d3417a94 s390/zcrypt: add display of ASYM master key verification pattern
This patch extends the sysfs attribute mkvps for CCA cards
to show the states and master key verification patterns for
the old, current and new ASYM master key registers.

With this patch now all relevant master key verification
patterns related to a CCA HSM are available with the mkvps
sysfs attribute. This is a requirement for some exploiters
like the kubernetes cex plugin or initrd code needing to
verify the master key verification patterns on HSMs before
use.

A sample output:
  cat /sys/devices/ap/card04/04.0005/mkvps
  AES NEW: empty 0x0000000000000000
  AES CUR: valid 0xe9a49a58cd039bed
  AES OLD: valid 0x7d10d17bc8a409c4
  APKA NEW: empty 0x0000000000000000
  APKA CUR: valid 0x5f2f27aaa2d59b4a
  APKA OLD: valid 0x82a5e2cd5030d5ec
  ASYM NEW: empty 0x00000000000000000000000000000000
  ASYM CUR: valid 0x650c25a89c27e716d0e692b6c83f10e5
  ASYM OLD: valid 0xf8ae2acf8bfc57f0a0957c732c16078b

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jörg Schmidbauer <jschmidb@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-04-25 13:54:13 +02:00
Jason Gunthorpe
6b42f491e1 vfio/mdev: Remove mdev_parent_ops
The last useful member in this struct is the supported_type_groups, move
it to the mdev_driver and delete mdev_parent_ops.

Replace it with mdev_driver as an argument to mdev_register_device()

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20220411141403.86980-33-hch@lst.de
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
2022-04-21 07:36:56 -04:00
Jakob Koschel
97f32e1173 s390/zcrypt: fix using the correct variable for sizeof()
While the original code is valid, it is not the obvious choice for the
sizeof() call and in preparation to limit the scope of the list iterator
variable the sizeof should be changed to the size of the variable
being allocated.

Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:39 +02:00
Tony Krowiak
7107822004 s390/vfio-ap: fix kernel doc and signature of group notifier functions
The vfio_ap device driver registers a group notifier function to handle
the VFIO_GROUP_NOTIFY_SET_KVM event signalling the KVM pointer has been
set or cleared. There are two helper functions invoked by the handler
function: One called when the KVM pointer has been set, and the other
when the pointer is cleared.

The kernel doc for both of these functions contains a comment introduced
by commit 0cc00c8d40 (s390/vfio-ap: fix circular lockdep when
setting/clearing crypto masks) that is no longer valid. This patch removes
this comment from the kernel doc of each helper function.

Commit 86956e7076 (s390/vfio-ap: replace open coded locks for
VFIO_GROUP_NOTIFY_SET_KVM notification) added a parameter to the signature
of the helper function that handles the event indicating the KVM pointer
has been cleared. The parameter added was the KVM pointer itself.
One of the function's primary purposes is to clear the KVM pointer from the
ap_matrix_mdev instance in which it is stored. Since the callers of this
function derive the KVM pointer passed to the function from the
ap_matrix_mdev object itself, it is completely unnecessary to include this
parameter in the function's signature since it can simply be retrieved from
the ap_matrix_mdev object which is also passed in. This patch removes the
KVM pointer from the function's signature.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:39 +02:00
Haowen Bai
0f210fb39e s390: crypto: Use min_t() instead of doing it manually
Fix following coccicheck warning:
drivers/s390/crypto/zcrypt_ep11misc.c:1112:25-26: WARNING opportunity for min()

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:38 +02:00
Julia Lawall
f4272c03a3 s390/pkey: fix typos in comments
Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Harald Freudenberger <freude@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:38 +02:00
Juergen Christ
cfd68b3309 s390/zcrypt: Filter admin CPRBs on custom devices
Add a filter for custom devices to check for allowed control domains of
admin CPRBs.  This filter only applies to custom devices and not to the
main device.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:38 +02:00
Juergen Christ
895ae58da4 s390/zcrypt: Add admask to zcdn
Zcrypt custom devices now support control domain masks.  Users can set and
modify this mask to allow custom devices to access certain control domains.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-27 22:18:38 +02:00
Jürgen Christ
1024063eff s390/zcrypt: Provide target domain for EP11 cprbs to scheduling function
The scheduling function will get an extension which will
process the target_id value from an EP11 cprb. This patch
extracts the value during preparation of the ap message.

Signed-off-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
252a1ff777 s390/zcrypt: change reply buffer size offering
Instead of offering the user space given receive buffer size to
the crypto card firmware as limit for the reply message offer
the internal per queue reply buffer size. As the queue's reply
buffer is always adjusted to the max message size possible for
this card this may offer more buffer space. However, now it is
important to check the user space reply buffer on pushing back
the reply. If the reply does not fit into the user space provided
buffer the ioctl will fail with errno EMSGSIZE.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
383366b580 s390/zcrypt: Support CPRB minor version T7
There is a new CPRB minor version T7 to be supported with
this patch. Together with this the functions which extract
the CPRB data from userspace and prepare the AP message do
now check the CPRB minor version and provide some info in
the flag field of the ap message struct for further processing.

The 3 functions doing this job have been renamed to
prep_cca_ap_msg, prep_ep11_ap_msg and prep_rng_ap_msg to
reflect their job better (old was get..fc).

This patch also introduces two new flags to be used internal
with the flag field of the struct ap_message:

AP_MSG_FLAG_USAGE is set when prep_cca_ap_msg or prep_ep11_ap_msg
come to the conclusion that this is a ordinary crypto load CPRB
(which means T2 for CCA CPRBs and no admin bit for EP11 CPRBs).

AP_MSG_FLAG_ADMIN is set when prep_cca_ap_msg or prep_ep11_ap_msg
think, this is an administrative (control) crypto load CPRB
(which means T3, T5, T6 or T7 for CCA CPRBs and admin bit set
for EP11 CPRBs).

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
a7e701dba1 s390/zcrypt: handle checkstopped cards with new state
A crypto card may be in checkstopped state. With this
patch this is handled as a new state in the ap card and
ap queue structs. There is also a new card sysfs attribute

  /sys/devices/ap/cardxx/chkstop

and a new queue sysfs attribute

  /sys/devices/ap/cardxx/xx.yyyy/chkstop

displaying the checkstop state of the card or queue. Please
note that the queue's checkstop state is only a copy of the
card's checkstop state but makes maintenance much easier.

The checkstop state expressed here is the result of an
RC 0x04 (CHECKSTOP) during an AP command, mostly the
PQAP(TAPQ) command which is 'testing' the queue.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
985214af93 s390/zcrypt: CEX8S exploitation support
This patch adds CEX8 exploitation support for the AP bus code,
the zcrypt device driver zoo and the vfio device driver.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
d64e5e9120 s390/ap/zcrypt: debug feature improvements
This patch adds some debug feature improvements related
to some failures happened in the past. With CEX8 the max
request and response sizes have been extended but the
user space applications did not rework their code and
thus ran into receive buffer issues. This ffdc patch
here helps with additional checks and debug feature
messages in debugging and pointing to the root cause of
some failures related to wrong buffer sizes.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jürgen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-08 00:33:00 +01:00
Harald Freudenberger
8944d05f9b s390/ap: enable sysfs attribute scans to force AP bus rescan
This patch switches the sysfs attribute /sys/bus/ap/scans
from read-only to read-write. If there is something written
to this attribute, an AP bus rescan is forced. If an AP
bus scan is triggered this way a debug feature entry line
reports this in /sys/kernel/debug/s390dbf/ap/sprintf.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Jakob Naucke <naucke@linux.ibm.com>
Reviewed-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 21:05:10 +01:00
Tony Krowiak
283915850a s390/ap: notify drivers on config changed and scan complete callbacks
This patch introduces an extension to the ap bus to notify device drivers
when the host AP configuration changes - i.e., adapters, domains or
control domains are added or removed. When an adapter or domain is added to
the host's AP configuration, the AP bus will create the associated queue
devices in the linux sysfs device model. Each new type 10 (i.e., CEX4) or
newer queue device with an APQN that is not reserved for the default device
driver will get bound to the vfio_ap device driver. Likewise, whan an
adapter or domain is removed from the host's AP configuration, the AP bus
will remove the associated queue devices from the sysfs device model. Each
of the queues that is bound to the vfio_ap device driver will get unbound.

With the introduction of hot plug support, binding or unbinding of a
queue device will result in plugging or unplugging one or more queues from
a guest that is using the queue. If there are multiple changes to the
host's AP configuration, it could result in the probe and remove callbacks
getting invoked multiple times. Each time queues are plugged into or
unplugged from a guest, the guest's VCPUs must be taken out of SIE.
If this occurs multiple times due to changes in the host's AP
configuration, that can have an undesirable negative affect on the guest's
performance.

To alleviate this problem, this patch introduces two new callbacks: one to
notify the vfio_ap device driver when the AP bus scan routine detects a
change to the host's AP configuration; and, one to notify the driver when
the AP bus is done scanning. This will allow the vfio_ap driver to do
bulk processing of all affected adapters, domains and control domains for
affected guests rather than plugging or unplugging them one at a time when
the probe or remove callback is invoked. The two new callbacks are:

void (*on_config_changed)(struct ap_config_info *new_config_info,
                          struct ap_config_info *old_config_info);

This callback is invoked at the start of the AP bus scan
function when it determines that the host AP configuration information
has changed since the previous scan. This is done by storing
an old and current QCI info struct and comparing them. If there is any
difference, the callback is invoked.

void (*on_scan_complete)(struct ap_config_info *new_config_info,
                         struct ap_config_info *old_config_info);

The on_scan_complete callback is invoked after the ap bus scan is
completed if the host AP configuration data has changed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 21:05:10 +01:00
Tony Krowiak
4f8206b882 s390/ap: driver callback to indicate resource in use
Introduces a new driver callback to prevent a root user from re-assigning
the APQN of a queue that is in use by a non-default host device driver to
a default host device driver and vice versa. The callback will be invoked
whenever a change to the AP bus's sysfs apmask or aqmask attributes would
result in one or more APQNs being re-assigned. If the callback responds
in the affirmative for any driver queried, the change to the apmask or
aqmask will be rejected with a device busy error.

For this patch, only non-default drivers will be queried. Currently,
there is only one non-default driver, the vfio_ap device driver. The
vfio_ap device driver facilitates pass-through of an AP queue to a
guest. The idea here is that a guest may be administered by a different
sysadmin than the host and we don't want AP resources to unexpectedly
disappear from a guest's AP configuration (i.e., adapters and domains
assigned to the matrix mdev). This will enforce the proper procedure for
removing AP resources intended for guest usage which is to
first unassign them from the matrix mdev, then unbind them from the
vfio_ap device driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-03-01 21:05:09 +01:00
Tony Krowiak
783f0a3ccd s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function
This patch adds s390dbf logging to the function that executes the
PQAP(AQIC) instruction on behalf of the guest to which the queue for which
interrupts are being enabled or disabled is attached.

Currently, the vfio_ap_irq_enable function sets status response code 06
(notification indicator byte address (nib) invalid) in the status word
when the vfio_pin_pages function - called to pin the page containing the
nib - returns an error or a different number of pages pinned than
requested.

Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction.

In addition to logging a warning for the situation above, this patch adds
the following:

* A function to validate the nib address invoked prior to calling the
  vfio_pin_pages function. This allows for logging a message informing
  the reader of the reason the page containing the nib can not be pinned
  if the nib address is not valid. Response code 06 (invalid nib address)
  will be set in the status word returned to the guest from the
  instruction.

* Checks the return value from the kvm_s390_gisc_register and logs a
  message informing the reader of the failure. Status response code 08
  (invalid gisa) will be set in the status word returned to the guest from
  the PQAP(AQIC) instruction.

* Checks the status response code returned from execution of the PQAP(AQIC)
  instruction and if it indicates an error, logs a message informing the
  reader.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-02-06 23:31:29 +01:00
Tony Krowiak
68f554b7d2 s390/vfio-ap: add s390dbf logging to the handle_pqap function
This patch adds s390dbf logging to the function that handles interception
of the PQAP(AQIC) instruction. Several items of data are validated before
ultimately calling the functions that execute the PQAP(AQIC) instruction on
behalf of the guest to which the queue for which interrupts are being
enabled or disabled is attached.

Currently, the handle_pqap function sets status response code 01 (queue not
available) in the status word that is normally returned from the
PQAP(AQIC) instruction under the following conditions:

* Set when the function pointer to the handler is not set in the
  kvm_s390_crypto object (i.e., the PQAP hook is not registered).

* Set when the KVM pointer is not set in the ap_matrix_mdev object
  (i.e., the matrix mdev is not passed through to a guest).

* Set when the queue for which interrupts are being enabled or
  disabled is either not bound to the vfio_ap device driver or not assigned
  to the matrix mdev.

Setting the response code returned to userspace without also logging a
message in the kernel makes it impossible to determine whether the response
was due to an error detected by the vfio_ap device driver or because the
response code was returned by the firmware in response to the PQAP(AQIC)
instruction, so this patch logs a message to the s390dbf log for the
vfio_ap device driver for each of the situations described above.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-02-06 23:31:29 +01:00
Tony Krowiak
a084c44eaa s390-vfio-ap: introduces s390 kernel debug feature for vfio_ap device driver
Sets up an s390dbf debug log for the vfio_ap device driver for logging
events occurring during the lifetime of the driver.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2022-02-06 23:31:29 +01:00
Juergen Christ
cff2d3abc8 s390/zcrypt: CCA control CPRB sending
When sending a CCA CPRB to a control domain, the CPRB has to be sent via a
usage domain.  Previous code used the default domain to route this message.
If the default domain is not online and ready to send the CPRB, the ioctl will
fail even if other usage domains could be used to send the CPRB.

To improve this, instead of using the default domain, switch to auto-select of
the domain.

Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-16 19:58:07 +01:00
Tony Krowiak
f139862b92 s390/vfio-ap: add status attribute to AP queue device's sysfs dir
This patch adds a sysfs 'status' attribute to a queue device when it is
bound to the vfio_ap device driver. The field displays a string indicating
the status of the queue device:

Status String:  Indicates:
-------------   ---------
"assigned"      the queue is assigned to an mdev, but is not in use by a
                KVM guest.
"in use"        the queue is assigned to an mdev and is in use by a KVM
                guest.
"unassigned"    the queue is not assigned to an mdev.

The status string will be displayed by the 'lszcrypt' command if the queue
device is bound to the vfio_ap device driver.

Signed-off-by: Harald Freudenberger <freude@linux.ibm.com>
[akrowiak@linux.ibm.com: added check for queue in use by guest]
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-12-06 14:42:26 +01:00