With recent netfs apis changes, the bytes written
value was not getting updated in /proc/fs/cifs/Stats.
Fix this by updating tcon->bytes in write operations.
Fixes: 3ee1a1fc3981 ("cifs: Cut over to using netfslib")
Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
There is no such thing as CPACR_ELx in the architecture.
What we have is CPACR_EL1, for which CPTR_EL12 is an accessor.
Rename CPACR_ELx_* to CPACR_EL1_*, and fix the bit of code using
these names.
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241219173351.1123087-5-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Perform a bulk convert of the remaining EL12 accessors to use the
Mapping qualifier, which makes things a bit clearer.
Reviewed-by: Mark Brown <broonie@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241219173351.1123087-4-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
TCR2_EL1x is a pretty bizarre construct, as it is shared between
TCR2_EL1 and TCR2_EL12. But the latter is obviously only an
accessor to the former.
In order to make things more consistent, upgrade TCR2_EL1x to
a full-blown sysreg definition for TCR2_EL1, and describe TCR2_EL12
as a mapping to TCR2_EL1.
This results in a couple of minor changes to the actual code.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241219173351.1123087-3-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
*EL02 and *_EL12 system registers are actually only accessors for
EL0 and EL1 registers accessed from EL2 when HCR_EL2.E2H==1. They
do not have fields of their own.
To that effect, introduce a 'Mapping' entry, describing which
system register an _EL12 register maps to.
Implementation wise, this is handled the same was as Fields,
which ls only a comment.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241219173351.1123087-2-maz@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
There is a spelling mistake in an ath12k_err error message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217105505.306047-1-colin.i.king@gmail.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently, monitor status parse procedure handles all the supported TLV
tags. Each TLV tag has its own data structure for parsing. Now, this
handler is passed the tlv_data as a u8 pointer, so explicit type cast
conversion happens for every TLV tag parsing. Therefore, avoid the
explicit type conversion by changing the tlv_data type from a u8 pointer
to a const void pointer.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-9-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
The Tx monitor SRNG ring ID does not align with the ath12k 802.11be
hardware architecture. Currently, there is no issue since the Tx monitor
is not enabled. However, in the future, the Tx monitor will be enabled.
Therefore, change the HAL_SRNG_RING_ID_WMAC1_SW2TXMON_BUF0 SRNG ID and
assign the correct start ring ID for the ring type HAL_TX_MONITOR_BUF.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-8-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently, CODING and TXBF are unused masks defined in the HAL Rx monitor
status TLV parsing code path. Therefore, remove the unused masks to
prevent incorrect assumptions for code readers.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-7-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently, an incorrect TID value gets populated in the monitor status Rx
path due to an incorrect bitmap value given to the ffs() built-in helper
function. Therefore, avoid the decrement and directly provide the TID
bitmap to the ffs() built-in helper function for the correct TID update
in the monitor status Rx path.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-6-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
There is "HAL_PHYRX_GENERICHT_SIG" misspelled as
"HAL_PHYRX_GENERIC_EHT_SIG" in the comments. Fix the spelling.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-5-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently, unused fields are present in the Rx peer statistics
structure. These fields are already present in the same structure
under the ath12k_rx_peer_rate_stats container structure. Therefore,
remove the unused fields from the Rx peer statistics structure.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-4-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
The following TLV structures and bitmask definitions were inherited from
the ath11k but were not updated for the ath12k 802.11be hardware. These
data structure and bitmask will be used to parse the monitor status
TLV data in the Rx path.
1. hal_rx_ppdu_end_user_stats_ext structure
2. hal_rx_ppdu_end_duration structure
3. HAL_RX_HE_SIG_B2_OFDMA_INFO_INFO0_STA_TXBF bitmask
4. HAL_RX_MPDU_START_INFO1_PEERID bitmask
5. HAL_INVALID_PEERID
6. hal_rx_ppdu_end_user_stats bitmask
Currently, there is no issue since the monitor status Rx path is not
enabled. However, in the future, the monitor status Rx path will be
enabled. Therefore, update the above TLV structures and bitmask to align
with the ath12k 802.11be hardware.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-3-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Add missing field documentation for HTT_H2T_MSG_TYPE_RX_RING_SELECTION_CFG
command with indentation alignment.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Karthikeyan Periyasamy <quic_periyasa@quicinc.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241217084511.2981515-2-quic_periyasa@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Building the ath12k driver with llvm-18.1.7-x86_64 produces the warning:
drivers/net/wireless/ath/ath12k/mac.c:5606:12: warning: stack frame size (1176) exceeds limit (1024) in 'ath12k_mac_op_sta_state' [-Wframe-larger-than]
ath12k_mac_op_sta_state() itself does not consume much stack, but it
calls ath12k_mac_handle_link_sta_state() which in turn calls
ath12k_mac_station_add(). Since those are both static functions with
only one caller, it is suspected that these both get inlined, and
their stack usage is reported for ath12k_mac_op_sta_state().
A major contributor to the ath12k_mac_station_assoc() stack usage is:
struct ath12k_wmi_peer_assoc_arg peer_arg;
Avoid the excess stack usage by dynamically allocating peer_arg
instead of declaring it on the stack.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20241217202618.1329312-5-kvalo@kernel.org
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently when building ath12k with llvm-18.1.7-x86_64 the following warning is
observed:
drivers/net/wireless/ath/ath12k/mac.c:4946:13: warning: stack frame size (1112) exceeds limit (1024) in 'ath12k_sta_rc_update_wk' [-Wframe-larger-than]
A major contributor to the stack usage in this function is:
struct ath12k_wmi_peer_assoc_arg peer_arg;
Avoid the excess stack usage by dynamically allocating peer_arg
instead of declaring it on the stack.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20241217202618.1329312-4-kvalo@kernel.org
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently when building ath12k with gcc-14.2.0 the following warning
is observed:
drivers/net/wireless/ath/ath12k/mac.c: In function 'ath12k_bss_assoc':
drivers/net/wireless/ath/ath12k/mac.c:3080:1: warning: the frame size of 1040 bytes is larger than 1024 bytes [-Wframe-larger-than=]
A major contributor to the stack usage in this function is:
struct ath12k_wmi_peer_assoc_arg peer_arg;
Avoid the excess stack usage by dynamically allocating peer_arg
instead of declaring it on the stack.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20241217202618.1329312-3-kvalo@kernel.org
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Building the ath12k driver with llvm-18.1.7-x86_64 produces the warning:
drivers/net/wireless/ath/ath12k/mac.c:10028:12: warning: stack frame size (1080) exceeds limit (1024) in 'ath12k_mac_op_remain_on_channel' [-Wframe-larger-than]
A major contributor to the stack usage in this function is:
struct ath12k_wmi_scan_req_arg arg;
Avoid the excess stack usage by dynamically allocating arg instead of
declaring it on the stack. As part of the effort use __free() for both
this new allocation as well as the existing chan_list allocation, and
since then no central cleanup is required, replace all cleanup gotos
with returns.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://patch.msgid.link/20241217202618.1329312-2-kvalo@kernel.org
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
There is mismatch between the format of monitor destination TLVs received
and the expected format by the current implementation. The received TLVs
are in 64-bit format, while the implementation is designed to handle
32-bit TLVs. This leads to incorrect parsing. Fix it by adding support
for parsing 64-bit TLVs.
Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.3.1-00173-QCAHKSWPL_SILICONZ-1
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
Signed-off-by: P Praneesh <quic_ppranees@quicinc.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Acked-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Link: https://patch.msgid.link/20241217095058.2725755-1-quic_ppranees@quicinc.com
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Prefer 'ktime_t' over 'struct timespec64' for 'struct ath_chanctx' and
'struct ath_softc' timestamps, choose standard kernel time API over an
ad-hoc math in 'chanctx_event_delta()' and 'ath9k_hw_get_tsf_offset()',
adjust related users. Compile tested only.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20241209155027.636400-3-dmantipov@yandex.ru
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Since 'txq' argument of 'ath_txq_skb_done()' is actually
(mis|un)used, convert the former to local variable and
adjust all related users. Compile tested only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20241209155027.636400-1-dmantipov@yandex.ru
Signed-off-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Currently, the maximum supported physical address space can be
configured as either 48 bits or 52 bits. The only remaining difference
between these in practice is that the former omits the masking and
shifting required to construct TTBR and PTE values, which carry bits #48
and higher disjoint from the rest of the physical address.
The overhead of performing these additional calculations is negligible,
and so there is little reason to retain support for two different
configurations, and we can simply support whatever the hardware
supports.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-14-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
There are a couple of instances of Kconfig constraints where PAN must be
enabled too if TTBR0 sw PAN is enabled, primarily to avoid dealing with
the modified TTBR0_EL1 sysreg format that is used when 52-bit physical
addressing and/or CnP are enabled (support for either implies support
for hardware PAN as well, which will supersede PAN emulation if both are
available)
Let's simplify this, and always enable ARM64_PAN when enabling TTBR0 sw
PAN. This decouples the PAN configuration from the VA size selection,
permitting us to simplify the latter in subsequent patches. (Note that
PAN and TTBR0 sw PAN can still be disabled after this patch, but not
independently)
To avoid a convoluted circular Kconfig dependency involving KCSAN, make
ARM64_MTE select ARM64_PAN too, instead of depending on it.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-13-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
The pKVM stage2 mapping code relies on an invalid physical address to
signal to the internal API that only the annotations of descriptors
should be updated, and these are stored in the high bits of invalid
descriptors covering memory that has been donated to protected guests,
and is therefore unmapped from the host stage-2 page tables.
Given that these invalid PAs are never stored into the descriptors, it
is better to rely on an explicit flag, to clarify the API and to avoid
confusion regarding whether or not the output address of a descriptor
can ever be invalid to begin with (which is not the case with LPA2).
That removes a dependency on the logic that reasons about the maximum PA
range, which differs on LPA2 capable CPUs based on whether LPA2 is
enabled or not, and will be further clarified in subsequent patches.
Cc: Quentin Perret <qperret@google.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-12-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
When the host stage1 is configured for LPA2, the value currently being
programmed into TCR_EL2.T0SZ may be invalid unless LPA2 is configured
at HYP as well. This means kvm_lpa2_is_enabled() is not the right
condition to test when setting TCR_EL2.DS, as it will return false if
LPA2 is only available for stage 1 but not for stage 2.
Similary, programming TCR_EL2.PS based on a limited IPA range due to
lack of stage2 LPA2 support could potentially result in problems.
So use lpa2_is_enabled() instead, and set the PS field according to the
host's IPS, which is capped at 48 bits if LPA2 support is absent or
disabled. Whether or not we can make meaningful use of such a
configuration is a different question.
Cc: stable@vger.kernel.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-11-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
When FEAT_LPA{,2} are not implemented, the ID_AA64MMFR0_EL1.PARange and
TCR.IPS values corresponding with 52-bit physical addressing are
reserved.
Setting the TCR.IPS field to 0b110 (52-bit physical addressing) has side
effects, such as how the TTBRn_ELx.BADDR fields are interpreted, and so
it is important that disabling FEAT_LPA2 (by overriding the
ID_AA64MMFR0.TGran fields) also presents a PARange field consistent with
that.
So limit the field to 48 bits unless LPA2 is enabled, and update
existing references to use the override consistently.
Fixes: 352b0395b505 ("arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs")
Cc: stable@vger.kernel.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-10-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Currently, LPA2 kernel support implies support for up to 52 bits of
physical addressing, and this is reflected in global definitions such as
PHYS_MASK_SHIFT and MAX_PHYSMEM_BITS.
This is potentially problematic, given that LPA2 hardware support is
modeled as a CPU feature which can be overridden, and with LPA2 hardware
support turned off, attempting to map physical regions with address bits
[51:48] set (which may exist on LPA2 capable systems booting with
arm64.nolva) will result in corrupted mappings with a truncated output
address and bogus shareability attributes.
This means that the accepted physical address range in the mapping
routines should be at most 48 bits wide when LPA2 support is configured
but not enabled at runtime.
Fixes: 352b0395b505 ("arm64: Enable 52-bit virtual addressing for 4k and 16k granule configs")
Cc: stable@vger.kernel.org
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241212081841.2168124-9-ardb+git@google.com
Signed-off-by: Will Deacon <will@kernel.org>
This documents EL3 requirements for FEAT_PMUv3. The register field MDCR_EL3
.TPM needs to be cleared for accesses into PMU registers without any trap
being generated into EL3. PMUv3 registers like PMCCFILTR_EL0, PMCCNTR_EL0
PMCNTENCLR_EL0, PMCNTENSET_EL0, PMCR_EL0, PMEVCNTR<n>_EL0, PMEVTYPER<n>_EL0
etc are already being accessed for perf HW PMU implementation.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20241211065425.1106683-3-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
This documents EL3 requirements for debug architecture. The register field
MDCR_EL3.TDA needs to be cleared for accesses into debug registers without
any trap being generated into EL3. CPU debug registers like DBGBCR<n>_EL1,
DBGBVR<n>_EL1, DBGWCR<n>_EL1, DBGWVR<n>_EL1 and MDSCR_EL1 are already being
accessed for HW breakpoint, watchpoint and debug monitor implementations on
the platform.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20241211065425.1106683-2-anshuman.khandual@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Fabrice Gasnier found and fixed a regression I introduced with v6.13-rc1
when converting the stm32 pwm driver to support the new waveform stuff.
On some hardware variants this completely broke the driver.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEP4GsaTp6HlmJrf7Tj4D7WH0S/k4FAmdj+fAACgkQj4D7WH0S
/k4I3QgAr4NqinQm3WrdNG2bTzTbd0TkBZDjm42l8A++v0ax+tqNlbh8SyF0jIw3
mlC/SSpAZGZJV8+uyJrjR4324DGLhGm0hfZxb5ijhbB0lI/VWmS8DorZGTmhplzX
mtfML9DL/mAfxqXSIhxaZgSnFfnajrx+8Gr7TqLaUUxXMt+7bWvIRxi2zNS+H9DR
acd2DJGxAG7hTHWyR/F8YYudQbZh/JNI29WRmDaf4ZzJCNe6/qFTd28tFOQ7MP0+
Gr6bhnnfZaP/2Mg1OJpHo+vG8d1bJXnnmwlPAmR4hccT94wtss5AsN6wzBlxXptW
x51wDv0hld/rz0Z8ojOqz1CY3PMdTw==
=i/KE
-----END PGP SIGNATURE-----
Merge tag 'pwm/for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fix from Uwe Kleine-König:
"Fix regression in pwm-stm32 driver when converting to new waveform
support
Fabrice Gasnier found and fixed a regression I introduced with
v6.13-rc1 when converting the stm32 pwm driver to support the new
waveform stuff. On some hardware variants this completely broke the
driver"
* tag 'pwm/for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux:
pwm: stm32: Fix complementary output in round_waveform_tohw()
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmdi9NcACgkQiiy9cAdy
T1EBZwv+JAGDnHiDVm4FuYg4Q0jNHjmSoJ4HkiSNSB/hKtdmA9q6xHJ76Nfrbzig
1EczCe9vAf5i9yYs4acPTT0MOK9lCyarFCYY02Dk3+/JPuyFLC5VkjplvryMGeXm
9Py07XSIFcx3EZ1Ws597OMVvTtdTzRp8EXGt6xoMmqgQhma7ggSgvXCmoC4nCf/S
I2CMGx3ieeF5aPiosf8YEAjn4+MZJr4CI5hut5XZYgxND27QiB5gdPMSiHReANyi
hBF8u2mwyayztlEkknmU/7TMSwJiO3zsCMLU/wECJaZgDBb5LcuWS0f9BmI8eCgj
6gk+aX6aQ6Xnx0rNdl5ezq/lQ2/j6hv2e542KQYZwX9d8wKQbSye/OLvpGxlTY6i
lir2XP1W1zVAz2mAhbOw9PbAKlI24cHZamGWhGp3mkMdNddslYxJ+iAKnX6z0cv+
KkPEWET4/uyTySygBPnAOMfhTzqMiuwvPswwjYr3TXugQhArnPtOoj1lVLXPArFx
CqJMrnAy
=vUtk
-----END PGP SIGNATURE-----
Merge tag 'v6.13-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Two fixes for better handling maximum outstanding requests
- Fix simultaneous negotiate protocol race
* tag 'v6.13-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: conn lock to serialize smb2 negotiate
ksmbd: fix broken transfers when exceeding max simultaneous operations
ksmbd: count all requests in req_running counter
If a PCIe port only supports a single speed, enabling bandwidth control
is pointless: There's no need to monitor autonomous speed changes, nor
can the speed be changed.
Not enabling it saves a small amount of memory and compute resources,
but also fixes a boot hang reported by Niklas: It occurs when enabling
bandwidth control on Downstream Ports of Intel JHL7540 "Titan Ridge 2018"
Thunderbolt controllers. The ports only support 2.5 GT/s in accordance
with USB4 v2 sec 11.2.1, so the present commit works around the issue.
PCIe r6.2 sec 8.2.1 prescribes that:
"A device must support 2.5 GT/s and is not permitted to skip support
for any data rates between 2.5 GT/s and the highest supported rate."
Consequently, bandwidth control is currently only disabled if a port
doesn't support higher speeds than 2.5 GT/s. However the Implementation
Note in PCIe r6.2 sec 7.5.3.18 cautions:
"It is strongly encouraged that software primarily utilize the
Supported Link Speeds Vector instead of the Max Link Speed field,
so that software can determine the exact set of supported speeds on
current and future hardware. This can avoid software being confused
if a future specification defines Links that do not require support
for all slower speeds."
In other words, future revisions of the PCIe Base Spec may allow gaps
in the Supported Link Speeds Vector. To be future-proof, don't just
check whether speeds above 2.5 GT/s are supported, but rather check
whether *more than one* speed is supported.
Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Closes: https://lore.kernel.org/r/db8e457fcd155436449b035e8791a8241b0df400.camel@kernel.org
Link: https://lore.kernel.org/r/3564908a9c99fc0d2a292473af7a94ebfc8f5820.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle <niks@kernel.org>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jonathan Cameron <Jonthan.Cameron@huawei.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
The Supported Link Speeds Vector in the Link Capabilities 2 Register
indicates the *supported* link speeds. The Max Link Speed field in the
Link Capabilities Register indicates the *maximum* of those speeds.
pcie_get_supported_speeds() neglects to honor the Max Link Speed field and
will thus incorrectly deem higher speeds as supported. Fix it.
One user-visible issue addressed here is an incorrect value in the sysfs
attribute "max_link_speed".
But the main motivation is a boot hang reported by Niklas: Intel JHL7540
"Titan Ridge 2018" Thunderbolt controllers supports 2.5-8 GT/s speeds,
but indicate 2.5 GT/s as maximum. Ilpo recalls seeing this on more
devices. It can be explained by the controller's Downstream Ports
supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s
if the port interfaces to a PCIe Adapter, in accordance with USB4 v2
sec 11.2.1:
"This section defines the functionality of an Internal PCIe Port that
interfaces to a PCIe Adapter. [...]
The Logical sub-block shall update the PCIe configuration registers
with the following characteristics: [...]
Max Link Speed field in the Link Capabilities Register set to 0001b
(data rate of 2.5 GT/s only).
Note: These settings do not represent actual throughput. Throughput
is implementation specific and based on the USB4 Fabric performance."
The present commit is not sufficient on its own to fix Niklas' boot hang,
but it is a prerequisite: A subsequent commit will fix the boot hang by
enabling bandwidth control only if more than one speed is supported.
The GENMASK() macro used herein specifies 0 as lowest bit, even though
the Supported Link Speeds Vector ends at bit 1. This is done on purpose
to avoid a GENMASK(0, 1) macro if Max Link Speed is zero. That macro
would be invalid as the lowest bit is greater than the highest bit.
Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
Endpoints in particular, so it does occur in practice. (The Link
Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)
Fixes: d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds")
Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org
Link: https://lore.kernel.org/r/fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle <niks@kernel.org>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Add a more advanced BAR test that writes all BARs in one go, and then reads
them back and verifies that the value matches the BAR number bitwise OR'ed
with offset, this allows us to verify:
- The BAR number was what we intended to read
- The offset was what we intended to read
This allows us to detect potential address translation issues on the EP.
Reading back the BAR directly after writing will not allow us to detect the
case where inbound address translation on the endpoint incorrectly causes
multiple BARs to be redirected to the same memory region (within the EP).
Link: https://lore.kernel.org/r/20241116032045.2574168-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Most boards using the pcie-dw-rockchip PCIe controller lack standard
hotplug support.
Thus, when an endpoint is attached to the SoC, users have to rescan the bus
manually to enumerate the device. This can be avoided by using the
'dll_link_up' interrupt in the combined system interrupt 'sys'.
Once the 'dll_link_up' irq is received, the bus underneath the host bridge
is scanned to enumerate PCIe endpoint devices.
This commit implements the same functionality that was implemented in the
DWC based pcie-qcom driver in commit 4581403f6792 ("PCI: qcom: Enumerate
endpoints based on Link up event in 'global_irq' interrupt").
The Root Complex specific device tree binding for pcie-dw-rockchip already
has the 'sys' interrupt marked as required, so there is no need to update
the device tree binding. This also means that we can request the 'sys' IRQ
unconditionally.
Link: https://lore.kernel.org/r/20241127145041.3531400-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
The test BAR is on the EP side is allocated using pci_epf_alloc_space(),
which allocates the backing memory using dma_alloc_coherent(), which will
return zeroed memory regardless of __GFP_ZERO was set or not.
This means that running a new version of pci-endpoint-test.c (host side)
with an old version of pci-epf-test.c (EP side) will not see any
capabilities being set (as intended), so this is backwards compatible.
Additionally, the EP side always allocates at least 128 bytes for the test
BAR (excluding the MSI-X table), this means that adding another register at
offset 0x30 is still within the 128 available bytes.
For now, we only add the CAP_UNALIGNED_ACCESS capability.
If CAP_UNALIGNED_ACCESS is set, that means that the EP side supports
reading/writing to an address without any alignment requirements.
Thus, if CAP_UNALIGNED_ACCESS is set, make sure that the host side does
not add any extra padding to the buffers that we allocate (which was only
done in order to get the buffers to satisfy certain alignment requirements
by the endpoint controller).
Link: https://lore.kernel.org/r/20241203063851.695733-6-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
The test BAR is on the EP side is allocated using pci_epf_alloc_space(),
which allocates the backing memory using dma_alloc_coherent(), which will
return zeroed memory regardless of __GFP_ZERO was set or not.
This means that running a new version of pci-endpoint-test.c (host side)
with an old version of pci-epf-test.c (EP side) will not see any
capabilities being set (as intended), so this is backwards compatible.
Additionally, the EP side always allocates at least 128 bytes for the test
BAR (excluding the MSI-X table), this means that adding another register at
offset 0x30 is still within the 128 available bytes.
For now, we only add the CAP_UNALIGNED_ACCESS capability.
Set CAP_UNALIGNED_ACCESS if the EPC driver can handle any address (because
it implements the .align_addr callback).
Link: https://lore.kernel.org/r/20241203063851.695733-5-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
After commit
746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
amdgpu started seeing the following warning:
[ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
...
[ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
...
[ ] Call Trace:
[ ] <TASK>
...
[ ] ? check_flush_dependency+0xf5/0x110
...
[ ] cancel_delayed_work_sync+0x6e/0x80
[ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
[ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
[ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
[ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
[ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
[ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
[ ] process_one_work+0x217/0x720
...
[ ] </TASK>
The intent of the verifcation done in check_flush_depedency is to ensure
forward progress during memory reclaim, by flagging cases when either a
memory reclaim process, or a memory reclaim work item is flushed from a
context not marked as memory reclaim safe.
This is correct when flushing, but when called from the
cancel(_delayed)_work_sync() paths it is a false positive because work is
either already running, or will not be running at all. Therefore
cancelling it is safe and we can relax the warning criteria by letting the
helper know of the calling context.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: fca839c00a12 ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue")
References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Tejun Heo <tj@kernel.org>
Before, every time fdinfo is queried we try to lock all the BOs in the
VM and calculate memory usage from scratch. This works okay if the
fdinfo is rarely read and the VMs don't have a ton of BOs. If either of
these conditions is not true, we get a massive performance hit.
In this new revision, we track the BOs as they change states. This way
when the fdinfo is queried we only need to take the status lock and copy
out the usage stats with minimal impact to the runtime performance. With
this new approach however, we would no longer be able to track active
buffers.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-6-Yunxiang.Li@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
amdgpu_vm_bo_invalidate doesn't use the adev parameter and not all
callers have a reference to adev handy, so remove it for cleanliness.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-5-Yunxiang.Li@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Define how to handle buffers with multiple possible placement so we
don't get incompatible implementations. Callout the resident requirement
for drm-purgeable- explicitly. Remove the requirement for there to be
only drm-memory- or only drm-resident-, it's not what's implemented and
having both is better for back-compat. Also re-order the paragraphs to
flow better.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-4-Yunxiang.Li@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
When memory stats is generated fresh everytime by going though all the
BOs, their active information is quite easy to get. But if the stats are
tracked with BO's state this becomes harder since the job scheduling
part doesn't really deal with individual buffers.
Make drm-active- optional to enable amdgpu to switch to the second
method.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-3-Yunxiang.Li@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Add a helper to check if the memory stats is zero, this will be used to
check for memory accounting errors.
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241219151411.1150-2-Yunxiang.Li@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
The imx93 is the first supported DDR PMU that supports read transaction,
write transaction and read beats events which corresponding respecitively
to counter 2, 3 and 4.
However, transaction-based AXI match has low accuracy when get total bits
compared to beats-based. And imx93 doesn't assign AXI_ID to each master.
So axi filter is not used widely on imx93. This could be regards as AXI
filter version 1.
To improve the AXI filter capability, imx95 supports 1 read beats and 3
write beats event which corresponding respecitively to counter 2-5. imx95
also detailed AXI_ID allocation so that most of the master could be count
individually. This could be regards as AXI filter version 2.
This will introduce AXI filter version to refactor the driver and support
better extension, such as coming imx943. This is also a potential fix on
imx91 when configure axi filter.
Fixes: 44798fe136dc ("perf: imx_perf: add support for i.MX91 platform")
Cc: stable@vger.kernel.org
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20241212065708.1353513-1-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
Remove the redundant .hwapic_irr_update() ops.
If a vCPU has APICv enabled, KVM updates its RVI before VM-enter to L1
in vmx_sync_pir_to_irr(). This guarantees RVI is up-to-date and aligned
with the vIRR in the virtual APIC. So, no need to update RVI every time
the vIRR changes.
Note that KVM never updates vmcs02 RVI in .hwapic_irr_update() or
vmx_sync_pir_to_irr(). So, removing .hwapic_irr_update() has no
impact to the nested case.
Signed-off-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/r/20241111085947.432645-1-chao.gao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Move the handling of a nested posted interrupt notification that is
unblocked by nested VM-Enter (unblocks L1 IRQs when ack-on-exit is enabled
by L1) from VM-Enter emulation to vmx_check_nested_events(). To avoid a
pointless forced immediate exit, i.e. to not regress IRQ delivery latency
when a nested posted interrupt is pending at VM-Enter, block processing of
the notification IRQ if and only if KVM must block _all_ events. Unlike
injected events, KVM doesn't need to actually enter L2 before updating the
vIRR and vmcs02.GUEST_INTR_STATUS, as the resulting L2 IRQ will be blocked
by hardware itself, until VM-Enter to L2 completes.
Note, very strictly speaking, moving the IRQ from L2's PIR to IRR before
entering L2 is still technically wrong. But, practically speaking, only
an L1 hypervisor or an L0 userspace that is deliberately checking event
priority against PIR=>IRR processing can even notice; L2 will see
architecturally correct behavior, as KVM ensures the VM-Enter is finished
before doing anything that would effectively preempt the PIR=>IRR movement.
Reported-by: Chao Gao <chao.gao@intel.com>
Link: https://lore.kernel.org/r/20241101191447.1807602-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>