Commit Graph

3096 Commits

Author SHA1 Message Date
Krzysztof Kozlowski
bc55630c65 thermal/drivers/qcom-tsens: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-10-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:41 +02:00
Krzysztof Kozlowski
ecfee9176b thermal/drivers/qcom-spmi-adc-tm5: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-9-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:41 +02:00
Krzysztof Kozlowski
d0b297e76b thermal/drivers/imx: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-8-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
e9ac90242b thermal/drivers/imx: Simplify probe() with local dev variable
Simplify the probe() function by using local 'dev' instead of
&pdev->dev.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-7-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
3e1a0680bb thermal/drivers/hisi: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-6-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
ca6176693f thermal/drivers/exynos: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-5-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
4a6cf76edf thermal/drivers/exynos: Simplify probe() with local dev variable
Simplify the probe() function by using local 'dev' instead of
&pdev->dev.  While touching devm_kzalloc(), use preferred sizeof(*)
syntax.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-4-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
9d55cb3ba3 thermal/drivers/broadcom: Simplify with dev_err_probe()
Error handling in probe() can be a bit simpler with dev_err_probe().

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-3-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
fd972a1745 thermal/drivers/broadcom: Simplify probe() with local dev variable
Simplify the probe() function by using local 'dev' instead of
&pdev->dev.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-2-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Krzysztof Kozlowski
e90c369cc2 thermal/drivers/broadcom: Fix race between removal and clock disable
During the probe, driver enables clocks necessary to access registers
(in get_temp()) and then registers thermal zone with managed-resources
(devm) interface.  Removal of device is not done in reversed order,
because:
1. Clock will be disabled in driver remove() callback - thermal zone is
   still registered and accessible to users,
2. devm interface will unregister thermal zone.

This leaves short window between (1) and (2) for accessing the
get_temp() callback with disabled clock.

Fix this by enabling clock also via devm-interface, so entire cleanup
path will be in proper, reversed order.

Fixes: 8454c8c09c ("thermal/drivers/bcm2835: Remove buggy call to thermal_of_zone_unregister")
Cc: stable@vger.kernel.org
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240709-thermal-probe-v1-1-241644e2b6e0@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:40 +02:00
Chen-Yu Tsai
a5d4afb92e thermal/drivers/mediatek/lvts_thermal: Provide default calibration data
On some pre-production hardware, the SoCs do not contain calibration
data for the thermal sensors. The downstream drivers provide default
values that sort of work, instead of having the thermal sensors not
work at all.

Port the default values to the upstream driver. These values are from
the ChromeOS kernels, which sadly do not cover the MT7988.

Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240620092306.2352606-1-wenst@chromium.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:39 +02:00
Julien Panis
be3e224ec5 dt-bindings: thermal: mediatek: Fix thermal zone definitions for MT8188
Fix thermal zone names for consistency with the other SoCs:
- GPU0 must be used as the first GPU item.
- SOCx deal with audio DSP, video, and infra subsystems.

The naming must be fixed "atomically" so compilation does not break.
As a result, the change is made in the dt-bindings and in the LVTS
driver within a single commit, despite the checkpatch warning.

The definitions can be safely modified here because they are used only
in the LVTS driver, which is modified accordingly, and have not yet
been included in a released kernel.

Fixes: 78c88534e5 ("dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for MT8188")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Julien Panis <jpanis@baylibre.com>
Link: https://lore.kernel.org/r/20240603-mtk-thermal-mt818x-dtsi-v7-2-8c8e3c7a3643@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:39 +02:00
Julien Panis
6b04928e83 dt-bindings: thermal: mediatek: Fix thermal zone definition for MT8186
Fix a thermal zone name for consistency with the other SoCs:
MFG contains GPU, the latter is more specific and must be used here.

The naming must be fixed "atomically" so compilation does not break.
As a result, the change is made in the dt-bindings and in the LVTS
driver within a single commit, despite the checkpatch warning.

The definition can be safely modified here because it is used only
in the LVTS driver, which is modified accordingly, and has not yet
been included in a released kernel.

Fixes: a2ca202350 ("dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for MT8186")
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Julien Panis <jpanis@baylibre.com>
Link: https://lore.kernel.org/r/20240603-mtk-thermal-mt818x-dtsi-v7-1-8c8e3c7a3643@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:39 +02:00
Théo Lebrun
854a8e208c thermal/drivers/k3_j72xx_bandgap: Implement suspend/resume support
This add suspend-to-ram support.

The derived_table is kept-as is, so the resume is only about
pm_runtime_* calls and restoring the same registers as the probe.

Extract the hardware initialization procedure to a function called at
both probe-time & resume-time.

The probe-time loop is split in two to ensure doing the hardware
initialization before registering thermal zones. That ensures our
callbacks cannot be called while in bad state.

The 100ms delay in the hardware initialization sequence was removed.
It was initially added to be sure the thresholds are programmed before
enabling the interrupt, but in fact it's not needed (tested on J7200
platform).

Signed-off-by: Théo Lebrun <theo.lebrun@bootlin.com>
Acked-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Thomas Richard <thomas.richard@bootlin.com>
Link: https://lore.kernel.org/r/20240425153238.498750-1-thomas.richard@bootlin.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-07-15 13:31:39 +02:00
Niklas Söderlund
f996e2b17a thermal/drivers/renesas/rcar: Add dependency on OF
The R-Car thermal driver depends on OF, describe this.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240506154011.344324-3-niklas.soderlund+renesas@ragnatech.se
2024-07-15 13:31:39 +02:00
Niklas Söderlund
9d617949d4 thermal/drivers/renesas: Group all renesas thermal drivers together
Move all Renesas thermal drivers to a vendor specific directory.

All drivers are moved verbatim apart from the updated include path for
thermal_hwmon.h.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240506154011.344324-2-niklas.soderlund+renesas@ragnatech.se
2024-07-15 13:31:38 +02:00
Rafael J. Wysocki
3669716401 thermal: core: Add sanity checks for polling_delay and passive_delay
If polling_delay is nonzero and passive_delay is greater than
polling_delay, the thermal zone temperature will be updated less
often when tz->passive is nonzero, which is not as expected.  Make
the thermal zone registration fail with -EINVAL in that case as
this is a clear thermal zone configuration mistake.

If polling_delay is nonzero and passive_delay is 0, which is regarded
as a valid thermal zone configuration, the thermal zone will use polling
except when tz->passive is nonzero.  However, the expected behavior in
that case is to continue temperature polling with the same delay value
regardless of tz->passive, so set passive_delay to the polling_delay
value then.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/5802156.DvuYhMxLoT@rjwysocki.net
2024-07-12 15:14:57 +02:00
Rafael J. Wysocki
5b674baa59 thermal: trip: Fold __thermal_zone_get_trip() into its caller
Because __thermal_zone_get_trip() is only called by thermal_zone_get_trip()
now, fold the former into the latter.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/22339769.EfDdHjke4D@rjwysocki.net
2024-07-12 15:14:56 +02:00
Rafael J. Wysocki
0728c81087 thermal: trip: Pass trip pointer to .set_trip_temp() thermal zone callback
Out of several drivers implementing the .set_trip_temp() thermal zone
operation, three don't actually use the trip ID argument passed to it,
two call __thermal_zone_get_trip() to get a struct thermal_trip
corresponding to the given trip ID, and the other use the trip ID as an
index into their own data structures with the assumption that it will
always match the ordering of entries in the trips table passed to the
core during thermal zone registration, which is fragile and not really
guaranteed.

Even though the trip IDs used by the core are in fact their indices in the
trips table passed to it by the thermal zone creator, that is purely a
matter of convenience and should not be relied on for correctness.

For this reason, modify trip_point_temp_store() to pass a (const) trip
pointer to .set_trip_temp() and adjust the drivers implementing it
accordingly.

This helps to simplify the drivers invoking __thermal_zone_get_trip()
from their .set_trip_temp() callback functions because they will not
need to do it now and the other drivers can store their internal
trip indices in the priv field in struct thermal_trip and their
.set_trip_temp() callback functions can get those indices from there.

The intel_quark_dts thermal driver can instead use the trip type to
determine the requisite trip index.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/8392906.T7Z3S40VBb@rjwysocki.net
[ rjw: Add missing colon and 2 empty code lines ]
[ rjw: Add missing change in imx_thermal.c and adjust the changelog ]
[ rjw: Drop an unused local variable ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-12 15:14:01 +02:00
Rafael J. Wysocki
81caa5d519 thermal: imx: Drop critical trip check from imx_set_trip_temp()
Because the IMX thermal driver does not flag its critical trip as
writable, imx_set_trip_temp() will never be invoked for it and so the
critical trip check can be dropped from there.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2272035.iZASKD2KPV@rjwysocki.net
2024-07-10 13:33:33 +02:00
Rafael J. Wysocki
462be1c353 Merge back thermal control material for 6.11. 2024-07-10 13:01:38 +02:00
Zhang Rui
b755367602 thermal: intel: hfi: Give HFI instances package scope
The Intel Software Developer's Manual defines the scope of HFI (registers
and memory buffer) as a package. Use package scope(*) in the software
representation of an HFI instance.

Using die scope in HFI instances has the effect of creating multiple
conflicting instances for the same package: each instance allocates its
own memory buffer and configures the same package-level registers.
Specifically, only one of the allocated memory buffers can be set in the
MSR_IA32_HW_FEEDBACK_PTR register. CPUs get incorrect HFI data from the
table.

The problem does not affect current HFI-capable platforms because they
all have single-die processors.

(*) We used die scope for HFI instances because there had been
    processors with packages enumerated as dies. None of those systems
    supported HFI, though. If such a system emerged, it would need to
    be quirked.

Co-developed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://patch.msgid.link/20240703055445.125362-1-rui.zhang@intel.com
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-09 18:29:07 +02:00
Rafael J. Wysocki
d1fbf18a0f thermal: trip: Add conversion macros for thermal trip priv field
Some drivers will need to store integers in the priv field of struct
thermal_trip, so add conversion macros for doing this in a consistent
way and switch over the int340x_thermal driver that already does it and
uses custom conversion functions to using the new macros.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/3297884.aeNJFYEL58@rjwysocki.net
2024-07-09 18:22:59 +02:00
Rafael J. Wysocki
463b86fed2 thermal: helpers: Introduce thermal_trip_is_bound_to_cdev()
Introduce a new helper function thermal_trip_is_bound_to_cdev() for
checking whether or not a given trip point has been bound to a given
cooling device.

The primary user of it will be the Tegra thermal driver.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/13545762.uLZWGnKmhe@rjwysocki.net
2024-07-09 18:22:59 +02:00
Rafael J. Wysocki
d05374dee2 thermal: core: Change passive_delay and polling_delay data type
It is better to use unsigned int as the data type for the passive_delay
and polling_delay arguments of thermal_zone_device_register_with_trips()
because they are implicitly cast to unsigned int anyway in
thermal_set_delay_jiffies() and if they happen to be negative at that
point, the resulting behavior may not be as desired.

Update the thermal_zone_device_register_with_trips() definition
accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://patch.msgid.link/5803791.DvuYhMxLoT@rjwysocki.net
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-09 18:19:24 +02:00
Rafael J. Wysocki
94eacc1c58 thermal: core: Fix list sorting in __thermal_zone_device_update()
The order in which lists are sorted in __thermal_zone_device_update()
is reverse with respect to what it should be due to a mistake in
thermal_trip_notify_cmp().

Fix it and observe that it is not necessary to sort the lists in
different orders.  They can both be sorted in ascending order if
way_down_list is walked in reverse order which allows the code to
be slightly more straightforward (and less prone to silly mistakes).

Fixes: 7454f2c42c ("thermal: core: Sort trip point crossing notifications by temperature")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/12481676.O9o76ZdvQC@rjwysocki.net
2024-07-08 17:24:22 +02:00
Rafael J. Wysocki
a8a2617744 thermal: core: Call monitor_thermal_zone() if zone temperature is invalid
Commit 202aa0d4bb ("thermal: core: Do not call handle_thermal_trip()
if zone temperature is invalid") caused __thermal_zone_device_update()
to return early if the current thermal zone temperature was invalid.

This was done to avoid running handle_thermal_trip() and governor
callbacks in that case which led to confusion.  However, it went too
far because monitor_thermal_zone() still needs to be called even when
the zone temperature is invalid to ensure that it will be updated
eventually in case thermal polling is enabled and the driver has no
other means to notify the core of zone temperature changes (for example,
it does not register an interrupt handler or ACPI notifier).

Also if the .set_trips() zone callback is expected to set up monitoring
interrupts for a thermal zone, it has to be provided with valid
boundaries and that can only happen if the zone temperature is known.

Accordingly, to ensure that __thermal_zone_device_update() will
run again after a failing zone temperature check, make it call
monitor_thermal_zone() regardless of whether or not the zone
temperature is valid and make the latter schedule a thermal zone
temperature update if the zone temperature is invalid even if
polling is not enabled for the thermal zone.

Fixes: 202aa0d4bb ("thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid")
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2764814.mvXUDI8C0e@rjwysocki.net
[ rjw: Changed THERMAL_RECHECK_DELAY_MS to 250 ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-04 19:01:59 +02:00
Krzysztof Kozlowski
4acab508eb thermal: core: constify 'type' in devm_thermal_of_cooling_device_register()
The 'type' string passed to thermal_of_cooling_device_register() is a
'const char *', so do the same in the devm interface.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240703083141.96013-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-04 14:25:20 +02:00
Nícolas F. R. A. Prado
aaa18ff54b thermal: gov_power_allocator: Return early in manage if trip_max is NULL
Commit da781936e7 ("thermal: gov_power_allocator: Allow binding
without trip points") allowed the governor to bind even when trip_max
is NULL. This allows a NULL pointer dereference to happen in the manage
callback.

Add an early return to prevent it, since the governor is expected to not do
anything in this case.

Fixes: da781936e7 ("thermal: gov_power_allocator: Allow binding without trip points")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://patch.msgid.link/20240702-power-allocator-null-trip-max-v1-1-47a60dc55414@collabora.com
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-04 13:35:50 +02:00
Rafael J. Wysocki
6fb75c9671 thermal: uniphier: Use thermal_zone_for_each_trip() for walking trip points
It is generally inefficient to iterate over trip indices and call
thermal_zone_get_trip() every time to get the struct thermal_trip
corresponding to the given trip index, so modify the uniphier thermal
driver to use thermal_zone_for_each_trip() for walking trips.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Link: https://patch.msgid.link/2148114.bB369e8A3T@kreacher
[ rjw: Add missing return statement, remove unused local variable ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-07-04 13:25:31 +02:00
Rafael J. Wysocki
efde8bfdc1 Merge back thermal control material for v6.11. 2024-06-27 21:22:15 +02:00
Rafael J. Wysocki
06d55c4278 Merge branch 'thermal-core'
Merge thermal core changes for v6.11:

 - Fix and clean up several minor shortcomings in thermal debug (Rafael
   Wysocki).

 - Rename __thermal_zone_set_trips() to thermal_zone_set_trips() and
   make it use trip thresholds (Rafael Wysocki).

 - Use READ_ONCE() for lockless access to trip temperature and
   hysteresis (Rafael Wysocki).

 - Drop unnecessary cooling device target state checks from the
   Bang-Bang thermal governor (Rafael Wysocki).

 - Avoid invoking thermal governor .trip_crossed() callback for critical
   and hot trip points (Rafael Wysocki).

* thermal-core:
  thermal: core: Avoid calling .trip_crossed() for critical and hot trips
  thermal: gov_bang_bang: Drop unnecessary cooling device target state checks
  thermal: trip: Use READ_ONCE() for lockless access to trip properties
  thermal: trip: Make thermal_zone_set_trips() use trip thresholds
  thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips()
  thermal: trip: Use common set of trip type names
  thermal/debugfs: Move some statements from under thermal_dbg->lock
  thermal/debugfs: Compute maximum temperature for mitigation episode as a whole
  thermal/debugfs: Adjust check for trips without statistics in tze_seq_show()
  thermal/debugfs: Fix up units in "mitigations" files
  thermal/debugfs: Print mitigation timestamp value in milliseconds
  thermal/debugfs: Do not extend mitigation episodes beyond system resume
  thermal/debugfs: Use helper to update trip point overstepping duration
2024-06-25 16:27:38 +02:00
Rafael J. Wysocki
529038146b thermal: gov_step_wise: Go straight to instance->lower when mitigation is over
Commit b684682698 ("thermal: gov_step_wise: Restore passive polling
management") attempted to fix a Step-Wise thermal governor issue
introduced by commit 042a3d80f1 ("thermal: core: Move passive polling
management to the core"), which caused the governor to leave cooling
devices in high states, by partially reverting that commit.

However, this turns out to be insufficient on some systems due to
interactions between the governor code restored by commit b684682698
and the passive polling management in the thermal core.

For this reason, revert commit b684682698 and make the governor set
the target cooling device state to the "lower" one as soon as the zone
temperature falls below the threshold of the trip point corresponding
to the given thermal instance, which means that thermal mitigation is
not necessary any more.

Before this change the "lower" cooling device state would be reached in
steps through the passive polling mechanism which was questionable for
three reasons: (1) cooling device were kept in high states when that was
not necessary (and it could adversely impact performance), (2) it only
worked for thermal zones with nonzero passive_delay_jiffies value, and
(3) passive polling belongs to the core and should not be hijacked by
governors for their internal purposes.

Fixes: b684682698 ("thermal: gov_step_wise: Restore passive polling management")
Closes: https://lore.kernel.org/linux-pm/6759ce9f-281d-4fcd-bb4c-b784a1cc5f6e@oldschoolsolutions.biz
Reported-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Tested-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz>
Link: https://patch.msgid.link/12464461.O9o76ZdvQC@rjwysocki.net
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Steev Klimaszewski <steev@kali.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
2024-06-25 14:37:05 +02:00
Linus Torvalds
fbe7ef3f2f Thermal control fixes for 6.10-rc5
- Remove the filtered mode for mt8188 from lvts_thermal as it is not
    supported on this platform and fail the lvts_thermal initialization
    when the golden temperature is zero as that means the efuse data is
    not correctly set (Julien Panis).
 
  - Update the processor_thermal part of the Intel int340x driver to
    support shared interrupts as the processor thermal device interrupt
    may in fact be shared with PCI devices (Srinivas Pandruvada).
 
  - Synchronize the suspend-prepare and post-suspend actions of the
    thermal PM notifier to avoid a destructive race condition and
    change the priority of that notifier to the minimum to avoid
    interference between the work items spawned by it and the other
    PM notifiers during system resume (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmZ1clgSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxftEQAKE9MJHvo1zTgGq3jU198pYj6/oIf4F8
 GR8wTSIhmSO+YgUINbjcGaUoCbwROeaLzLDyAXsrG0vS1t1/c5TGGujvQnMCnrdz
 rSwGoXtJeRK2yYNUWz/Mt1e/ai04sn/i9/9gZJOagRQaas45SdUIxLqpQs17R5cP
 R/8tFAyjMVyQ1MvI4mP3zK1yvE4fBeC18ZGKXNM57tzGBr0dWcSLrVJH/vS1/3Zx
 947LCzngfw8rPx1v+1LSIaCja62StUHBfmkTHnXDaegCFaWCuyUGgHV7yV3RW5sc
 +jQfMo6WH+O10TWvP28tc9dwamB0kfGr7oZXDH0ucpTwg621tThQZL/urxcAX1BS
 i5HQlATfUD/qWq1c5KUEZMRVje/rU8+SnSbf/xZeQKF4albOoZUWAk6VWqyJiwrr
 VTnMsVVCEZ62Vawyk7tchdMu233GMoMndRPu/Xq+ourKf8cDzzXdzn678cofahpn
 5u+droa/n9O7gkku8Mz4zTmgVyXyJtLi1tMGHYT/M7XM/OrKlwvqmpAnUSuAfaQ8
 C8ywxyrk1GJQoy8LLStaSvUFcAtrdmBpv5hSFPDXkbmy/xHaKw7+vccjL9CybmXo
 P4u2psBQ4opeg0x5I+dS5Tp0WcFXcdo5UGSaw2dRT+7WDxA9ALk3hs/s+vHzv1zO
 Fe7iya1SwbZB
 =qXau
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These fix the Mediatek lvts_thermal driver, the Intel int340x driver,
  and the thermal core (two issues related to system suspend).

  Specifics:

   - Remove the filtered mode for mt8188 from lvts_thermal as it is not
     supported on this platform and fail the lvts_thermal initialization
     when the golden temperature is zero as that means the efuse data is
     not correctly set (Julien Panis)

   - Update the processor_thermal part of the Intel int340x driver to
     support shared interrupts as the processor thermal device interrupt
     may in fact be shared with PCI devices (Srinivas Pandruvada)

   - Synchronize the suspend-prepare and post-suspend actions of the
     thermal PM notifier to avoid a destructive race condition and
     change the priority of that notifier to the minimum to avoid
     interference between the work items spawned by it and the other
     PM notifiers during system resume (Rafael Wysocki)"

* tag 'thermal-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: int340x: processor_thermal: Support shared interrupts
  thermal: core: Change PM notifier priority to the minimum
  thermal: core: Synchronize suspend-prepare and post-suspend actions
  thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data
  thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
2024-06-21 11:16:56 -07:00
Srinivas Pandruvada
55397323f3 thermal: intel: int340x: Enable WLT and power floor support for Lunar Lake
Add feature flgas for WLT and power floor for Lunar Lake.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619172109.497639-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:32 +02:00
Srinivas Pandruvada
7a9a8c5faf thermal: intel: int340x: Support MSI interrupt for Lunar Lake
The legacy PCI interrupt is no longer supported for processor thermal
device on Lunar Lake. The support is via MSI.

Add feature PROC_THERMAL_FEATURE_MSI_SUPPORT to support MSI feature per
generation. Define this feature for Lunar Lake processors.

There are 4 MSI sources:
 0 - Package thermal
 1 - DDR Thermal
 2 - Power floor interrupt
 3 - Workload type hint

On interrupt, check the source and call the corresponding handler.

Here don't need to call proc_thermal_check_wt_intr() and
proc_thermal_check_power_floor_intr() to check if the interrupt is for
those sources as there is a dedicated MSI interrupt.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619172109.497639-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:32 +02:00
Srinivas Pandruvada
a264cee31f thermal: intel: int340x: Remove unnecessary calls to free irq
Remove calls to devm_free_irq() and pci_free_irq_vectors().
They will be called on driver release anyway.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619172109.497639-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:32 +02:00
Srinivas Pandruvada
5ba206213a thermal: intel: int340x: Add DLVR support for Lunar Lake
Add support for DLVR (Digital Linear Voltage Regulator) for Lunar Lake.
There are no new sysfs attributes or difference in operation compared
to prior generations.

MMIO offset and bit positions are changed compared to Meteor Lake
processors. Also for two attributes dlvr_frequency_mhz and
dlvr_frequency_select, the value presented or accepted by the firmware
is not raw frequency value but an index.
For example:
RFI_FREQ_SELECT and RFI_FREQ
	: 0 DLVR freq point 2227.2 MHz
	: 1 DLVR freq point 2140 MHz

Hence create a mapping table for Lunar Lake to map user space values
to the firmware accepted values.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619124600.491168-4-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:28 +02:00
Srinivas Pandruvada
332ed4e5c4 thermal: intel: int340x: Capability to map user space to firmware values
To ensure compatibility between user inputs and firmware requirements,
a conversion mechanism is necessary for certain attributes. For instance,
on some platforms, the DLVR frequency must be translated into a predefined
index before being communicated to the firmware. On Lunar Lake platform:
RFI_FREQ_SELECT and RFI_FREQ:
Index 0 corresponds to a DLVR frequency of 2227.2 MHz
Index 1 corresponds to a DLVR frequency of 2140 MHz

Introduce a feature that enables the conversion of values between user
space inputs and firmware-accepted formats. This feature would also
facilitate the reverse process, converting firmware values back into user
friendly display values.

To support this functionality, a model-specific mapping table will be
utilized. When available, this table will provide the necessary
translations between user space values and firmware values, ensuring
seamless communication and accurate settings.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619124600.491168-3-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:28 +02:00
Srinivas Pandruvada
822b2a7c95 thermal: intel: int340x: Cleanup of DLVR sysfs on driver remove
When only DLVR enabled without DVFS, during driver remove,
proc_thermal_rfim_remove() is not called. Hence the DLVR sysfs is not
deleted.

On Lunar Lake DLVR is enabled without DVFS, hence this issue can be
reproduced.

Check also PROC_THERMAL_FEATURE_DLVR to call proc_thermal_rfim_remove().

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Link: https://patch.msgid.link/20240619124600.491168-2-srinivas.pandruvada@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 15:20:28 +02:00
Ricardo Neri
be6bfb29c5 thermal: intel: intel_tcc_cooling: Use a model-specific bitmask for TCC offset
The TCC offset field in the register MSR_TEMPERATURE_TARGET is not
architectural. The TCC library provides a model-specific bitmask. Use it to
determine the maximum TCC offset.

Suggested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://patch.msgid.link/20240614211606.5896-3-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 14:52:12 +02:00
Ricardo Neri
6ae0092ca7 thermal: intel: intel_tcc: Add model checks for temperature registers
The register MSR_TEMPERATURE_TARGET is not architectural. Its fields may be
defined differently for each processor model. TCC_OFFSET is an example of
such case.

Despite being specified as architectural, the registers IA32_[PACKAGE]_
THERM_STATUS have become model-specific: in recent processors, the
digital temperature readout uses bits [23:16] whereas the Intel Software
Developer's manual specifies bits [22:16].

Create an array of processor models and their bitmasks for TCC_OFFSET and
the digital temperature readout fields. Do not include recent processors.
Instead, use the bitmasks of these recent processors as default.

Use these model-specific bitmasks when reading TCC_OFFSET or the
temperature sensors.

Initialize a model-specific data structure during subsys_initcall() to
have it ready when thermal drivers are loaded.

Expose the new interface intel_tcc_get_offset_mask(). The
intel_tcc_cooling driver will use it.

Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://patch.msgid.link/20240614211606.5896-2-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-21 14:52:12 +02:00
Rafael J. Wysocki
73abe7002e Merge back thermal control changes related to Intel platforms for v6.11 2024-06-21 14:47:17 +02:00
Rafael J. Wysocki
3995904d54 Merge back thermal control material for v6.11. 2024-06-21 14:42:30 +02:00
Srinivas Pandruvada
096597cfe4 thermal: int340x: processor_thermal: Support shared interrupts
On some systems the processor thermal device interrupt is shared with
other PCI devices. In this case return IRQ_NONE from the interrupt
handler when the interrupt is not for the processor thermal device.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Fixes: f0658708e8 ("thermal: int340x: processor_thermal: Use non MSI interrupts by default")
Cc: 6.7+ <stable@vger.kernel.org> # 6.7+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-19 16:52:15 +02:00
Rafael J. Wysocki
d284d6cdaa - Remove the filtered mode for mt8188 as it is not supported on this
platform (Julien Panis)
 
 - Fail in case the golden temperature is zero as that means the efuse
   data is not correctly set (Julien Panis)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmZvS8YACgkQqDIjiipP
 6E+lHQgAsdwIcybdNpwqoD+D6heY4TsL+++h0RS4id6U5e1jvIemfUw1KaSbdaXH
 w6lOwSjycl1z8+pywRZ7P1psZGIe9SGXkqoV7WsHr22JfqWToV5IX3d9HJxnvOlq
 pvXPH9qJgwb4m2mRtcQq4E+WMAMoXDhbtcFNKtrRqZJOeJQlBm42pdCyTyFLU0Gg
 CkQ4d1pWlVZ6hixzsBF/69kDC8aeGVQnYN32X0CkERhqg/RO0DTneGS2hwT5F6oK
 icgV4aBh5oSn/gymbdqybZWWmm1dvEmgR24OtTrgNYCgPhb55/QsfHO+gU8r4HIu
 q6/D5V7vDr5jZ9FTofm08I2SY2kPvQ==
 =O7Iv
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.10-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Merge thermal driver fixes for 6.10-rc5 from Daniel Lezcano:

"- Remove the filtered mode for mt8188 as it is not supported on this
   platform (Julien Panis)

 - Fail in case the golden temperature is zero as that means the efuse
   data is not correctly set (Julien Panis)"

* tag 'thermal-v6.10-rc4' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data
  thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
2024-06-17 13:58:39 +02:00
Rafael J. Wysocki
494c7d0550 thermal: core: Change PM notifier priority to the minimum
It is reported that commit 5a5efdaffd ("thermal: core: Resume thermal
zones asynchronously") causes battery data in sysfs on Thinkpad P1 Gen2
to become invalid after a resume from S3 (and it is necessary to reboot
the machine to restore correct battery data).  Some investigation into
the problem indicated that it happened because, after the commit in
question, the ACPI battery PM notifier ran in parallel with
thermal_zone_device_resume() for one of the thermal zones which
apparently confused the platform firmware on the affected system.

While the exact reason for the firmware confusion remains unclear, it
is arguably not particularly relevant, and the expected behavior of the
affected system can be restored by making the thermal PM notifier run
at the lowest priority which avoids interference between work items
spawned by it and the other PM notifiers (that will run before those
work items now).

Fixes: 5a5efdaffd ("thermal: core: Resume thermal zones asynchronously")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218881
Reported-by: fhortner@yahoo.de
Tested-by: fhortner@yahoo.de
Cc: 6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-14 17:36:59 +02:00
Rafael J. Wysocki
d2278f3533 thermal: core: Synchronize suspend-prepare and post-suspend actions
After commit 5a5efdaffd ("thermal: core: Resume thermal zones
asynchronously") it is theoretically possible that, if a system suspend
starts immediately after a system resume, thermal_zone_device_resume()
spawned by the thermal PM notifier for one of the thermal zones at the
end of the system resume will run after the PM thermal notifier for the
suspend-prepare action.  If that happens, tz->suspended set by the latter
will be reset by the former which may lead to unexpected consequences.

To avoid that race, synchronize thermal_zone_device_resume() with the
suspend-prepare thermal PM notifier with the help of additional bool
field and completion in struct thermal_zone_device.

Note that this also ensures running __thermal_zone_device_update() at
least once for each thermal zone between system resume and the following
system suspend in case it is needed to start thermal mitigation.

Fixes: 5a5efdaffd ("thermal: core: Resume thermal zones asynchronously")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-14 17:36:59 +02:00
Julien Panis
72cacd06e4 thermal/drivers/mediatek/lvts_thermal: Return error in case of invalid efuse data
This patch prevents from registering thermal entries and letting the
driver misbehave if efuse data is invalid. A device is not properly
calibrated if the golden temperature is zero.

Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20240604-mtk-thermal-calib-check-v2-1-8f258254051d@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-12 19:07:34 +02:00
Rafael J. Wysocki
72196c20c3 thermal: core: Avoid calling .trip_crossed() for critical and hot trips
Invoking the governor .trip_crossed() callback for critical and hot
trips is pointless because they are handled directly by the core,
so make thermal_governor_trip_crossed() avoid doing that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11 21:06:44 +02:00
Rafael J. Wysocki
2c637af8a7 thermal: gov_bang_bang: Drop unnecessary cooling device target state checks
Some cooling device target state checks in bang_bang_control()
done before setting the new target state are not necessary after
recent changes, so drop them.

Also avoid updating the target state before checking it for
unexpected values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11 21:06:44 +02:00
Rafael J. Wysocki
a52641bc62 thermal: trip: Use READ_ONCE() for lockless access to trip properties
When accessing trip temperature and hysteresis without locking, it is
better to use READ_ONCE() to prevent compiler optimizations possibly
affecting the read from being applied.

Of course, for the READ_ONCE() to be effective, WRITE_ONCE() needs to
be used when updating their values.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11 21:06:44 +02:00
Rafael J. Wysocki
893bae9223 thermal: trip: Make thermal_zone_set_trips() use trip thresholds
Modify thermal_zone_set_trips() to use trip thresholds instead of
computing the low temperature for each trip to avoid deriving both
the high and low temperature levels from the same trip (which may
happen if the zone temperature falls into the hysteresis range of
one trip).

Accordingly, make __thermal_zone_device_update() call
thermal_zone_set_trips() later, when threshold values have been
updated for all trips.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:56 +02:00
Rafael J. Wysocki
28d5cc1267 thermal: trip: Rename __thermal_zone_set_trips() to thermal_zone_set_trips()
Drop the pointless double underline prefix from the function name as per
the subject.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:48 +02:00
Rafael J. Wysocki
f41f23b0ca thermal: trip: Use common set of trip type names
Use the same set of trip type names in sysfs and in the thermal debug
code output.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:40 +02:00
Rafael J. Wysocki
8f9025baf6 thermal/debugfs: Move some statements from under thermal_dbg->lock
The tz_dbg local variable assignments in thermal_debug_tz_trip_up(),
thermal_debug_tz_trip_down(), and thermal_debug_update_trip_stats()
need not be carried out under thermal_dbg->lock, so move them from
under that lock (to avoid possible future confusion).

While at it, reorder local variable definitions in
thermal_debug_tz_trip_up() for more clarity.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:34 +02:00
Rafael J. Wysocki
881c084fc9 thermal/debugfs: Compute maximum temperature for mitigation episode as a whole
Notice that the maximum temperature above the trip point must be the
same for all of the trip points involved in a given mitigation episode,
so it need not be computerd for each of them separately.

It is sufficient to compute the maximum temperature for the mitigation
episode as a whole and print it accordingly, so do that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:28 +02:00
Rafael J. Wysocki
993c87047d thermal/debugfs: Adjust check for trips without statistics in tze_seq_show()
Initialize the trip_temp field in struct trip_stats to
THERMAL_TEMP_INVALID and adjust the check for trips without
statistics in tze_seq_show() to look at that field instead of
comparing min and max.

This will mostly be useful to simplify subsequent changes.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:21 +02:00
Rafael J. Wysocki
ea6a3c5202 thermal/debugfs: Fix up units in "mitigations" files
Print temperature units as m°C rather than °mC (the meaning of which is
unclear) and add time unit to the duration column.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:13 +02:00
Rafael J. Wysocki
cc86c139ae thermal/debugfs: Print mitigation timestamp value in milliseconds
Because mitigation episode duration is printed in milliseconds, there
is no reason to print timestamp information for mitigation episodes in
smaller units which also makes it somewhat harder to interpret the
numbers.

Print it in milliseconds for consistency.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:06 +02:00
Rafael J. Wysocki
9b73b5052a thermal/debugfs: Do not extend mitigation episodes beyond system resume
Because thermal zone handling by the thermal core is started from
scratch during resume from system-wide suspend, prevent the debug
code from extending mitigation episodes beyond that point by ending
the mitigation episode currently in progress, if any, for each thermal
zone.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:04:00 +02:00
Rafael J. Wysocki
8b95bed0ce thermal/debugfs: Use helper to update trip point overstepping duration
Add a helper for updating trip point overstepping duration to be called
from thermal_debug_tz_trip_down().

Subsequently, it will also be used during resume from system-wide
suspend.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-11 21:01:21 +02:00
Rafael J. Wysocki
b684682698 thermal: gov_step_wise: Restore passive polling management
Consider a thermal zone with one passive trip point, a cooling device
with 3 states (0, 1, 2) bound to it, passive polling enabled (nonzero
passive_delay_jiffies) and no regular polling (polling_delay_jiffies
equal to 0) that is managed by the Step-Wise governor.  Suppose that
the initial state of the cooling device is 0 and the zone temperature
is below the trip point to start with.

When the trip point is crossed, tz->passive is incremented by the
thermal core and the governor's .manage() callback is invoked.  It
sets 'throttle' to 'true' for the trip in question and
get_target_state() returns 1 for the instance corresponding to the
cooling device (say that 'upper' and 'lower' are set to 2 and 0 for
it, respectively), so its state changes to 1.

Passive polling is still active for the zone, so next time the
temperature is updated, the governor's .manage() callback will be
invoked again.  If the temperature is still rising, it will change
the state of the cooling device to 2.

Now suppose that next time the zone temperature is updated, it falls
below the trip point, so tz->passive is decremented for the zone (say
it becomes 0 then) and the governor's .manage() callbacks runs.

It finds that the temperature trend for the zone is 'falling' and
'throttle' will be set to 'false' for the trip in question, so the
cooling device's state will be changed to 1.  However, because
tz->polling is 0 for the zone, the governor's .manage() callback
may not be invoked again for a long time and the cooling device's
state will not be reset back to 0.

This can happen because commit 042a3d80f1 ("thermal: core: Move
passive polling management to the core") removed passive polling
management from the Step-Wise governor.

Before that change, thermal_zone_trip_update() would bump up
tz->passive when changing the target state for a thermal instance
from "no target" to a specific value and it would drop tz->passive
when changing it back to "no target" which would cause passive
polling to be active for the zone until the governor has reset the
states of all cooling devices.  In particular, in the example above
tz->passive would be incremented when changing the state of the
cooling device from 0 to 1 and then it would be still nonzero when
the state of the cooling device was changed from 2 to 1.

To prevent this problem from occurring, restore the passive polling
management in the Step-Wise governor by partially reverting the
commit in question and update the comment in the restored code
to explain its role more clearly.

Fixes: 042a3d80f1 ("thermal: core: Move passive polling management to the core")
Closes: https://lore.kernel.org/linux-pm/ZmVfcEOxmjUHZTSX@hovoldconsulting.com
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11 21:00:44 +02:00
Zhang Rui
68de0ae4d6 thermal: intel: intel_pch: Improve cooling log
The intel_pch_thermal cooling mechanism currently only provides one of
the following final conclusions:
1. intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [48C]
2. intel_pch_thermal 0000:00:12.0: CPU-PCH is cool [49C] after 30700 ms delay
3. intel_pch_thermal 0000:00:12.0: CPU-PCH is hot [60C] after 60000 ms delay. S0ix might fail
4. intel_pch_thermal 0000:00:12.0: Wakeup event detected, abort cooling

This does not provide sufficient context about what is happening,
especially for case 4.

Add one line log to indicate when PCH overheats and the cooling delay
has started.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 21:24:37 +02:00
Dr. David Alan Gilbert
68f2d1dc48 thermal: int3403: remove unused struct 'int3403_performance_state'
'int3403_performance_state' has never been used since the original
commit 4384b8fe16 ("Thermal: introduce int3403 thermal driver").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 21:22:02 +02:00
Erick Archer
1007d2c5d7 thermal: int3400: Use sizeof(*pointer) instead of sizeof(type)
It is preferred to use sizeof(*pointer) instead of sizeof(type)
due to the type of the variable can change and one needs not
change the former (unlike the latter). This patch has no effect
on runtime behavior.

Signed-off-by: Erick Archer <erick.archer@outlook.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 20:59:33 +02:00
Tony Luck
a31a0a3e90 thermal: intel: intel_soc_dts_thermal: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 20:47:05 +02:00
Tony Luck
0f46ecc424 thermal: intel: intel_tcc_cooling: Switch to new Intel CPU model defines
New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-07 20:37:32 +02:00
Rafael J. Wysocki
1af89dedc8 thermal: core: Do not fail cdev registration because of invalid initial state
It is reported that commit 31a0fa0019 ("thermal/debugfs: Pass cooling
device state to thermal_debug_cdev_add()") causes the ACPI fan driver
to fail probing on some systems which turns out to be due to the _FST
control method returning an invalid value until _FSL is first evaluated
for the given fan.  If this happens, the .get_cur_state() cooling device
callback returns an error and __thermal_cooling_device_register() fails
as uses that callback after commit 31a0fa0019.

Arguably, _FST should not return an invalid value even if it is
evaluated before _FSL, so this may be regarded as a platform firmware
issue, but at the same time it is not a good enough reason for failing
the cooling device registration where the initial cooling device state
is only needed to initialize a thermal debug facility.

Accordingly, modify __thermal_cooling_device_register() to avoid
calling thermal_debug_cdev_add() instead of returning an error if the
initial .get_cur_state() callback invocation fails.

Fixes: 31a0fa0019 ("thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()")
Closes: https://lore.kernel.org/linux-acpi/20240530153727.843378-1-laura.nao@collabora.com
Reported-by: Laura Nao <laura.nao@collabora.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Laura Nao <laura.nao@collabora.com>
2024-06-07 13:51:51 +02:00
Julien Panis
d9fef76e89 thermal/drivers/mediatek/lvts_thermal: Remove filtered mode for mt8188
Filtered mode is not supported on mt8188 SoC and is the source of bad
results. Move to immediate mode which provides good temperatures.

Fixes: f4745f546e ("thermal/drivers/mediatek/lvts_thermal: Add MT8188 support")
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Julien Panis <jpanis@baylibre.com>
Link: https://lore.kernel.org/r/20240516-mtk-thermal-mt8188-mode-fix-v2-1-40a317442c62@baylibre.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-06-04 19:40:51 +02:00
Rafael J. Wysocki
ae2170d6ea thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid
When a trip point becomes invalid after being crossed on the way up,
it is involved in a mitigation episode that needs to be adjusted to
compensate for the trip going away.

For this reason, introduce thermal_zone_trip_down() as a wrapper
around thermal_trip_crossed() and make thermal_zone_set_trip_temp()
call it if the new temperature of the trip at hand is equal to
THERMAL_TEMP_INVALID and it has been crossed on the way up to trigger
all of the necessary adjustments in user space, the thermal debug
code and the zone governor.

Fixes: 8c69a777e4 ("thermal: core: Fix the handling of invalid trip points")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27 13:00:00 +02:00
Rafael J. Wysocki
cb573eec60 thermal: core: Introduce thermal_trip_crossed()
Add a helper function called thermal_trip_crossed() to be invoked by
__thermal_zone_device_update() in order to notify user space, the
thermal debug code and the zone governor about trip crossing.

Subsequently, this will also be used in the case when a trip point
becomes invalid after being crossed on the way up.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27 13:00:00 +02:00
Rafael J. Wysocki
5a599e10e5 thermal/debugfs: Allow tze_seq_show() to print statistics for invalid trips
Commit a6258fde8d ("thermal/debugfs: Make tze_seq_show() skip invalid
trips and trips with no stats") modified tze_seq_show() to skip invalid
trips, but it overlooked the fact that a trip may become invalid during
a mitigation eposide involving it, in which case its statistics should
still be reported.

For this reason, remove the invalid trip temperature check from the
main loop in tze_seq_show().

The trips that have never been valid will still be skipped after this
change because there are no statistics to report for them.

Fixes: a6258fde8d ("thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27 13:00:00 +02:00
Rafael J. Wysocki
9e69acc1de thermal/debugfs: Print initial trip temperature and hysteresis in tze_seq_show()
The temperature and hysteresis of a trip point may change during a
mitigation episode it is involved in (it may even become invalid
altogether), so in order to avoid possible confusion related to that,
store the temperature and hysteresis of trip points at the time they
are crossed on the way up and print those values instead of their
current temperature and hysteresis.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-27 13:00:00 +02:00
Steven Rostedt (Google)
2c92ca849f tracing/treewide: Remove second parameter of __assign_str()
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.

This means that with:

  __string(field, mystring)

Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.

There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:

  git grep -l __assign_str | while read a ; do
      sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
      mv /tmp/test-file $a;
  done

I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.

Note, the same updates will need to be done for:

  __assign_str_len()
  __assign_rel_str()
  __assign_rel_str_len()

I tested this with both an allmodconfig and an allyesconfig (build only for both).

[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/

Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>	# xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22 20:14:47 -04:00
Linus Torvalds
d90be6e4aa Driver core changes for 6.10-rc1
Here is the small set of driver core and kernfs changes for 6.10-rc1.
 
 Nothing major here at all, just a small set of changes for some driver
 core apis, and minor fixups.  Included in here are:
   - sysfs_bin_attr_simple_read() helper added and used
   - device_show_string() helper added and used
 All usages of these were acked by the various maintainers.  Also in here
 are:
   - kernfs minor cleanup
   - removed unused functions
   - typo fix in documentation
   - pay attention to sysfs_create_link() failures in module.c finally.
 
 All of these have been in linux-next for a very long time with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZk3+hQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylfTwCfUyHWkDZuZ7ehdtjzfmcd4EKZBK8An3AAV99G
 ox8PXMxuFTaUEdT/69FQ
 =2sEo
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the small set of driver core and kernfs changes for 6.10-rc1.

  Nothing major here at all, just a small set of changes for some driver
  core apis, and minor fixups. Included in here are:

   - sysfs_bin_attr_simple_read() helper added and used

   - device_show_string() helper added and used

  All usages of these were acked by the various maintainers. Also in
  here are:

   - kernfs minor cleanup

   - removed unused functions

   - typo fix in documentation

   - pay attention to sysfs_create_link() failures in module.c finally

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  device property: Fix a typo in the description of device_get_child_node_count()
  kernfs: mount: Remove unnecessary ‘NULL’ values from knparent
  scsi: Use device_show_string() helper for sysfs attributes
  platform/x86: Use device_show_string() helper for sysfs attributes
  perf: Use device_show_string() helper for sysfs attributes
  IB/qib: Use device_show_string() helper for sysfs attributes
  hwmon: Use device_show_string() helper for sysfs attributes
  driver core: Add device_show_string() helper for sysfs attributes
  treewide: Use sysfs_bin_attr_simple_read() helper
  sysfs: Add sysfs_bin_attr_simple_read() helper
  module: don't ignore sysfs_create_link() failures
  driver core: Remove unused platform_notify, platform_notify_remove
2024-05-22 12:13:40 -07:00
Linus Torvalds
5b5a5ad5a5 Thermal control fixes for 6.10-rc1
- Fix and clean up the MediaTek lvts_thermal driver (Julien Panis).
 
  - Prevent invalid trip point handling from triggering spurious trip
    point crossing events and allow passive polling to stop when a
    passive trip point involved in it becomes invalid (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmZMYXoSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxWKsP/R4PjdTgXYKg1NOGjsBJB5kYUVJZZkgZ
 Kr123fp2pmJZ7MI7AsoD3I6d0Z9Kfj2MEHgUYyU7anN+Ub+2JpQufK48CV/KoQSW
 LSC2hJsgHqRaOmP9i+tNGvKY3lnRTCKlLGhK+uSNC5ghHaFJuF/3ihEXHm0RWm4N
 w4VBN3y721wcHGNXCOeQRxM4dq6Z2Z2L5qWsgvEszPz9PwxflJ5ek2KfCyZDjtWu
 zh7xhk6WQ8kHkft2XhrCD9/AVIESeBLO9arvHh3TXUiJHHijyY/oTOHFP6VZ2n/A
 HZiJeRGKMNaBktkkuhCzQQbu0PpZ97GqCyrFc18wDFzdeiFSwd4vNYeeOBK+RyNo
 tUrlfm9X1O5e4Rary9mQ7DW9LJJ6DJSao4FFH0+wh2cSfJtl2/03lg6/obmb8hXh
 uuvKCLJUqoE6IF5bXL1943QWENGyCEHjvJWS46NuGSy+Ly8+khS49rPEfJg1Qj7g
 F12+1GL0uryTDLbL5V+IOtjFtUE786j5bHrxHTsasIvzbNvbm/fHbP/7JJW3l/Wf
 MDwMx7MaijNeHeO4cLGWnaWA4U9kaxsphYii7icuUjw55khqBbt+EmE3Npc/fVlM
 mWVkEZBGWKKpZb2zww7WX6FgBSZb/SZjMoxTUu/2YQgrwwiMe1LIHUFC3U5OATVG
 vKXn8ALYHSly
 =EjP7
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These fix the MediaTek lvts_thermal driver and the handling of trip
  points that start as invalid and are adjusted later by user space via
  sysfs.

  Specifics:

   - Fix and clean up the MediaTek lvts_thermal driver (Julien Panis)

   - Prevent invalid trip point handling from triggering spurious trip
     point crossing events and allow passive polling to stop when a
     passive trip point involved in it becomes invalid (Rafael Wysocki)"

* tag 'thermal-6.10-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Fix the handling of invalid trip points
  thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index
  thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct lvts_ctrl_data
  thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
2024-05-21 11:35:45 -07:00
Rafael J. Wysocki
8c69a777e4 thermal: core: Fix the handling of invalid trip points
Commit 9ad18043fb ("thermal: core: Send trip crossing notifications
at init time if needed") overlooked the case when a trip point that
has started as invalid is set to a valid temperature later.  Namely,
the initial threshold value for all trips is zero, so if a previously
invalid trip becomes valid and its (new) low temperature is above the
zone temperature, a spurious trip crossing notification will occur and
it may trigger the WARN_ON() in handle_thermal_trip().

To address this, set the initial threshold for all trips to INT_MAX.

There is also the case when a valid writable trip becomes invalid that
requires special handling.  First, in accordance with the change
mentioned above, the trip's threshold needs to be set to INT_MAX to
avoid the same issue.  Second, if the trip in question is passive and
it has been crossed by the thermal zone temperature on the way up, the
zone's passive count has been incremented and it is in the passive
polling mode, so its passive count needs to be adjusted to allow the
passive polling to be turned off eventually.

Fixes: 9ad18043fb ("thermal: core: Send trip crossing notifications at init time if needed")
Fixes: 042a3d80f1 ("thermal: core: Move passive polling management to the core")
Reported-by: Zhang Rui <zhang.rui@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
2024-05-17 12:37:33 +02:00
Rafael J. Wysocki
9dbbcd6c83 - Check for a NULL pointer before using it in the probe routine of the
Mediatek LVTS driver (Julien Panis)
 
 - Remove the num_lvts_sensor and cal_offset fields of the
   lvts_ctrl_data as they are not used. These are not functional fixes
   but slight memory usage fix of the Mediatek LVTS driver (Julien
   Panis)
 
 - Fix wrong lvts_ctrl index leading to a NULL pointer dereference in
   the Mediatek LVTS driver (Julien Panis)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmZDL8IACgkQqDIjiipP
 6E9IGwgAlzaIbugHk6qgrkC1s6JhxdAay/sUAk4zJ7WYirtykdWS0/56hh2DL/Ub
 VttvTWgDf6W4k/+gaV/hVWjEq5og82n249DSFlJmcAAcvTQ0Stk2LG2TTqRBmfb4
 qPKqLoVWlMhDUlbhnNZoS04jApowRitgfOvtC1G0owqkmbzIOKryJJIbvK6VRWpm
 qPo7aR99DJW7Cm3lLujyIC41PGb4vKGS2jFFFrf/s69vj40IVHw2jAHOkSJDS+H1
 Jro7fTHtlf7eskg4abL9zUzDSJMJTNOTyyj8vJ0524V2I7SavZxbUBoWVfYc/OSf
 gWqCZFlY6sWX7k331hhxrpWjdALlyA==
 =e53Q
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.10-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Merge thermal driver fixes for 6.10-rc1 from Daniel Lezcano:

"- Check for a NULL pointer before using it in the probe routine of the
   Mediatek LVTS driver (Julien Panis)

 - Remove the num_lvts_sensor and cal_offset fields of the
   lvts_ctrl_data as they are not used. These are not functional fixes
   but slight memory usage fix of the Mediatek LVTS driver (Julien
   Panis)

 - Fix wrong lvts_ctrl index leading to a NULL pointer dereference in
   the Mediatek LVTS driver (Julien Panis)"

* tag 'thermal-v6.10-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index
  thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct lvts_ctrl_data
  thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
2024-05-15 12:00:38 +02:00
Linus Torvalds
101b7a9714 ACPI updates for 6.10-rc1
- Add EINJ CXL error types to actbl1.h (Ben Cheatham).
 
  - Add support for RAS2 table to ACPICA (Shiju Jose).
 
  - Fix various spelling mistakes in text files and code comments in
    ACPICA (Colin Ian King).
 
  - Fix spelling and typos in ACPICA (Saket Dumbre).
 
  - Modify ACPI_OBJECT_COMMON_HEADER (lijun).
 
  - Add RISC-V RINTC affinity structure support to ACPICA (Haibo Xu).
 
  - Fix CXL 3.0 structure (RDPAS) in the CEDT table (Hojin Nam).
 
  - Add missin increment of registered GPE count to ACPICA (Daniil
    Tatianin).
 
  - Mark new ACPICA release 20240322 (Saket Dumbre).
 
  - Add support for the AEST V2 table to ACPICA (Ruidong Tian).
 
  - Disable -Wstringop-truncation for some ACPICA code in the kernel to
    avoid a compiler warning  that is not very useful (Arnd Bergmann).
 
  - Make the kernel indicate support for several ACPI features that are
    in fact supported to the platform firmware through _OSC and fix the
    Generic Initiator Affinity _OSC bit (Armin Wolf).
 
  - Make the ACPI core set the owner value for ACPI drivers, drop the
    owner setting from a number of drivers and eliminate the owner
    field from struct acpi_driver (Krzysztof Kozlowski).
 
  - Rearrange fields in several structures to effectively eliminate
    computations from container_of() in some cases (Andy Shevchenko).
 
  - Do some assorted cleanups of the ACPI device enumeration code (Andy
    Shevchenko).
 
  - Make the ACPI device enumeration code skip devices with _STA values
    clearly identified by the specification as invalid (Rafael Wysocki).
 
  - Rework the handling of the NHLT table to simplify and clarify it and
    drop some obsolete pieces (Cezary Rojewski).
 
  - Add ACPI IRQ override quirks for Asus Vivobook Pro N6506MV, TongFang
    GXxHRXx and GMxHGxx, and XMG APEX 17 M23 (Guenter Schafranek, Tamim
    Khan, Christoffer Sandberg).
 
  - Add reference to UEFI DSD Guide to the documentation related to the
    ACPI handling of device properties (Sakari Ailus).
 
  - Fix SRAT lookup of CFMWS ranges with numa_fill_memblks(), remove
    lefover architecture-dependent code from the ACPI NUMA handling code
    and simplify it on top of that (Robert Richter).
 
  - Add a num-cs device property to specify the number of chip selects
    for Intel Braswell to the ACPI LPSS (Intel SoC) driver and remove a
    nested CONFIG_PM #ifdef from it (Andy Shevchenko).
 
  - Move three x86-specific ACPI files to the x86 directory (Andy
    Shevchenko).
 
  - Mark SMO8810 accel on Dell XPS 15 9550 as always present and add a
    PNP_UART1_SKIP quirk for Lenovo Blade2 tablets (Hans de Goede).
 
  - Move acpi_blacklisted() declaration to asm/acpi.h (Kuppuswamy
    Sathyanarayanan).
 
  - Add Lunar Lake support to the ACPI DPTF driver (Sumeet Pawnikar).
 
  - Mark the einj_driver driver's remove callback as __exit because it
    cannot get unbound via sysfs (Uwe Kleine-König).
 
  - Fix a typo in the ACPI documentation regarding the layout of sysfs
    subdirectory representing the ACPI namespace (John Watts).
 
  - Make the ACPI pfrut utility print the update_cap field during
    capability query (Chen Yu).
 
  - Add HAS_IOPORT dependencies to PNP (Niklas Schnelle).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmZCZksSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxQHMQAIaab/0AY6MXznyZtf9bfVXzwa/HXWJV
 CWVhubByKHKSJjGcsMbxAXvRq5EDX91QiuaoExEzKY7G9q+jFl1rUMvGWVrySveS
 ae1+jMF9Kd3QXTXB6bCbre4/7bn/LZN0U5nvD5yFpuEEl8GqGafKiUVRVEIQmMuB
 1yi5HU2orKxLYU84mCGIiiWlTAi/6MlSeJ5NZwxPfj/+B24pPWTdu4AsD+b7gC2G
 g36KlAZpnGJqRZ1pg5KQDCV0eKGCJuh9X7ACM8t/XDg4S7FL75X2fWMRUh+4sYp9
 12DdxaEPYCNaVebm2Ae7vrLtcifHNsMoirs3Me0RUOmFqRPaH5YILw3Un2jabiBp
 OWgVuIAa1+M4pKoVVnimM1f9i8rUr8XEl/kheb0c0OinQlecYpPeza9T5U+mMOsy
 knOs5jTm5oIhHS4sQxPu4I1/HX4/5kczDuMs1XW+6ypuT28WCbTj+mrryAQOQHyD
 vFsDBdInU5Yiit4APuJjiTSeEtFkClNI54oKz29mesPRRdDMuYBubdEuXG4lbsF1
 n4SWds3ELFqWlWvyshCMQdyjTN7UUMc2Af+xY3AEIq7uvPTdTQWUmf5LQi2vfgSO
 y3vhCjjt0ONyTXqbkP7HYcOXx5DZsFb/1uXqAAo4/RlZ6LWx/K1+onttECsoQIRH
 rR3pwW76z0Vy
 =UiEQ
 -----END PGP SIGNATURE-----

Merge tag 'acpi-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI updates from Rafael Wysocki:
 "These are ACPICA updates coming from the 20240322 release upstream, an
  ACPI DPTF driver update adding new platform support for it, some new
  quirks and some assorted fixes and cleanups.

  Specifics:

   - Add EINJ CXL error types to actbl1.h (Ben Cheatham)

   - Add support for RAS2 table to ACPICA (Shiju Jose)

   - Fix various spelling mistakes in text files and code comments in
     ACPICA (Colin Ian King)

   - Fix spelling and typos in ACPICA (Saket Dumbre)

   - Modify ACPI_OBJECT_COMMON_HEADER (lijun)

   - Add RISC-V RINTC affinity structure support to ACPICA (Haibo Xu)

   - Fix CXL 3.0 structure (RDPAS) in the CEDT table (Hojin Nam)

   - Add missin increment of registered GPE count to ACPICA (Daniil
     Tatianin)

   - Mark new ACPICA release 20240322 (Saket Dumbre)

   - Add support for the AEST V2 table to ACPICA (Ruidong Tian)

   - Disable -Wstringop-truncation for some ACPICA code in the kernel to
     avoid a compiler warning that is not very useful (Arnd Bergmann)

   - Make the kernel indicate support for several ACPI features that are
     in fact supported to the platform firmware through _OSC and fix the
     Generic Initiator Affinity _OSC bit (Armin Wolf)

   - Make the ACPI core set the owner value for ACPI drivers, drop the
     owner setting from a number of drivers and eliminate the owner
     field from struct acpi_driver (Krzysztof Kozlowski)

   - Rearrange fields in several structures to effectively eliminate
     computations from container_of() in some cases (Andy Shevchenko)

   - Do some assorted cleanups of the ACPI device enumeration code (Andy
     Shevchenko)

   - Make the ACPI device enumeration code skip devices with _STA values
     clearly identified by the specification as invalid (Rafael Wysocki)

   - Rework the handling of the NHLT table to simplify and clarify it
     and drop some obsolete pieces (Cezary Rojewski)

   - Add ACPI IRQ override quirks for Asus Vivobook Pro N6506MV,
     TongFang GXxHRXx and GMxHGxx, and XMG APEX 17 M23 (Guenter
     Schafranek, Tamim Khan, Christoffer Sandberg)

   - Add reference to UEFI DSD Guide to the documentation related to the
     ACPI handling of device properties (Sakari Ailus)

   - Fix SRAT lookup of CFMWS ranges with numa_fill_memblks(), remove
     lefover architecture-dependent code from the ACPI NUMA handling
     code and simplify it on top of that (Robert Richter)

   - Add a num-cs device property to specify the number of chip selects
     for Intel Braswell to the ACPI LPSS (Intel SoC) driver and remove a
     nested CONFIG_PM #ifdef from it (Andy Shevchenko)

   - Move three x86-specific ACPI files to the x86 directory (Andy
     Shevchenko)

   - Mark SMO8810 accel on Dell XPS 15 9550 as always present and add a
     PNP_UART1_SKIP quirk for Lenovo Blade2 tablets (Hans de Goede)

   - Move acpi_blacklisted() declaration to asm/acpi.h (Kuppuswamy
     Sathyanarayanan)

   - Add Lunar Lake support to the ACPI DPTF driver (Sumeet Pawnikar)

   - Mark the einj_driver driver's remove callback as __exit because it
     cannot get unbound via sysfs (Uwe Kleine-König)

   - Fix a typo in the ACPI documentation regarding the layout of sysfs
     subdirectory representing the ACPI namespace (John Watts)

   - Make the ACPI pfrut utility print the update_cap field during
     capability query (Chen Yu)

   - Add HAS_IOPORT dependencies to PNP (Niklas Schnelle)"

* tag 'acpi-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
  ACPI/NUMA: Squash acpi_numa_memory_affinity_init() into acpi_parse_memory_affinity()
  ACPI/NUMA: Squash acpi_numa_slit_init() into acpi_parse_slit()
  ACPI/NUMA: Remove architecture dependent remainings
  x86/numa: Fix SRAT lookup of CFMWS ranges with numa_fill_memblks()
  ACPI: video: Add backlight=native quirk for Lenovo Slim 7 16ARH7
  ACPI: scan: Avoid enumerating devices with clearly invalid _STA values
  ACPI: Move acpi_blacklisted() declaration to asm/acpi.h
  ACPI: resource: Skip IRQ override on Asus Vivobook Pro N6506MV
  ACPICA: AEST: Add support for the AEST V2 table
  ACPI: tools: pfrut: Print the update_cap field during capability query
  ACPI: property: Add reference to UEFI DSD Guide
  Documentation: firmware-guide: ACPI: Fix namespace typo
  PNP: add HAS_IOPORT dependencies
  ACPI: resource: Do IRQ override on TongFang GXxHRXx and GMxHGxx
  ACPI: resource: Do IRQ override on GMxBGxx (XMG APEX 17 M23)
  ACPICA: Update acpixf.h for new ACPICA release 20240322
  ACPICA: events/evgpeinit: don't forget to increment registered GPE count
  ACPICA: Fix CXL 3.0 structure (RDPAS) in the CEDT table
  ACPICA: SRAT: Add dump and compiler support for RINTC affinity structure
  ACPICA: SRAT: Add RISC-V RINTC affinity structure
  ...
2024-05-14 13:31:24 -07:00
Linus Torvalds
f952b6c863 Thermal control updates for 6.10-rc1
- Redesign the thermal governor interface to allow the governors to
    work in a more straightforward way (Rafael Wysocki).
 
  - Make thermal governors take the current trip point thresholds into
    account in their computations which allows trip hysteresis to be
    observed more accurately (Rafael Wysocki).
 
  - Make the thermal core manage passive polling for thermal zones and
    remove passive polling management from thermal governors  (Rafael
    Wysocki).
 
  - Refactor trip point representation and move the definition of
    thermal governor and thermal zone device structures to the thermal
    core (Rafael Wysocki).
 
  - Sort trip point crossing notifications and debug recording of trip
    point crossing events by temperature (Rafael Wysocki).
 
  - Improve the handling of cooling device states and thermal mitigation
    episodes in progress in the thermal debug code (Rafael Wysocki).
 
  - Avoid excessive updates of trip point statistics and clean up the
    printing of thermal mitigation episode information (Rafael Wysocki).
 
  - Clean up thermal governors and thermal core (Rafael Wysocki).
 
  - Allow thermal drivers to register notifiers that will be invoked
    on netlink events like BIND and UNBIND, so that they can adjust
    their activity depending on whether or not there are any
    subscribers of netlink messages coming from them, and make
    the Intel HFI driver use this mechanism (Stanislaw Gruszka).
 
  - Adjust the update delay and capabilities-per-event values in the
    Intel HFI thermal driver to prevent it from missing events and allow
    it to process more data in one go (Ricardo Neri).
 
  - Add missing MODULE_DESCRIPTION() to multiple files in the
    int340x_thermal and intel_soc_dts_iosf drivers (Srinivas Pandruvada).
 
  - Replace deprecated strncpy() with strscpy() in the int340x_thermal
    driver (Justin Stitt).
 
  - Add QCM2290 compatible DT bindings for Lmh and fix a NULL pointer
    dereference in the lmh driver when the SCM is not present (Konrad
    Dybcio).
 
  - Use the strreplace() function instead of doing it manually in the
    Armada driver (Rasmus Villemoes).
 
  - Convert st,stih407-thermal to DT schema and fix up missing
    properties (Raphael Gallais-Pou).
 
  - Add suspend/resume by restoring the context of the tsens sensor
    (Priyansh Jain).
 
  - Support A1 SoC family Thermal Sensor controller and add the DT
    bindings (Dmitry Rokosov).
 
  - Improve the temperature approximation calculation and consolidate
    the Tj constant into a shared area of the structure instead of
    duplicating it on the Rcar Gen3 (Niklas Söderlund).
 
  - Fix the Mediatek LVTS sensor coefficient for the MT8192 in order to support
    it correctly (Hsin-Te Yuan).
 
  - Fix a NULL pointer dereference in the tsens driver when the function
    compute_intercept_slope() is called with a NULL parameter (Aleksandr
    Mishin).
 
  - Remove some unused fields in struct qpnp_tm_chip and k3_bandgap
    (Christophe Jaillet).
 
  - Fix up calibration efuse data decoding, consolidate the code by
    checking boundaries and refactor some part of the LVTS Mediatek
    driver. After setting the scene, add MT8186 and MT8188 along with
    the DT bindings (Nicolas Pitre).
 
  - Add Loongson-2K2000 support after some minor code adjustements and
    providing the DT bindings definition (Binbin Zhou).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmZCZd4SHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxg28P/03RJlnog56dF6Isv5m+wPf6S66aDwQ/
 Dv5tUTk7Fy9JU7ICny5TMfWCwAwaxJVmZ08s/BvowQsLFUnZvuKjxjX3ALTe68GU
 u93wX8xN/FGSTY/SnGxwhcS12TpZ33khlM2Ci16Nlfl2RFRkA7CJsjoEpUYMNU9h
 4WNS3gVRamKdrT+KHxZmJ4+ZwPxG3KOpSMgYtOW8Bg7uDTUgMzezL7au7z5YDlNd
 uxxWR2F/Ts0HceuYIXOIun5N+sqy9QEGHT9lJVaMYvQHzx+xUvz10nsUOpB56dZv
 m1CzP5IOy+JloldVEjt9BohzSlEfx4cvkqPePoToOVWlCCt++LUm2tJGEWvZU4XZ
 vo+9Ed3y/5Mp7ws5InSrM51PyC2l+P1Hdh6prgsoaq3XHn5b+DH0Nytly2fut9zF
 rKIN9xBO/UI8k7jYgB9Gk3WfekJWFu9QkA1+udzf8vmPYFyJOt1PdxBFXRy7DdpU
 vXF//E2SIZ428LVolHyQlUCw72ZiHuc4lV1UmqK/9s1vpfp4Ksn0ou3mZIwqfEio
 doI8A1GSym7a8YrnZnrHCyRNCyPj+Wmk42oIDQRGw3VEA9q4jLCSsmYwVptvQM4C
 tgAogbbRfbZ9hTWGhMyMM9f60oDdjbxBZ2uG607UXO/Hqz4P+C1P0I/J55Fp2OM2
 B8KoOSd82URw
 =CzKP
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "The most significant part of this is a rework of thermal governors,
  including a redesign of the thermal governor interface and changes to
  make some of them take trip point hysteresis into account properly, as
  well as some related cleanups of the thermal governors and thermal
  core.

  The above is based on preliminary changes refactoring thermal data
  structures and moving the definitions of some of them into the thermal
  core which also ensure that trip point crossing notifications will be
  sent to user space via netlink and recorded in the debug statistics in
  temperature order.

  In addition, netlink bind/unbind notifications are added to the
  thermal core and the Intel HFI driver is modified to use them to avoid
  sending netlink messages until there are subscribers.

  Apart from that, multiple thermal drivers are updated which includes
  new hardware support (MediaTek MT8188 and MT8186, Amlogic A1 thermal
  sensor, Loongson-2K2000, Lmh QCM2290), fixes, cleanups and
  documentation updates, and the recently added thermal debug code is
  fixed and cleaned up.

  Specifics:

   - Redesign the thermal governor interface to allow the governors to
     work in a more straightforward way (Rafael Wysocki)

   - Make thermal governors take the current trip point thresholds into
     account in their computations which allows trip hysteresis to be
     observed more accurately (Rafael Wysocki)

   - Make the thermal core manage passive polling for thermal zones and
     remove passive polling management from thermal governors (Rafael
     Wysocki)

   - Refactor trip point representation and move the definition of
     thermal governor and thermal zone device structures to the thermal
     core (Rafael Wysocki)

   - Sort trip point crossing notifications and debug recording of trip
     point crossing events by temperature (Rafael Wysocki)

   - Improve the handling of cooling device states and thermal
     mitigation episodes in progress in the thermal debug code (Rafael
     Wysocki)

   - Avoid excessive updates of trip point statistics and clean up the
     printing of thermal mitigation episode information (Rafael Wysocki)

   - Clean up thermal governors and thermal core (Rafael Wysocki)

   - Allow thermal drivers to register notifiers that will be invoked on
     netlink events like BIND and UNBIND, so that they can adjust their
     activity depending on whether or not there are any subscribers of
     netlink messages coming from them, and make the Intel HFI driver
     use this mechanism (Stanislaw Gruszka)

   - Adjust the update delay and capabilities-per-event values in the
     Intel HFI thermal driver to prevent it from missing events and
     allow it to process more data in one go (Ricardo Neri)

   - Add missing MODULE_DESCRIPTION() to multiple files in the
     int340x_thermal and intel_soc_dts_iosf drivers (Srinivas
     Pandruvada)

   - Replace deprecated strncpy() with strscpy() in the int340x_thermal
     driver (Justin Stitt)

   - Add QCM2290 compatible DT bindings for Lmh and fix a NULL pointer
     dereference in the lmh driver when the SCM is not present (Konrad
     Dybcio)

   - Use the strreplace() function instead of doing it manually in the
     Armada driver (Rasmus Villemoes)

   - Convert st,stih407-thermal to DT schema and fix up missing
     properties (Raphael Gallais-Pou)

   - Add suspend/resume by restoring the context of the tsens sensor
     (Priyansh Jain)

   - Support A1 SoC family Thermal Sensor controller and add the DT
     bindings (Dmitry Rokosov)

   - Improve the temperature approximation calculation and consolidate
     the Tj constant into a shared area of the structure instead of
     duplicating it on the Rcar Gen3 (Niklas Söderlund)

   - Fix the Mediatek LVTS sensor coefficient for the MT8192 in order to
     support it correctly (Hsin-Te Yuan)

   - Fix a NULL pointer dereference in the tsens driver when the
     function compute_intercept_slope() is called with a NULL parameter
     (Aleksandr Mishin)

   - Remove some unused fields in struct qpnp_tm_chip and k3_bandgap
     (Christophe Jaillet)

   - Fix up calibration efuse data decoding, consolidate the code by
     checking boundaries and refactor some part of the LVTS Mediatek
     driver. After setting the scene, add MT8186 and MT8188 along with
     the DT bindings (Nicolas Pitre)

   - Add Loongson-2K2000 support after some minor code adjustements and
     providing the DT bindings definition (Binbin Zhou)"

* tag 'thermal-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (72 commits)
  thermal: intel: hfi: Increase the number of CPU capabilities per netlink event
  thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT
  thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms
  thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL
  thermal: intel: Add missing module description
  thermal: core: Move passive polling management to the core
  thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid
  thermal: trip: Add missing empty code line
  thermal/debugfs: Avoid printing zero duration for mitigation events in progress
  thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()
  thermal/debugfs: Create records for cdev states as they get used
  thermal: core: Introduce thermal_governor_trip_crossed()
  thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats
  thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
  thermal/debugfs: Clean up thermal_debug_update_temp()
  thermal/debugfs: Avoid excessive updates of trip point statistics
  thermal: core: Relocate critical and hot trip handling
  thermal: core: Drop the .throttle() governor callback
  thermal: gov_user_space: Use .trip_crossed() instead of .throttle()
  thermal: gov_fair_share: Eliminate unnecessary integer divisions
  ...
2024-05-14 12:53:26 -07:00
Linus Torvalds
6e5a0c30b6 Scheduler changes for v6.10:
- Add cpufreq pressure feedback for the scheduler
 
  - Rework misfit load-balancing wrt. affinity restrictions
 
  - Clean up and simplify the code around ::overutilized and
    ::overload access.
 
  - Simplify sched_balance_newidle()
 
  - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
    handling that changed the output.
 
  - Rework & clean up <asm/vtime.h> interactions wrt. arch_vtime_task_switch()
 
  - Reorganize, clean up and unify most of the higher level
    scheduler balancing function names around the sched_balance_*()
    prefix.
 
  - Simplify the balancing flag code (sched_balance_running)
 
  - Miscellaneous cleanups & fixes
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmZBtA0RHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1gQEw//WiCiV7zTlWShSiG/g8GTfoAvl53QTWXF
 0jQ8TUcoIhxB5VeGgxVG1srYt8f505UXjH7L0MJLrbC3nOgRCg4NK57WiQEachKK
 HORIJHT0tMMsKIwX9D5Ovo4xYJn+j7mv7j/caB+hIlzZAbWk+zZPNWcS84p0ZS/4
 appY6RIcp7+cI7bisNMGUuNZS14+WMdWoX3TgoI6ekgDZ7Ky+kQvkwGEMBXsNElO
 qZOj6yS/QUE4Htwz0tVfd6h5svoPM/VJMIvl0yfddPGurfNw6jEh/fjcXnLdAzZ6
 9mgcosETncQbm0vfSac116lrrZIR9ygXW/yXP5S7I5dt+r+5pCrBZR2E5g7U4Ezp
 GjX1+6J9U6r6y12AMLRjadFOcDvxdwtszhZq4/wAcmS3B9dvupnH/w7zqY9ho3wr
 hTdtDHoAIzxJh7RNEHgeUC0/yQX3wJ9THzfYltDRIIjHTuvl4d5lHgsug+4Y9ClE
 pUIQm/XKouweQN9TZz2ULle4ZhRrR9sM9QfZYfirJ/RppmuKool4riWyQFQNHLCy
 mBRMjFFsTpFIOoZXU6pD4EabOpWdNrRRuND/0yg3WbDat2gBWq6jvSFv2UN1/v7i
 Un5jijTuN7t8yP5lY5Tyf47kQfLlA9bUx1v56KnF9mrpI87FyiDD3MiQVhDsvpGX
 rP96BIOrkSo=
 =obph
 -----END PGP SIGNATURE-----

Merge tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Add cpufreq pressure feedback for the scheduler

 - Rework misfit load-balancing wrt affinity restrictions

 - Clean up and simplify the code around ::overutilized and
   ::overload access.

 - Simplify sched_balance_newidle()

 - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES
   handling that changed the output.

 - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()

 - Reorganize, clean up and unify most of the higher level
   scheduler balancing function names around the sched_balance_*()
   prefix

 - Simplify the balancing flag code (sched_balance_running)

 - Miscellaneous cleanups & fixes

* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/pelt: Remove shift of thermal clock
  sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
  thermal/cpufreq: Remove arch_update_thermal_pressure()
  sched/cpufreq: Take cpufreq feedback into account
  cpufreq: Add a cpufreq pressure feedback for the scheduler
  sched/fair: Fix update of rd->sg_overutilized
  sched/vtime: Do not include <asm/vtime.h> header
  s390/irq,nmi: Include <asm/vtime.h> header directly
  s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
  sched/vtime: Get rid of generic vtime_task_switch() implementation
  sched/vtime: Remove confusing arch_vtime_task_switch() declaration
  sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
  sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
  sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
  sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
  sched/fair: Rename root_domain::overload to ::overloaded
  sched/fair: Use helper functions to access root_domain::overload
  sched/fair: Check root_domain::overload value before update
  sched/fair: Combine EAS check with root_domain::overutilized access
  sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
  ...
2024-05-13 17:18:51 -07:00
Rafael J. Wysocki
d9f87a7e9a Merge branches 'acpi-x86', 'acpi-dptf' and 'acpi-apei'
Merge x86-specific ACPI updates, an ACPI DPTF driver update adding new
platform support to it, and an ACPI APEI update:

 - Add a num-cs device property to specify the number of chip selects
   for Intel Braswell to the ACPI LPSS (Intel SoC) driver and remove a
   nested CONFIG_PM #ifdef from it (Andy Shevchenko).

 - Move three x86-specific ACPI files to the x86 directory (Andy
   Shevchenko).

 - Mark SMO8810 accel on Dell XPS 15 9550 as always present and add a
   PNP_UART1_SKIP quirk for Lenovo Blade2 tablets (Hans de Goede).

 - Move acpi_blacklisted() declaration to asm/acpi.h (Kuppuswamy
   Sathyanarayanan).

 - Add Lunar Lake support to the ACPI DPTF driver (Sumeet Pawnikar).

 - Mark the einj_driver driver's remove callback as __exit because it
   cannot get unbound via sysfs (Uwe Kleine-König).

* acpi-x86:
  ACPI: Move acpi_blacklisted() declaration to asm/acpi.h
  ACPI: x86: Add PNP_UART1_SKIP quirk for Lenovo Blade2 tablets
  ACPI: x86: utils: Mark SMO8810 accel on Dell XPS 15 9550 as always present
  ACPI: x86: Move LPSS to x86 folder
  ACPI: x86: Move blacklist to x86 folder
  ACPI: x86: Move acpi_cmos_rtc to x86 folder
  ACPI: x86: Introduce a Makefile
  ACPI: LPSS: Remove nested ifdeffery for CONFIG_PM
  ACPI: LPSS: Advertise number of chip selects via property

* acpi-dptf:
  ACPI: DPTF: Add Lunar Lake support

* acpi-apei:
  ACPI: APEI: EINJ: mark remove callback as __exit
2024-05-13 20:58:14 +02:00
Rafael J. Wysocki
3a47fbdd1a Merge branch 'thermal-intel'
Merge updates of Intel thermal drivers for v6.10:

 - Add missing MODULE_DESCRIPTION() to multiple files in the
   int340x_thermal and intel_soc_dts_iosf drivers (Srinivas Pandruvada).

 - Adjust the update delay and capabilities-per-event values in the
   Intel HFI thermal driver to prevent it from missing events and allow
   it to process more data in one go (Ricardo Neri).

* thermal-intel:
  thermal: intel: hfi: Increase the number of CPU capabilities per netlink event
  thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT
  thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms
  thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL
  thermal: intel: Add missing module description
2024-05-10 21:37:57 +02:00
Ricardo Neri
608fa85235 thermal: intel: hfi: Increase the number of CPU capabilities per netlink event
The number of updated CPU capabilities per netlink event is hard-coded to
16. On systems with more than 16 CPUs (a common case), it takes more than
one thermal netlink event to relay all the new capabilities after an HFI
interrupt. This adds unnecessary overhead to both the kernel and user space
entities.

Increase the number of CPU capabilities updated per event to 64. Any system
with 64 CPUs or less can now update all the capabilities in a single
thermal netlink event.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
07c6f3a7ff thermal: intel: hfi: Rename HFI_MAX_THERM_NOTIFY_COUNT
When processing a hardware update, HFI generates as many thermal netlink
events as needed to relay all the updated CPU capabilities to user space.
The constant HFI_MAX_THERM_NOTIFY_COUNT is the number of CPU capabilities
updated per each of those events.

Give this constant a more descriptive name.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
ba1a587ed6 thermal: intel: hfi: Shorten the thermal netlink event delay to 100ms
The delay between an HFI interrupt and its corresponding thermal netlink
event has so far been hard-coded to CONFIG_HZ jiffies (1 second). This
delay is too long for hardware that generates updates every tens of
milliseconds.

The HFI driver uses a delayed workqueue to send thermal netlink events. No
subsequent events will be sent if there is pending work.

As a result, much of the information of consecutive hardware updates will
be lost if the workqueue delay is too long. User space entities may act on
obsolete data. If the delay is too short, multiple events may overwhelm
listeners.

Set the delay to 100ms to strike a balance between too many and too few
events. Use milliseconds instead of jiffies to improve readability.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Ricardo Neri
564a88eb7a thermal: intel: hfi: Rename HFI_UPDATE_INTERVAL
The name of the constant HFI_UPDATE_INTERVAL is misleading. It is not a
periodic interval at which HFI updates are processed. It is the delay in
the processing of an HFI update after the arrival of an HFI interrupt.

Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-08 14:02:02 +02:00
Rafael J. Wysocki
9396b2a669 Merge branch 'thermal-core'
This includes a major rework of thermal governors and part of the
thermal core interacting with them as well as some fixes and cleanups
of the thermal debug code:

 - Redesign the thermal governor interface to allow the governors to
   work in a more straightforward way.

 - Make thermal governors take the current trip point thresholds into
   account in their computations which allows trip hysteresis to be
   observed more accurately.

 - Clean up thermal governors.

 - Make the thermal core manage passive polling for thermal zones and
   remove passive polling management from thermal governors.

 - Improve the handling of cooling device states and thermal mitigation
   episodes in progress in the thermal debug code.

 - Avoid excessive updates of trip point statistics and clean up the
   printing of thermal mitigation episode information.

* thermal-core: (27 commits)
  thermal: core: Move passive polling management to the core
  thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid
  thermal: trip: Add missing empty code line
  thermal/debugfs: Avoid printing zero duration for mitigation events in progress
  thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()
  thermal/debugfs: Create records for cdev states as they get used
  thermal: core: Introduce thermal_governor_trip_crossed()
  thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats
  thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
  thermal/debugfs: Clean up thermal_debug_update_temp()
  thermal/debugfs: Avoid excessive updates of trip point statistics
  thermal: core: Relocate critical and hot trip handling
  thermal: core: Drop the .throttle() governor callback
  thermal: gov_user_space: Use .trip_crossed() instead of .throttle()
  thermal: gov_fair_share: Eliminate unnecessary integer divisions
  thermal: gov_fair_share: Use trip thresholds instead of trip temperatures
  thermal: gov_fair_share: Use .manage() callback instead of .throttle()
  thermal: gov_step_wise: Clean up thermal_zone_trip_update()
  thermal: gov_step_wise: Use trip thresholds instead of trip temperatures
  thermal: gov_step_wise: Use .manage() callback instead of .throttle()
  ...
2024-05-06 12:24:30 +02:00
Rafael J. Wysocki
0021102527 Merge back thermal cotntrol material for v6.10. 2024-05-06 12:23:50 +02:00
Julien Panis
b66c079aab thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index
In 'lvts_should_update_thresh()' and 'lvts_ctrl_start()' functions,
the parameter passed to 'lvts_for_each_valid_sensor()' macro is always
'lvts_ctrl->lvts_data->lvts_ctrl'. In other words, the array index 0
is systematically passed as 'struct lvts_ctrl_data' type item, even
when another item should be consumed instead.

Hence, the 'valid_sensor_mask' value which is selected can be wrong
because unrelated to the 'struct lvts_ctrl_data' type item that should
be used. Hence, some thermal zone can be registered for a sensor 'i'
that does not actually exist. Because of the invalid address used
as 'lvts_sensor[i].msr', this situation ends up with a crash in
'lvts_get_temp()' function, where this 'msr' pointer is passed to
'readl_poll_timeout()' function. The following message is output:
"Unable to handle kernel NULL pointer dereference at virtual
address <msr>", with <msr> = 0.

This patch fixes the issue.

Fixes: 11e6f4c314 ("thermal/drivers/mediatek/lvts_thermal: Allow early empty sensor slots")
Signed-off-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240503-mtk-thermal-lvts-ctrl-idx-fix-v1-2-f605c50ca117@baylibre.com
2024-05-06 10:33:26 +02:00
Julien Panis
e2d2266a2a thermal/drivers/mediatek/lvts_thermal: Remove unused members from struct lvts_ctrl_data
In struct lvts_ctrl_data, num_lvts_sensor and cal_offset[] are not used.

Signed-off-by: Julien Panis <jpanis@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240503-mtk-thermal-lvts-ctrl-idx-fix-v1-1-f605c50ca117@baylibre.com
2024-05-06 10:33:26 +02:00
Julien Panis
a1191a7735 thermal/drivers/mediatek/lvts_thermal: Check NULL ptr on lvts_data
Verify that lvts_data is not NULL before using it.

Signed-off-by: Julien Panis <jpanis@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240502-mtk-thermal-lvts-data-v1-1-65f1b0bfad37@baylibre.com
2024-05-02 15:56:41 +02:00
Srinivas Pandruvada
48d722fd39 thermal: intel: Add missing module description
Fix warnings displayed by "make W=1" build:

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/intel_soc_dts_iosf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rapl.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_rfim.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_mbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_req.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_wt_hint.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/thermal/intel/int340x_thermal/processor_thermal_power_floor

Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-05-02 12:51:48 +02:00
Rafael J. Wysocki
042a3d80f1 thermal: core: Move passive polling management to the core
Passive polling is enabled by setting the 'passive' field in
struct thermal_zone_device to a positive value so long as the
'passive_delay_jiffies' field is greater than zero.  It causes
the thermal core to actively check the thermal zone temperature
periodically which in theory should be done after crossing a
passive trip point on the way up in order to allow governors to
react more rapidly to temperature changes and adjust mitigation
more precisely.

However, the 'passive' field in struct thermal_zone_device is currently
managed by governors which is quite problematic.  First of all, only
two governors, Step-Wise and Power Allocator, update that field at
all, so the other governors do not benefit from passive polling,
although in principle they should.  Moreover, if the zone governor is
changed from, say, Step-Wise to Fair-Share after 'passive' has been
incremented by the former, it is not going to be reset back to zero by
the latter even if the zone temperature falls down below all passive
trip points.

For this reason, make handle_thermal_trip() increment 'passive'
to enable passive polling for the given thermal zone whenever a
passive trip point is crossed on the way up and decrement it
whenever a passive trip point is crossed on the way down.  Also
remove the 'passive' field updates from governors and additionally
clear it in thermal_zone_device_init() to prevent passive polling
from being enabled after a system resume just beacuse it was enabled
before suspending the system.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 21:16:13 +02:00
Rafael J. Wysocki
202aa0d4bb thermal: core: Do not call handle_thermal_trip() if zone temperature is invalid
Make __thermal_zone_device_update() bail out if update_temperature()
fails to update the zone temperature because __thermal_zone_get_temp()
has returned an error and the current zone temperature is
THERMAL_TEMP_INVALID (user space receiving netlink thermal messages,
thermal debug code and thermal governors may get confused otherwise).

Fixes: 9ad18043fb ("thermal: core: Send trip crossing notifications at init time if needed")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 21:16:13 +02:00
Rafael J. Wysocki
1502718a66 thermal: trip: Add missing empty code line
Add missing empty line of code to thermal_zone_trip_id().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-30 13:04:56 +02:00
Rafael J. Wysocki
bd700ba9ad thermal/debugfs: Avoid printing zero duration for mitigation events in progress
If a thermal mitigation event is in progress, its duration value has
not been updated yet, so 0 will be printed as the event duration by
tze_seq_show() which is confusing.

Avoid doing that by marking the beginning of the event with the
KTIME_MIN duration value and making tze_seq_show() compute the current
event duration on the fly, in which case '>' will be printed instead of
'=' in the event duration value field.

Similarly, for trip points that have been crossed on the down, mark
the end of mitigation with the KTIME_MAX timestamp value and make
tze_seq_show() compute the current duration on the fly for the trip
points still involved in the mitigation, in which cases the duration
value printed by it will be prepended with a '>' character.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
31a0fa0019 thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add()
If cdev_dt_seq_show() runs before the first state transition of a cooling
device, it will not print any state residency information for it, even
though it might be reasonably expected to print residency information for
the initial state of the cooling device.

For this reason, rearrange the code to get the initial state of a cooling
device at the registration time and pass it to thermal_debug_cdev_add(),
so that the latter can create a duration record for that state which will
allow cdev_dt_seq_show() to print its residency information.

Fixes: 755113d767 ("thermal/debugfs: Add thermal cooling device debugfs information")
Reported-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
f4ae18fcb6 thermal/debugfs: Create records for cdev states as they get used
Because thermal_debug_cdev_state_update() only creates a duration record
for the old state of a cooling device, if its new state is used for the
first time, there will be no record for it and cdev_dt_seq_show() will
not print the duration information for it even though it contains code
to compute the duration value in that case.

Address this by making thermal_debug_cdev_state_update() create a
duration record for the new state if there is none.

Fixes: 755113d767 ("thermal/debugfs: Add thermal cooling device debugfs information")
Reported-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 15:01:56 +02:00
Rafael J. Wysocki
8c882f172f Merge back earlier thermal core changes for v6.10. 2024-04-26 15:00:56 +02:00
Rafael J. Wysocki
d351eb0ab0 thermal/debugfs: Prevent use-after-free from occurring after cdev removal
Since thermal_debug_cdev_remove() does not run under cdev->lock, it can
run in parallel with thermal_debug_cdev_state_update() and it may free
the struct thermal_debugfs object used by the latter after it has been
checked against NULL.

If that happens, thermal_debug_cdev_state_update() will access memory
that has been freed already causing the kernel to crash.

Address this by using cdev->lock in thermal_debug_cdev_remove() around
the cdev->debugfs value check (in case the same cdev is removed at the
same time in two different threads) and its reset to NULL.

Fixes: 755113d767 ("thermal/debugfs: Add thermal cooling device debugfs information")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 14:57:50 +02:00
Rafael J. Wysocki
c7f7c37271 thermal/debugfs: Fix two locking issues with thermal zone debug
With the current thermal zone locking arrangement in the debugfs code,
user space can open the "mitigations" file for a thermal zone before
the zone's debugfs pointer is set which will result in a NULL pointer
dereference in tze_seq_start().

Moreover, thermal_debug_tz_remove() is not called under the thermal
zone lock, so it can run in parallel with the other functions accessing
the thermal zone's struct thermal_debugfs object.  Then, it may clear
tz->debugfs after one of those functions has checked it and the
struct thermal_debugfs object may be freed prematurely.

To address the first problem, pass a pointer to the thermal zone's
struct thermal_debugfs object to debugfs_create_file() in
thermal_debug_tz_add() and make tze_seq_start(), tze_seq_next(),
tze_seq_stop(), and tze_seq_show() retrieve it from s->private
instead of a pointer to the thermal zone object.  This will ensure
that tz_debugfs will be valid across the "mitigations" file accesses
until thermal_debugfs_remove_id() called by thermal_debug_tz_remove()
removes that file.

To address the second problem, use tz->lock in thermal_debug_tz_remove()
around the tz->debugfs value check (in case the same thermal zone is
removed at the same time in two different threads) and its reset to NULL.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 11:11:30 +02:00
Rafael J. Wysocki
72c1afffa4 thermal/debugfs: Free all thermal zone debug memory on zone removal
Because thermal_debug_tz_remove() does not free all memory allocated for
thermal zone diagnostics, some of that memory becomes unreachable after
freeing the thermal zone's struct thermal_debugfs object.

Address this by making thermal_debug_tz_remove() free all of the memory
in question.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-26 11:11:00 +02:00
Rafael J. Wysocki
f831892e23 thermal: core: Introduce thermal_governor_trip_crossed()
Add a wrapper around the .trip_crossed() governor callback invocation
to reduce code duplications slightly and improve the code layout in
__thermal_zone_device_update().

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 20:45:06 +02:00
Rafael J. Wysocki
a6258fde8d thermal/debugfs: Make tze_seq_show() skip invalid trips and trips with no stats
Currently, tze_seq_show() output includes all of the trips in the zone
except for critical ones, including invalid trips and trips with no stats
which is confusing.

Make it skip the trips for which there is not mitigation information.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:11 +02:00
Rafael J. Wysocki
8dff6e8438 thermal/debugfs: Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
Rename thermal_debug_update_temp() to thermal_debug_update_trip_stats()
which is a better match for the purpose of the function.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:09 +02:00
Rafael J. Wysocki
e271f9974d thermal/debugfs: Clean up thermal_debug_update_temp()
Notice that it is not necessary to compute tze in every iteration of the
for () loop in thermal_debug_update_temp() because it is the same for all
trips, so compute it once before the loop starts.

Also use a trip_stats local variable to make the code in that loop easier
to follow and move the trip_id variable definition into that loop because
it is not used elsewhere in the function.

While at it, change to order of local variable definitions in the function
to follow the reverse-xmas-tree pattern.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:06 +02:00
Rafael J. Wysocki
0a293c7758 thermal/debugfs: Avoid excessive updates of trip point statistics
Since thermal_debug_update_temp() is called before invoking
thermal_debug_tz_trip_down() for the trips that were crossed by the
zone temperature on the way up, it updates the statistics for them
as though the current zone temperature was above the low temperature
of each of them.  However, if a given trip has just been crossed on the
way down, the zone temperature is in fact below its low temperature,
but this is handled by thermal_debug_tz_trip_down() running after the
update of the trip statistics.

The remedy is to call thermal_debug_update_temp() after
thermal_debug_tz_trip_down() has been invoked for all of the
trips in question, but then thermal_debug_tz_trip_up() needs to
be adjusted, so it does not update the statistics for the trips
that has just been crossed on the way up, as that will be taken
care of by thermal_debug_update_temp() down the road.

Modify the code accordingly.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:43:02 +02:00
Rafael J. Wysocki
2ae0998c67 thermal: core: Relocate critical and hot trip handling
Modify handle_thermal_trip() to call handle_critical_trips() only after
finding that the trip temperature has been crossed on the way up and
remove the redundant temperature check from the latter.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:39 +02:00
Rafael J. Wysocki
ad2f8bccd0 thermal: core: Drop the .throttle() governor callback
Since all of the governors in the tree have been switched over to using
the new callbacks, either .trip_crossed() or .manage(), the .throttle()
governor callback is not used any more, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:31 +02:00
Rafael J. Wysocki
c1beda1cfc thermal: gov_user_space: Use .trip_crossed() instead of .throttle()
Notifying user space about trip points that have not been crossed is
not particularly useful, so modify the User Space governor to use the
.trip_crossed() callback, which is only invoked for trips that have been
crossed, instead of .throttle() that is invoked for all trips in a
thermal zone every time the zone is updated.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 20:42:10 +02:00
Vincent Guittot
c281afe24f thermal/cpufreq: Remove arch_update_thermal_pressure()
arch_update_thermal_pressure() aims to update fast changing signal which
should be averaged using PELT filtering before being provided to the
scheduler which can't make smart use of fast changing signal.
cpufreq now provides the maximum freq_qos pressure on the capacity to the
scheduler, which includes cpufreq cooling device. Remove the call to
arch_update_thermal_pressure() in cpufreq cooling device as this is
handled by cpufreq_get_pressure().

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Link: https://lore.kernel.org/r/20240326091616.3696851-4-vincent.guittot@linaro.org
2024-04-24 12:08:00 +02:00
Rafael J. Wysocki
c98e24795e thermal: gov_fair_share: Eliminate unnecessary integer divisions
The computations carried out by fair_share_throttle() for each trip
point include at least one redundant integer division which introduces
superfluous rounding errors.  Also the multiplications by 100 in it are
not really necessary and can be eliminated.

Rearrange fair_share_throttle() to carry out only one integer division per
trip and only as many integer multiplications as necessary and rename one
variable in it (while at it).

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:08 +02:00
Rafael J. Wysocki
0292991ce4 thermal: gov_fair_share: Use trip thresholds instead of trip temperatures
In principle, the Fair Share governor should take trip hysteresis
into account.  After all, once a trip has been crossed on the way up,
mitigation is still needed until it is crossed on the way down.

For this reason, make it use trip thresholds that are computed by
the core when trips are crossed, so as to apply mitigations if the
zone temperature is in a hysteresis rage of one or more trips that
were crossed on the way up, but have not been crossed on the way
down yet.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:06 +02:00
Rafael J. Wysocki
bec55332c2 thermal: gov_fair_share: Use .manage() callback instead of .throttle()
The Fair Share governor tries very hard to be stateless and so it
calls get_trip_level() from fair_share_throttle() every time, even
though the number produced by this function for all of the trips
during a given thermal zone update is actually the same.  Since
get_trip_level() walks all of the trips in the thermal zone every
time it is called, doing this may generate quite a bit of completely
useless overhead.

For this reason, make the governor use the new .manage() callback
instead of .throttle() which allows it to call get_trip_level() just
once and use the value computed by it to handle all of the trips.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:15:04 +02:00
Rafael J. Wysocki
fe03626650 thermal: gov_step_wise: Clean up thermal_zone_trip_update()
Do some assorted cleanups in thermal_zone_trip_update():

 * Compute the trend value upfront.
 * Move old_target definition to the block where it is used.
 * Adjust white space around diagnostic messages and locking.
 * Use suitable field formatting in a message to avoid an explicit
   cast to int.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-24 10:14:58 +02:00
Rafael J. Wysocki
e4065f144f thermal: gov_step_wise: Use trip thresholds instead of trip temperatures
In principle, the Step-Wise governor should take trip hysteresis into
account.  After all, once a trip has been crossed on the way up,
mitigation is still needed until it is crossed on the way down.

For this reason, make it use trip thresholds that are computed by
the core when trips are crossed, so as to apply mitigations in the
hysteresis rages of trips that were crossed on the way up, but have
not been crossed on the way down yet.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 10:14:33 +02:00
Rafael J. Wysocki
a6ce8c7da5 thermal: gov_step_wise: Use .manage() callback instead of .throttle()
Make the Step-Wise governor use the new .manage() callback instead of
.throttle().

Even though using .throttle() is not particularly problematic for the
Step-Wise governor, using .manage() instead still allows it to reduce
overhead by updating all of the cooling devices once after setting
target values for all of the thermal instances.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-24 10:13:53 +02:00
Rafael J. Wysocki
ca0e9728d3 thermal: gov_power_allocator: Eliminate a redundant variable
Notice that the passive field in struct thermal_zone_device is not
used by the Power Allocator governor itself and so the ordering of
its updates with respect to allow_maximum_power() or allocate_power()
does not matter.

Accordingly, make power_allocator_manage() update that field right
before returning, which allows the current value of it to be passed
directly to allow_maximum_power() without using the additional update
variable that can be dropped.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-23 20:39:50 +02:00
Rafael J. Wysocki
41ddbcc6fd thermal: gov_power_allocator: Use .manage() callback instead of .throttle()
The Power Allocator governor really only wants to be called once per
thermal zone update and it does a special check to skip the extra,
from its perspective, invocations of the .throttle() callback.

Make it use .manage() instead of .throttle().

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-23 20:39:50 +02:00
Rafael J. Wysocki
976f44133f thermal: core: Introduce .manage() callback for thermal governors
Introduce a new thermal governor callback called .manage() that will be
invoked once per thermal zone update after processing all of the trip
points in the core.

This will allow governors that look at multiple trip points together
to check all of them in a consistent configuration, so they don't need
to play tricks with skipping .throttle() invocations that they are not
interested in and they can avoid carrying out the same computations for
multiple times in one cycle.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:41 +02:00
Rafael J. Wysocki
0ae204a667 thermal: gov_bang_bang: Fold thermal_zone_trip_update() into its caller
Fold thermal_zone_trip_update() into bang_bang_control() which is the
only caller of it to reduce code size and make it easier to follow.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:26 +02:00
Rafael J. Wysocki
4526c58109 thermal: gov_bang_bang: Clean up thermal_zone_trip_update()
Do the following cleanups in thermal_zone_trip_update():

 * Drop the useless "zero hysteresis" message.
 * Eliminate the trip_index local variable that is redundant.
 * Drop 2 comments that are not useful.
 * Downgrade a diagnostic message from pr_warn() to pr_debug().
 * Use consistent field formatting in diagnostic messages.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:38:06 +02:00
Rafael J. Wysocki
530c932bdf thermal: gov_bang_bang: Use .trip_crossed() instead of .throttle()
The Bang-Bang governor really is only concerned about trip point
crossing, so it can use the new .trip_crossed() callback instead of
.throttle() that is not particularly suitable for it.

Modify it to do so which also takes trip hysteresis into account, so the
governor does not need to use it directly any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-23 20:37:41 +02:00
Greg Kroah-Hartman
e5019b1423 Merge 6.9-rc5 into driver-core-next
We want the kernfs fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-23 13:27:43 +02:00
Binbin Zhou
734b5def91 thermal/drivers/loongson2: Add Loongson-2K2000 support
The Loongson-2K2000 and Loongson-2K1000 have similar thermal sensors,
except that the temperature is read differently.

In particular, the temperature output registers of the Loongson-2K2000
are defined in the chip configuration domain and are read in a different
way.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/fdbfdcc3231a36a4ee0bcf1377ddcbd6f8c944b5.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Binbin Zhou
6ef6d5ff5e thermal/drivers/loongson2: Trivial code style adjustment
Here are some minor code style adjustment. Such as fix whitespace code
style; align function call arguments to opening parenthesis.

Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/ccca50f2ad3fd8c84fcbfcb1f875427ea7f637a0.1713837379.git.zhoubinbin@loongson.cn
2024-04-23 12:40:30 +02:00
Nicolas Pitre
f4745f546e thermal/drivers/mediatek/lvts_thermal: Add MT8188 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-15-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
11e6f4c314 thermal/drivers/mediatek/lvts_thermal: Allow early empty sensor slots
Some systems don't always populate sensor controller slots starting
at slot 0. Use a bitmap instead of a count to indicate valid sensor
slots. Also create a pretty iterator for that.

About that iterator: it causes checkpatch to complain with "ERROR:
Macros with multiple statements should be enclosed in a do - while
loop". However this is not possible here. And many similar iterators
do exist using the same form in the tree already.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-12-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
684cbb49f9 thermal/drivers/mediatek/lvts_thermal: Provision for gt variable location
The golden temperature calibration value in nvram is not always the
3rd byte. A future commit will prove this assumption wrong.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-11-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
a4c1ab2f4c thermal/drivers/mediatek/lvts_thermal: Add MT8186 support
Various values extracted from the vendor's kernel driver.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-9-nico@fluxnic.net
2024-04-23 12:40:30 +02:00
Nicolas Pitre
2cc0b1a216 thermal/drivers/mediatek/lvts_thermal: Guard against efuse data buffer overflow
We don't want to silently fetch garbage past the actual buffer.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-6-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
5b3367e28a thermal/drivers/mediatek/lvts_thermal: Use offsets for every calibration byte
Current code assumes calibration values are always stored contiguously
in host endian order. A future patch will prove this wrong.

Let's specify the offset for each calibration byte instead.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-5-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
554bca3130 thermal/drivers/mediatek/lvts_thermal: Remove .hw_tshut_temp
All the .hw_tshut_temp instances are initialized with the same value.
Let's remove those and use a common definition instead. If ever a
different value must be used in the future then an override parameter
could be added back.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-4-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
62194e637d thermal/drivers/mediatek/lvts_thermal: Move comment
Move efuse data interpretation inside lvts_golden_temp_init() alongside
the actual code retrieving wanted value.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-3-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Nicolas Pitre
8c25958f71 thermal/drivers/mediatek/lvts_thermal: Retrieve all calibration bytes
Calibration values are 24-bit wide. Those values so far appear to span
only 16 bits but let's not push our luck.

Found while looking at the original Mediatek driver code.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240402032729.2736685-2-nico@fluxnic.net
2024-04-23 12:40:29 +02:00
Christophe JAILLET
61fad0a906 thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgap
In "struct k3_bandgap", the 'conf' field is unused.
Remove it.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/206d39bed9ec6df9b4d80b1fc064e499389fc7fc.1712687420.git.christophe.jaillet@wanadoo.fr
2024-04-23 12:40:29 +02:00
Christophe JAILLET
0a0c8db884 thermal/drivers/qcom: Remove some unused fields in struct qpnp_tm_chip
In "struct qpnp_tm_chip", the 'prev_stage' field is unused.
Remove it.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/d1c3a3c455f485dae46290e3488daf1dcc1d355a.1712687589.git.christophe.jaillet@wanadoo.fr
2024-04-23 12:40:29 +02:00
Aleksandr Mishin
d998ddc86a thermal/drivers/tsens: Fix null pointer dereference
compute_intercept_slope() is called from calibrate_8960() (in tsens-8960.c)
as compute_intercept_slope(priv, p1, NULL, ONE_PT_CALIB) which lead to null
pointer dereference (if DEBUG or DYNAMIC_DEBUG set).
Fix this bug by adding null pointer check.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: dfc1193d4d ("thermal/drivers/tsens: Replace custom 8960 apis with generic apis")
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240411114021.12203-1-amishin@t-argos.ru
2024-04-23 12:40:29 +02:00
Hsin-Te Yuan
7954c92ede thermal/drivers/mediatek/lvts_thermal: Add coeff for mt8192
In order for lvts_raw_to_temp to function properly on mt8192,
temperature coefficients for mt8192 need to be added.

Fixes: 288732242d ("thermal/drivers/mediatek/lvts_thermal: Add mt8192 support")
Signed-off-by: Hsin-Te Yuan <yuanhsinte@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240416-lvts_thermal-v2-1-f8a36882cc53@chromium.org
2024-04-23 12:40:29 +02:00
Niklas Söderlund
63d23fb781 thermal/drivers/rcar_gen3: Update temperature approximation calculation
The initial driver used a formula to approximate the temperature and
register values reversed engineered from an out-of-tree BSP driver. This
was needed as the datasheet at the time did not contain any information
on how to do this. Later Gen3 (Rev 2.30) and Gen4 (all) now contains
this information.

Update the approximation formula to use the datasheet's information
instead of the reversed-engineered one.

On an idle M3-N without fused calibration values for PTAT and THCODE the
old formula reports,

    zone0: 52000
    zone1: 53000
    zone2: 52500

While the new formula under the same circumstances reports,

    zone0: 52500
    zone1: 54000
    zone2: 54000

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240327133013.3982199-3-niklas.soderlund+renesas@ragnatech.se
2024-04-23 12:40:29 +02:00
Niklas Söderlund
566e0ea7d0 thermal/drivers/rcar_gen3: Move Tj_T storage to shared private data
The calculated Tj_T constant is calculated from the PTAT data either
read from the first TSC zone on the device if calibration data is fused,
or from fallback values in the driver itself. The value calculated is
shared among all TSC zones.

Move the Tj_T constant to the shared private data structure instead of
duplicating it in each TSC private data. This requires adding a pointer
to the shared data to the TSC private data structure. This back pointer
make it easier to further rework the temperature conversion logic.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240327133013.3982199-2-niklas.soderlund+renesas@ragnatech.se
2024-04-23 12:40:29 +02:00
Dmitry Rokosov
7fcd7dfa5e thermal/drivers/amlogic: Support A1 SoC family Thermal Sensor controller
In comparison to other Amlogic chips, there is one key difference.
The offset for the sec_ao base, also known as u_efuse_off, is special,
while other aspects remain the same.

Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240328191322.17551-3-ddrokosov@salutedevices.com
2024-04-23 12:40:29 +02:00
Priyansh Jain
34b9a92b68 thermal/drivers/tsens: Add suspend to RAM support for tsens
As part of suspend to RAM, tsens hardware will be turned off.
While resume callback, re-initialize tsens hardware.

Signed-off-by: Priyansh Jain <quic_priyjain@quicinc.com>
Acked-by: Amit Kucheria <amitk@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240328050230.31770-1-quic_priyjain@quicinc.com
2024-04-23 12:40:29 +02:00
Rasmus Villemoes
58b1569244 thermal/drivers/armada: Simplify name sanitization
Simplify the code by using the helper we have for doing exactly this.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240320104940.65031-1-linux@rasmusvillemoes.dk
2024-04-23 12:40:29 +02:00
Konrad Dybcio
d9d3490c48 thermal/drivers/qcom/lmh: Check for SCM availability at probe
Up until now, the necessary scm availability check has not been
performed, leading to possible null pointer dereferences (which did
happen for me on RB1).

Fix that.

Fixes: 53bca371cd ("thermal/drivers/qcom: Add support for LMh driver")
Cc: <stable@vger.kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240308-topic-rb1_lmh-v2-2-bac3914b0fe3@linaro.org
2024-04-23 12:40:29 +02:00
Rafael J. Wysocki
80f5fd45c7 thermal: core: Introduce .trip_crossed() callback for thermal governors
Introduce a new thermal governor callback called .trip_crossed()
that will be invoked whenever a trip point is crossed by the zone
temperature, either on the way up or on the way down.

The trip crossing direction information will be passed to it and if
multiple trips are crossed in the same direction during one thermal zone
update, the new callback will be invoked for them in temperature order,
either ascending or descending, depending on the trip crossing
direction.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-19 22:11:06 +02:00
Rafael J. Wysocki
5c897a9a12 Merge back earlier thermal control material for v6.10. 2024-04-19 15:17:21 +02:00
Rafael J. Wysocki
b552f63cd4 thermal/debugfs: Add missing count increment to thermal_debug_tz_trip_up()
The count field in struct trip_stats, representing the number of times
the zone temperature was above the trip point, needs to be incremented
in thermal_debug_tz_trip_up(), for two reasons.

First, if a trip point is crossed on the way up for the first time,
thermal_debug_update_temp() called from update_temperature() does
not see it because it has not been added to trips_crossed[] array
in the thermal zone's struct tz_debugfs object yet.  Therefore, when
thermal_debug_tz_trip_up() is called after that, the trip point's
count value is 0, and the attempt to divide by it during the average
temperature computation leads to a divide error which causes the kernel
to crash.  Setting the count to 1 before the division by incrementing it
fixes this problem.

Second, if a trip point is crossed on the way up, but it has been
crossed on the way up already before, its count value needs to be
incremented to make a record of the fact that the zone temperature is
above the trip now.  Without doing that, if the mitigations applied
after crossing the trip cause the zone temperature to drop below its
threshold, the count will not be updated for this episode at all and
the average temperature in the trip statistics record will be somewhat
higher than it should be.

Fixes: 7ef01f228c ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Cc :6.8+ <stable@vger.kernel.org> # 6.8+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-19 15:08:19 +02:00
Rafael J. Wysocki
0dbf608717 Merge branch 'thermal-intel' into thermal
* thermal-intel:
  thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy()
  thermal: intel: hfi: Enable HFI only when required
  thermal: netlink: Rename thermal_gnl_family
  thermal: netlink: Add genetlink bind/unbind notifications
2024-04-15 15:45:32 +02:00
Lukas Wunner
66bc1a1733 treewide: Use sysfs_bin_attr_simple_read() helper
Deduplicate ->read() callbacks of bin_attributes which are backed by a
simple buffer in memory:

Use the newly introduced sysfs_bin_attr_simple_read() helper instead,
either by referencing it directly or by declaring such bin_attributes
with BIN_ATTR_SIMPLE_RO() or BIN_ATTR_SIMPLE_ADMIN_RO().

Aside from a reduction of LoC, this shaves off a few bytes from vmlinux
(304 bytes on an x86_64 allyesconfig).

No functional change intended.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Zhi Wang <zhiwang@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/92ee0a0e83a5a3f3474845db6c8575297698933a.1712410202.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-11 16:02:25 +02:00
Sumeet Pawnikar
79b510c492 ACPI: DPTF: Add Lunar Lake support
Add Lunar Lake ACPI IDs for DPTF devices.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-08 16:10:27 +02:00
Rafael J. Wysocki
0444d574fb thermal: core: Relocate the struct thermal_governor definition
Notice that struct thermal_governor is only used by the thermal core
and so move its definition to thermal_core.h.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:50 +02:00
Rafael J. Wysocki
7454f2c42c thermal: core: Sort trip point crossing notifications by temperature
If multiple trip points are crossed in one go and the trips table in
the thermal zone device object is not sorted, the corresponding trip
point crossing notifications sent to user space will not be ordered
either.

Moreover, if the trips table is sorted by trip temperature in ascending
order, the trip crossing notifications on the way up will be sent in that
order too, but the trip crossing notifications on the way down will be
sent in the reverse order.

This is generally confusing and it is better to make the kernel send the
notifications in the order of growing (on the way up) or falling (on the
way down) trip temperature.

To achieve that, instead of sending a trip crossing notification and
recording a trip crossing event in the statistics right away from
handle_thermal_trip(), put the trip in question on a list that will be
sorted by __thermal_zone_device_update() after processing all of the trips
and before sending the notifications and recording trip crossing events.

Link: https://lore.kernel.org/linux-pm/20240306085428.88011-1-daniel.lezcano@linaro.org/
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
9ad18043fb thermal: core: Send trip crossing notifications at init time if needed
If a trip point is already exceeded by the zone temperature at the
initialization time, no trip crossing notification is send regarding
this even though mitigation should be started then.

Address this by rearranging the code in handle_thermal_trip() to
send a trip crossing notification for trip points already exceeded
by the zone temperature initially which also allows to reduce its
size by using the observation that the initialization and regular
trip crossing on the way up become the same case then.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
f99c1b87a9 thermal: core: Rewrite comments in handle_thermal_trip()
Make the comments regarding trip crossing and threshold updates in
handle_thermal_trip() slightly more clear.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
b1ae92dcfa thermal: core: Make struct thermal_zone_device definition internal
Move the definitions of struct thermal_trip_desc and struct
thermal_zone_device to an internal header file in the thermal core,
as they don't need to be accessible to any code other than the thermal
core and so they don't need to be present in a global header.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
daeeb032f4 thermal: core: Move threshold out of struct thermal_trip
The threshold field in struct thermal_trip is only used internally by
the thermal core and it is better to prevent drivers from misusing it.
It also takes some space unnecessarily in the trip tables passed by
drivers to the core during thermal zone registration.

For this reason, introduce struct thermal_trip_desc as a wrapper around
struct thermal_trip, move the threshold field directly into it and make
the thermal core store struct thermal_trip_desc objects in the internal
thermal zone trip tables.  Adjust all of the code using trip tables in
the thermal core accordingly.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 16:01:20 +02:00
Rafael J. Wysocki
053b852c46 thermal: gov_step_wise: Simplify checks related to passive trips
Make it more clear from the code flow that the passive polling status
updates only take place for passive trip points.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 15:57:50 +02:00
Rafael J. Wysocki
f1164d333c thermal: gov_step_wise: Simplify get_target_state()
The step-wise governor's get_target_state() function contains redundant
braces, redundant parens and a redundant next_target local variable, so
get rid of all that stuff.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
2024-04-08 15:57:50 +02:00
Nikita Travkin
da781936e7 thermal: gov_power_allocator: Allow binding without trip points
IPA probe function was recently refactored to perform extra error checks
and make sure the thermal zone has trip points necessary for the IPA
operation. With this change, if a thermal zone is probed such that it
has no trip points that IPA can use, IPA will fail and the TZ won't be
created. This is the case if a platform defines a TZ without cooling
devices and only with "hot"/"critical" trip points, often found on some
Qualcomm devices [1].

Documentation across IPA code (notably get_governor_trips() kerneldoc)
suggests that IPA is supposed to handle such TZ even if it won't
actually do anything.

This commit partially reverts the previous change to allow IPA to bind
to such "empty" thermal zones.

Fixes: e83747c2f8 ("thermal: gov_power_allocator: Set up trip points earlier")
Link: arch/arm64/boot/dts/qcom/sc7180.dtsi#n4776 # [1]
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-03 16:32:15 +02:00
Nikita Travkin
1057c4c36e thermal: gov_power_allocator: Allow binding without cooling devices
IPA was recently refactored to split out memory allocation into a
separate funciton. That funciton was made to return -EINVAL if there is
zero power_actors and thus no memory to allocate. This causes IPA to
fail probing when the thermal zone has no attached cooling devices.

Since cooling devices can attach after the thermal zone is created and
the governer is attached to it, failing probe due to the lack of cooling
devices is incorrect.

Change the allocate_actors_buffer() to return success when there is no
cooling devices present.

Fixes: 912e97c67c ("thermal: gov_power_allocator: Move memory allocation out of throttle()")
Signed-off-by: Nikita Travkin <nikita@trvn.ru>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-04-03 16:32:14 +02:00
Ye Zhang
a26de34b3c thermal: devfreq_cooling: Fix perf state when calculate dfc res_util
The issue occurs when the devfreq cooling device uses the EM power model
and the get_real_power() callback is provided by the driver.

The EM power table is sorted ascending,can't index the table by cooling
device state,so convert cooling state to performance state by
dfc->max_state - dfc->capped_state.

Fixes: 615510fe13 ("thermal: devfreq_cooling: remove old power model and use EM")
Cc: 5.11+ <stable@vger.kernel.org> # 5.11+
Signed-off-by: Ye Zhang <ye.zhang@rock-chips.com>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 16:27:39 +01:00
Justin Stitt
03fa9a3ad1 thermal: intel: int340x_thermal: replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

psvt->limit.string can only be 8 bytes so let's use the appropriate size
macro ACPI_LIMIT_STR_MAX_LEN.

Neither psvt->limit.string or psvt_user[i].limit.string requires the
NUL-padding behavior that strncpy() provides as they have both been
filled with NUL-bytes prior to the string operation.
|	memset(&psvt->limit, 0, sizeof(u64));
and
| 	psvt_user = kzalloc(psvt_len, GFP_KERNEL);

Let's use `strscpy` [2] due to the fact that it guarantees
NUL-termination on the destination buffer without unnecessarily
NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings # [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:57:44 +01:00
Stanislaw Gruszka
b33f3d2677 thermal: intel: hfi: Enable HFI only when required
Enable and disable hardware feedback interface (HFI) when user space
handler is present. For example, enable HFI, when intel-speed-select or
Intel Low Power daemon is running and subscribing to thermal netlink
events. When user space handlers exit or remove subscription for
thermal netlink events, disable HFI.

Summary of changes:

 - Register a thermal genetlink notifier

 - In the notifier, process THERMAL_NOTIFY_BIND and THERMAL_NOTIFY_UNBIND
   reason codes to count number of thermal event group netlink multicast
   clients. If thermal netlink group has any listener enable HFI on all
   packages. If there are no listener disable HFI on all packages.

- When CPU is online, instead of blindly enabling HFI, check if
  the thermal netlink group has any listener. This will make sure that
  HFI is not enabled by default during boot time.

- Actual processing to enable/disable matches what is done in
  suspend/resume callbacks. Create two functions hfi_enable_instance()
  and hfi_disable_instance(), which can be called from the netlink
  notifier callback and suspend/resume callbacks.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Stanislaw Gruszka
afdaff3706 thermal: netlink: Rename thermal_gnl_family
Almost all thermal netlink structures use thermal_genl prefix.
Change thermal_gnl_family name accordingly for consistency.

No functional impact.

Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Stanislaw Gruszka
cf580ad490 thermal: netlink: Add genetlink bind/unbind notifications
Introduce a new feature to the thermal netlink framework, enabling the
registration of sub drivers to receive events via a notifier mechanism.
Specifically, implement genetlink family bind and unbind callbacks to send
BIND and UNBIND events.

The primary purpose of this enhancement is to facilitate the tracking of
user-space consumers by the Intel HFI driver. By leveraging these
notifications, the driver can determine when consumers are present
or absent.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-27 14:50:26 +01:00
Daniel Lezcano
f67cf45dee Revert "thermal: core: Don't update trip points inside the hysteresis range"
It has been reported the commit cf3986f8c0 introduced a regression
when the temperature is wavering in the hysteresis region. The
mitigation stops leading to an uncontrolled temperature increase until
reaching the critical trip point.

Here what happens:

 * 'throttle' is when the current temperature is greater than the trip
   point temperature
 * 'target' is the mitigation level
 * 'passive' is positive when there is a mitigation, zero otherwise
 * these values are computed in the step_wise governor

Configuration:

 trip point 1: temp=95°C, hyst=5°C (passive)
 trip point 2: temp=115°C, hyst=0°C (critical)
 governor: step_wise

 1. The temperature crosses the way up the trip point 1 at 95°C

   - trend=raising
   - throttle=1, target=1
   - passive=1
   - set_trips: low=90°C, high=115°C

 2. The temperature decreases but stays in the hysteresis region at
    93°C

   - trend=dropping
   - throttle=0, target=0
   - passive=1

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

 3. The temperature increases a bit but stays in the hysteresis region
    at 94°C (so below the trip point 1 temp 95°C)

   - trend=raising
   - throttle=0, target=0
   - passive=1

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

 4. The temperature decreases but stays in the hysteresis region at
    93°C

   - trend=dropping
   - throttle=0, target=THERMAL_NO_TARGET
   - passive=0

   Before cf3986f8c0
   - set_trips: low=90°C, high=95°C

   After cf3986f8c0
   - set_trips: low=90°C, high=115°C

At this point, the 'passive' value is zero, there is no mitigation,
the temperature is in the hysteresis region, the next trip point is
115°C. As 'passive' is zero, the timer to monitor the thermal zone is
disabled. Consequently if the temperature continues to increase, no
mitigation will happen and it will reach the 115°C trip point and
reboot.

Before the optimization, the high boundary would have been 95°C, thus
triggering the mitigation again and rearming the polling timer.

The optimization make sense but given the current implementation of
the step_wise governor collaborating via this 'passive' flag with the
core framework it can not work.

From a higher perspective it seems like there is a problem between the
governor which sets a variable to be used by the core framework. That
sounds akward and it would make much more sense if the core framework
controls the governor and not the opposite. But as the devil hides in
the details, there are some subtilities to be addressed before.

Elaborating those would be out of the scope this changelog. So let's
stay simple and revert the change first to fixup all broken mobile
platforms.

This reverts commit cf3986f8c0 ("thermal: core: Don't update trip
points inside the hysteresis range") and takes a conflict with commit
0c0c4740c9 ("0c0c4740c9d2 thermal: trip: Use for_each_trip() in
__thermal_zone_set_trips()") in drivers/thermal/thermal_trip.c into
account.

Fixes: cf3986f8c0 ("thermal: core: Don't update trip points inside the hysteresis range")
Reported-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Cc: 6.7+ <stable@vger.kernel.org> # 6.7+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-26 13:18:13 +01:00
Rafael J. Wysocki
4e7193acde - Fix memory leak in the error path at probe time in the Mediatek LVTS
driver (Christophe Jaillet)
 
 - Fix control buffer enablement regression on Meditek MT7896 (Frank
   Wunderlich)
 
 - Drop spaces before TABs in different places: thermal-of, ST drivers
   and Makefile (Geert Uytterhoeven)
 
 - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
   among several SoC versions (Fabio Estevam)
 
 - Add support for H616 THS controller for the Sun8i platforms. Note
   that this change relies on another change in the SoC specific code
   which is included in this branch (Martin Botka)
 
 - Don't fail probe due to zone registration failure because there is
   no trip points defined in the DT (Mark Brown)
 
 - Support variable TMU array size for new platforms (Peng Fan)
 
 - Adjust the DT binding for thermal-of and make the polling time not
   required and assume it is zero when not found in the DT (Konrad
   Dybcio)
 
 - Add r8a779h0 support in both the DT and the driver (Geert Uytterhoeven)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGn3N4YVz0WNVyHskqDIjiipP6E8FAmXxqsUACgkQqDIjiipP
 6E9jDgf+NpBl2l9Wb8EhFcB+QRcOY79rn8KzifM4XdU1qwoog3I64lt5mAcznpbf
 7hLK9DQJYjuIx8oi/uxKaYSCno1bwuY/Vpi0/BBQ2DxhhX26C7XfZ+D4o1M3/ciZ
 /jsoaXg3WMKYVeywaHohWxjWH+rLGXxmxDlfcAXlKdAZkbEKLCzsbaRaSy/H8Xhc
 zk9Cr2RdV+HHcuAWK1TGEHUGXINSR5tk5TvY7QCfvLYsloGpSLi6vr68CAEpvBQB
 6pxW8BkAWAVQy6pgP3WthzpE+ZkEnv1vhu7fn0uoEjuxedqvdj+PKDC4svm1k/pA
 +8sV89A5RZi1zOoWWYj8TnH/K2W85w==
 =rTh4
 -----END PGP SIGNATURE-----

Merge tag 'thermal-v6.9-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux

Merge additional thermal control changes for 6.9-rc1 from Daniel Lezcano:

"- Fix memory leak in the error path at probe time in the Mediatek LVTS
   driver (Christophe Jaillet)

 - Fix control buffer enablement regression on Meditek MT7896 (Frank
   Wunderlich)

 - Drop spaces before TABs in different places: thermal-of, ST drivers
   and Makefile (Geert Uytterhoeven)

 - Adjust DT binding for NXP as fsl,tmu-range min/maxItems can vary
   among several SoC versions (Fabio Estevam)

 - Add support for H616 THS controller for the Sun8i platforms. Note
   that this change relies on another change in the SoC specific code
   which is included in this branch (Martin Botka)

 - Don't fail probe due to zone registration failure because there is
   no trip points defined in the DT (Mark Brown)

 - Support variable TMU array size for new platforms (Peng Fan)

 - Adjust the DT binding for thermal-of and make the polling time not
   required and assume it is zero when not found in the DT (Konrad
   Dybcio)

 - Add r8a779h0 support in both the DT and the driver (Geert Uytterhoeven)"

* tag 'thermal-v6.9-rc1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal/drivers/rcar_gen3: Add support for R-Car V4M
  dt-bindings: thermal: rcar-gen3-thermal: Add r8a779h0 support
  thermal/of: Assume polling-delay(-passive) 0 when absent
  dt-bindings: thermal-zones: Don't require polling-delay(-passive)
  thermal/drivers/qoriq: Fix getting tmu range
  thermal/drivers/sun8i: Don't fail probe due to zone registration failure
  thermal/drivers/sun8i: Add support for H616 THS controller
  thermal/drivers/sun8i: Add SRAM register access code
  thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
  thermal/drivers/sun8i: Explain unknown H6 register value
  dt-bindings: thermal: sun8i: Add H616 THS controller
  soc: sunxi: sram: export register 0 for THS on H616
  dt-bindings: thermal: qoriq-thermal: Adjust fsl,tmu-range min/maxItems
  thermal: Drop spaces before TABs
  thermal/drivers/mediatek: Fix control buffer enablement on MT7896
  thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
2024-03-13 20:35:48 +01:00
Linus Torvalds
259f7d5e2b Thermal control updates for 6.9-rc1
- Store zone trips table and zone operations directly in struct
    thermal_zone_device (Rafael Wysocki).
 
  - Fix up flex array initialization during thermal zone device
    registration (Nathan Chancellor).
 
  - Rework writable trip points handling in the thermal core and
    several drivers (Rafael Wysocki).
 
  - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi).
 
  - Use thermal zone accessor functions in the int340x Intel thermal
    driver (Rafael Wysocki).
 
  - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas
    Pandruvada).
 
  - Minor fixes for thermal governors (Rafael Wysocki, Di Shen).
 
  - Trip point handling fixes for the iwlwifi wireless driver (Rafael
    Wysocki).
 
  - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmXvJ0oSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx63MQAIAzLFMzDqbG5bFq096tREuwtYhkQMq/
 n3ZW+FhMfSr5MnGiTCelk/6auYvijMweylxrnfQM8ilIrSWVc2fNks6PTjI/hTe6
 OUfF+nEAu+fv6I68p3evlI+IL7cncU1kygYhDRr6yxh5AFDn/BED/Klv8Ms0CkOi
 YVk6+ZCsvkcC74Tvjm9+wDJZ7XHBqKXsYCyBKqxSBmMePc0FOqDGgji2d6Q8O9Ka
 uITT9W4IhF9GNEV/ujUIrHVbfkUqJHn1sfJTOynG1Zp/MopA8mXAB2fWvkl4Kfd9
 UvgHXZnBM4jFONlqHNlaV9mTiigMaKsgU1wfaSj7fgj8DELWGII0MQfC9kcUgvlJ
 +qqmZ52tc8DKU3Lj6Wg58wgMTrI4XVAJjXwg9CTo65y6KyMuT1dkypnH95TdVtWl
 qZJ9WdxAmAbCJqZzj10kn44HrF565/t0hShrjKvv+inzDyZ5jXMttK3TQS20REsC
 MzoIxahlSUkN32OjiKhebrTNShzqFM6dxTDJLktMiInpgnnZJ/VG4Bao+NkSlLIJ
 ZwTV1xOqZZarkPVMlrOijE1bs6HbomZ7ZEsDSxvtwp+MZ06G4ICY11/KbTw9IZFv
 lCZiFNEzzxzrgqcz+5gS9y8/alknqiU5DSKCxfhbnNTW+Tk09mYPK5N0umUGwaTA
 gQ4fWsBoTpZF
 =Is/Y
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "These mostly change the thermal core in a few ways allowing thermal
  drivers to be simplified, in particular in their removal and failing
  probe handling parts that are notoriously prone to errors, and
  propagate the changes to several drivers.

  Apart from that, support for a new platform is added (Intel Lunar
  Lake-M), some bugs are fixed and some code is cleaned up, as usual.

  Specifics:

   - Store zone trips table and zone operations directly in struct
     thermal_zone_device (Rafael Wysocki)

   - Fix up flex array initialization during thermal zone device
     registration (Nathan Chancellor)

   - Rework writable trip points handling in the thermal core and
     several drivers (Rafael Wysocki)

   - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi)

   - Use thermal zone accessor functions in the int340x Intel thermal
     driver (Rafael Wysocki)

   - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver
     (Srinivas Pandruvada)

   - Minor fixes for thermal governors (Rafael Wysocki, Di Shen)

   - Trip point handling fixes for the iwlwifi wireless driver (Rafael
     Wysocki)

   - Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno)"

* tag 'thermal-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
  thermal: core: remove unnecessary check in trip_point_hyst_store()
  thermal: intel: int340x_thermal: Use thermal zone accessor functions
  thermal: core: Remove excess empty line from a comment
  thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
  thermal: core: Eliminate writable trip points masks
  thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: core: Drop the .set_trip_hyst() thermal zone operation
  thermal: core: Add flags to struct thermal_trip
  thermal: core: Move initial num_trips assignment before memcpy()
  thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
  thermal: intel: Adjust ops handling during thermal zone registration
  thermal: ACPI: Constify acpi_thermal_zone_ops
  thermal: core: Store zone ops in struct thermal_zone_device
  thermal: intel: Discard trip tables after zone registration
  thermal: ACPI: Discard trips table after zone registration
  thermal: core: Store zone trips table in struct thermal_zone_device
  ...
2024-03-13 12:03:57 -07:00
Linus Torvalds
07abb19a9b Power management updates for 6.9-rc1
- Allow the Energy Model to be updated dynamically (Lukasz Luba).
 
  - Add support for LZ4 compression algorithm to the hibernation image
    creation and loading code (Nikhil V).
 
  - Fix and clean up system suspend statistics collection (Rafael
    Wysocki).
 
  - Simplify device suspend and resume handling in the power management
    core code (Rafael Wysocki).
 
  - Fix PCI hibernation support description (Yiwei Lin).
 
  - Make hibernation take set_memory_ro() return values into account as
    appropriate (Christophe Leroy).
 
  - Set mem_sleep_current during kernel command line setup to avoid an
    ordering issue with handling it (Maulik Shah).
 
  - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
    driver's system suspend callback (Qingliang Li).
 
  - Simplify pm_runtime_get_if_active() usage and add a replacement for
    pm_runtime_put_autosuspend() (Sakari Ailus).
 
  - Add a tracepoint for runtime_status changes tracking (Vilas Bhat).
 
  - Fix section title markdown in the runtime PM documentation (Yiwei
    Lin).
 
  - Enable preferred core support in the amd-pstate cpufreq driver (Meng
    Li).
 
  - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
    min/max limit perf values in amd-pstate always stay within the
    (highest perf, lowest perf) range (Tor Vic, Meng Li).
 
  - Allow intel_pstate to assign model-specific values to strings used in
    the EPP sysfs interface and make it do so on Meteor Lake (Srinivas
    Pandruvada).
 
  - Drop long-unused cpudata::prev_cummulative_iowait from the
    intel_pstate cpufreq driver (Jiri Slaby).
 
  - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
    latter is an inefficient frequency (Shivnandan Kumar).
 
  - Change default transition delay in cpufreq to 2ms (Qais Yousef).
 
  - Remove references to 10ms minimum sampling rate from comments in the
    cpufreq code (Pierre Gondois).
 
  - Honour transition_latency over transition_delay_us in cpufreq (Qais
    Yousef).
 
  - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar).
 
  - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
    Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
    Belova).
 
  - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan).
 
  - Make the SCMI cpufreq driver get a transition delay value from
    firmware (Pierre Gondois).
 
  - Prevent the haltpoll cpuidle governor from shrinking guest
    poll_limit_ns below grow_start (Parshuram Sangle).
 
  - Avoid potential overflow in integer multiplication when computing
    cpuidle state parameters (C Cheng).
 
  - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
    driver and in intel_idle to return a correct value for C0 (He
    Rongguang).
 
  - Address multiple issues in the TPMI RAPL driver and add support for
    new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui).
 
  - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
    Lezcano).
 
  - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li).
 
  - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
    Norway Ananda).
 
  - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil).
 
  - Fix a couple of warnings in the OPP core code related to W=1
    builds (Viresh Kumar).
 
  - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
    Kumar).
 
  - Extend dev_pm_opp_data with turbo support (Sibi Sankar).
 
  - dt-bindings: drop maxItems from inner items (David Heidelberg).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmXvI/ISHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx24sP/jxg6fOGme8raHQvpTXG3/H56wlGzQ4P
 YUvvKUXnfD3yf1zNISsUl7VQebZqDt8rygkwSdymXlUVZX1eubN0RpCFc0F8GZuc
 THG/YQhYQr/9zro3FpKhfDj5evk21PCQzjf+dGvfQF9qVMxNPG1JzEFK6PnolT5X
 2BvkonY1XFWZjCMbZ83B/jt35lTDb0cmeNbCpfD5UJgcnxmMOtZYpORdyfPWTJpG
 GVCwmAFVVXxXlust/AIpt3mmOpKzSA9GnrtJkhtQe5GN+Y4OjnJiFJmTC7EfCctj
 JlWgVUA716mtFMUrjXgjfI54firF2oQpqaSa2HG/V/A96JWQqjarGz5dAV1IrPEt
 ZmYpvMe4E90S411wF1OWyrEqjXUuDnH1OWUvUdWSt4E7DhFw3esDi/jLW2tyVKAT
 hIy+/O4wzbDSTX/h9Cgt1Qjhew6lKUIwvhEXclB3fuJ+JoviWNkC9lnK93e2H0A3
 VYfkd/lpUD74035l0FrCJ/49MjX9kqrsn+TipHsIlSXAi8ZRdKbVvxOTD8RYudcI
 GvCiDDrkMgNwGlyedgbtTBUepCvSg93b+vVmRj7YMPtBhioOUo3qCn6wpqhxfnth
 9BCnPW7JxqUw/NJdlk9hKumaUZq+MK8G+kdYcIDg6xmAkWSUVP2QKlWavfMCxqRP
 +dN6T2iHsKFe
 =UePT
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "From the functional perspective, the most significant change here is
  the addition of support for Energy Models that can be updated
  dynamically at run time.

  There is also the addition of LZ4 compression support for hibernation,
  the new preferred core support in amd-pstate, new platforms support in
  the Intel RAPL driver, new model-specific EPP handling in intel_pstate
  and more.

  Apart from that, the cpufreq default transition delay is reduced from
  10 ms to 2 ms (along with some related adjustments), the system
  suspend statistics code undergoes a significant rework and there is a
  usual bunch of fixes and code cleanups all over.

  Specifics:

   - Allow the Energy Model to be updated dynamically (Lukasz Luba)

   - Add support for LZ4 compression algorithm to the hibernation image
     creation and loading code (Nikhil V)

   - Fix and clean up system suspend statistics collection (Rafael
     Wysocki)

   - Simplify device suspend and resume handling in the power management
     core code (Rafael Wysocki)

   - Fix PCI hibernation support description (Yiwei Lin)

   - Make hibernation take set_memory_ro() return values into account as
     appropriate (Christophe Leroy)

   - Set mem_sleep_current during kernel command line setup to avoid an
     ordering issue with handling it (Maulik Shah)

   - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
     driver's system suspend callback (Qingliang Li)

   - Simplify pm_runtime_get_if_active() usage and add a replacement for
     pm_runtime_put_autosuspend() (Sakari Ailus)

   - Add a tracepoint for runtime_status changes tracking (Vilas Bhat)

   - Fix section title markdown in the runtime PM documentation (Yiwei
     Lin)

   - Enable preferred core support in the amd-pstate cpufreq driver
     (Meng Li)

   - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
     min/max limit perf values in amd-pstate always stay within the
     (highest perf, lowest perf) range (Tor Vic, Meng Li)

   - Allow intel_pstate to assign model-specific values to strings used
     in the EPP sysfs interface and make it do so on Meteor Lake
     (Srinivas Pandruvada)

   - Drop long-unused cpudata::prev_cummulative_iowait from the
     intel_pstate cpufreq driver (Jiri Slaby)

   - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
     latter is an inefficient frequency (Shivnandan Kumar)

   - Change default transition delay in cpufreq to 2ms (Qais Yousef)

   - Remove references to 10ms minimum sampling rate from comments in
     the cpufreq code (Pierre Gondois)

   - Honour transition_latency over transition_delay_us in cpufreq (Qais
     Yousef)

   - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar)

   - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
     Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
     Belova)

   - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan)

   - Make the SCMI cpufreq driver get a transition delay value from
     firmware (Pierre Gondois)

   - Prevent the haltpoll cpuidle governor from shrinking guest
     poll_limit_ns below grow_start (Parshuram Sangle)

   - Avoid potential overflow in integer multiplication when computing
     cpuidle state parameters (C Cheng)

   - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
     driver and in intel_idle to return a correct value for C0 (He
     Rongguang)

   - Address multiple issues in the TPMI RAPL driver and add support for
     new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui)

   - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
     Lezcano)

   - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li)

   - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
     Norway Ananda)

   - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil)

   - Fix a couple of warnings in the OPP core code related to W=1 builds
     (Viresh Kumar)

   - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
     Kumar)

   - Extend dev_pm_opp_data with turbo support (Sibi Sankar)

   - dt-bindings: drop maxItems from inner items (David Heidelberg)"

* tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (95 commits)
  dt-bindings: opp: drop maxItems from inner items
  OPP: debugfs: Fix warning around icc_get_name()
  OPP: debugfs: Fix warning with W=1 builds
  cpufreq: Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h
  OPP: Extend dev_pm_opp_data with turbo support
  Fix cpupower-frequency-info.1 man page typo
  cpufreq: scmi: Set transition_delay_us
  firmware: arm_scmi: Populate fast channel rate_limit
  firmware: arm_scmi: Populate perf commands rate_limit
  cpuidle: ACPI/intel: fix MWAIT hint target C-state computation
  PM: sleep: wakeirq: fix wake irq warning in system suspend
  powercap: dtpm: Fix kernel-doc for dtpm_create_hierarchy() function
  cpufreq: Don't unregister cpufreq cooling on CPU hotplug
  PM: suspend: Set mem_sleep_current during kernel command line setup
  cpufreq: Honour transition_latency over transition_delay_us
  cpufreq: Limit resolving a frequency to policy min/max
  Documentation: PM: Fix runtime_pm.rst markdown syntax
  cpufreq: amd-pstate: adjust min/max limit perf
  cpufreq: Remove references to 10ms min sampling rate
  cpufreq: intel_pstate: Update default EPPs for Meteor Lake
  ...
2024-03-13 11:40:06 -07:00
Geert Uytterhoeven
1828c1c17b thermal/drivers/rcar_gen3: Add support for R-Car V4M
Add support for the Thermal Sensor/Chip Internal Voltage Monitor/Core
Voltage Monitor (THS/CIVM/CVM) on the Renesas R-Car V4M (R8A779H0) SoC.

The conversion formulas for R-Car V4M are the same as for other R-Car
Gen4 SoCs.

Based on a patch in the BSP by Duy Nguyen.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/bd5b002a802c1e058e0048592f17862db1d04263.1709722342.git.geert+renesas@glider.be
2024-03-11 17:14:46 +01:00
Konrad Dybcio
488164006a thermal/of: Assume polling-delay(-passive) 0 when absent
Currently, thermal zones associated with providers that have interrupts
for signaling hot/critical trips are required to set a polling-delay
of 0 to indicate no polling. This feels a bit backwards.

Change the code such that "no polling delay" also means "no polling".

Suggested-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Bjorn Andersson <andersson@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240125-topic-thermal-v1-2-3c9d4dced138@linaro.org
2024-03-11 17:14:46 +01:00
Peng Fan
4d0642074c thermal/drivers/qoriq: Fix getting tmu range
TMU Version 1 has 4 TTRCRs, while TMU Version >=2 has 16 TTRCRs.
So limit the len to 4 will report "invalid range data" for i.MX93.

This patch drop the local array with allocated ttrcr array and
able to support larger tmu ranges.

Fixes: f12d60c81f ("thermal/drivers/qoriq: Support version 2.1")
Tested-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240226003657.3012880-1-peng.fan@oss.nxp.com
2024-03-11 17:14:46 +01:00
Mark Brown
79d998421d thermal/drivers/sun8i: Don't fail probe due to zone registration failure
Currently the sun8i thermal driver will fail to probe if any of the
thermal zones it is registering fails to register with the thermal core.
Since we currently do not define any trip points for the GPU thermal
zones on at least A64 or H5 this means that we have no thermal support
on these platforms:

[    1.698703] thermal_sys: Failed to find 'trips' node
[    1.698707] thermal_sys: Failed to find trip points for thermal-sensor id=1

even though the main CPU thermal zone on both SoCs is fully configured.
This does not seem ideal, while we may not be able to use all the zones
it seems better to have those zones which are usable be operational.
Instead just carry on registering zones if we get any non-deferral
error, allowing use of those zones which are usable.

This means that we also need to update the interrupt handler to not
attempt to notify the core for events on zones which we have not
registered, I didn't see an ability to mask individual interrupts and
I would expect that interrupts would still be indicated in the ISR even
if they were masked.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240123-thermal-sun8i-registration-v3-1-3e5771b1bbdd@kernel.org
2024-03-11 17:14:46 +01:00
Martin Botka
83620b3b04 thermal/drivers/sun8i: Add support for H616 THS controller
Add support for the thermal sensor found in H616 SoCs, is the same as
the H6 thermal sensor controller, but with four sensors.
Also the registers readings are wrong, unless a bit in the first SYS_CFG
register cleared, so set exercise the SRAM regmap to take care of that.

Signed-off-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-7-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Andre Przywara
bd0b451bd4 thermal/drivers/sun8i: Add SRAM register access code
The Allwinner H616 SoC needs to clear a bit in one register in the SRAM
controller, to report reasonable temperature values. On reset, bit 16 in
register 0x3000000 is set, which leads to the driver reporting
temperatures around 200C. Clearing this bit brings the values down to the
expected range. The BSP code does a one-time write in U-Boot, with a
comment just mentioning the effect on the THS, but offering no further
explanation.

To not rely on firmware to set things up for us, add code that queries
the SRAM controller device via a DT phandle link, then clear just this
single bit.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-6-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Maksim Kiselev
63e39fcaa6 thermal/drivers/sun8i: Extend H6 calibration to support 4 sensors
The H616 SoC resembles the H6 thermal sensor controller, with a few
changes like four sensors.

Extend sun50i_h6_ths_calibrate() function to support calibration of
these sensors.

Co-developed-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Martin Botka <martin.botka@somainline.org>
Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-5-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Andre Przywara
d7bf0a1ada thermal/drivers/sun8i: Explain unknown H6 register value
So far we were ORing in some "unknown" value into the THS control
register on the Allwinner H6. This part of the register is not explained
in the H6 manual, but the H616 manual details those bits, and on closer
inspection the THS IP blocks in both SoCs seem very close:
- The BSP code for both SoCs writes the same values into THS_CTRL.
- The reset values of at least the first three registers are the same.

Replace the "unknown" value with its proper meaning: "acquire time",
most probably the sample part of the sample & hold circuit of the ADC,
according to its explanation in the H616 manual.

No functional change, just a macro rename and adjustment.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Acked-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20240219153639.179814-4-andre.przywara@arm.com
2024-03-11 17:14:46 +01:00
Geert Uytterhoeven
f492d8220f thermal: Drop spaces before TABs
There is never a need to have a space before a TAB, but it hurts the
eyes of vim users.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/480478a53fd42621e97b2db36e181903cc0f53e3.1708001426.git.geert+renesas@glider.be
2024-03-11 17:14:46 +01:00
Frank Wunderlich
371ed6263e thermal/drivers/mediatek: Fix control buffer enablement on MT7896
Reading thermal sensor on mt7986 devices returns invalid temperature:

bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp
 -274000

Fix this by adding missing members in mtk_thermal_data struct which were
used in mtk_thermal_turn_on_buffer after commit 33140e668b.

Cc: stable@vger.kernel.org
Fixes: 33140e668b ("thermal/drivers/mediatek: Control buffer enablement tweaks")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230907112018.52811-1-linux@fw-web.de
2024-03-11 17:14:46 +01:00
Christophe JAILLET
ca93bf607a thermal/drivers/mediatek/lvts_thermal: Fix a memory leak in an error handling path
If devm_krealloc() fails, then 'efuse' is leaking.
So free it to avoid a leak.

Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/481d345233862d58c3c305855a93d0dbc2bbae7e.1706431063.git.christophe.jaillet@wanadoo.fr
2024-03-11 17:14:46 +01:00
Rafael J. Wysocki
3bd834640b Merge branch 'pm-em'
Merge Enery Model changes for 6.9-rc1:

 - Allow the Energy Model to be updated dynamically (Lukasz Luba).

* pm-em: (24 commits)
  PM: EM: Fix nr_states warnings in static checks
  Documentation: EM: Update with runtime modification design
  PM: EM: Add em_dev_compute_costs()
  PM: EM: Remove old table
  PM: EM: Change debugfs configuration to use runtime EM table data
  drivers/thermal/devfreq_cooling: Use new Energy Model interface
  drivers/thermal/cpufreq_cooling: Use new Energy Model interface
  powercap/dtpm_devfreq: Use new Energy Model interface to get table
  powercap/dtpm_cpu: Use new Energy Model interface to get table
  PM: EM: Optimize em_cpu_energy() and remove division
  PM: EM: Support late CPUs booting and capacity adjustment
  PM: EM: Add performance field to struct em_perf_state and optimize
  PM: EM: Add em_perf_state_from_pd() to get performance states table
  PM: EM: Introduce em_dev_update_perf_domain() for EM updates
  PM: EM: Add functions for memory allocations for new EM tables
  PM: EM: Use runtime modified EM for CPUs energy estimation in EAS
  PM: EM: Introduce runtime modifiable table
  PM: EM: Split the allocation and initialization of the EM table
  PM: EM: Check if the get_cost() callback is present in em_compute_costs()
  PM: EM: Introduce em_compute_costs()
  ...
2024-03-11 15:59:51 +01:00
Rafael J. Wysocki
dcb497ec99 Merge branches 'thermal-core' and 'thermal-intel'
Merge thermal core changes and Intel thermal drivers changes for
6.9-rc1:

 - Store zone trips table and zone operations directly in struct
   thermal_zone_device (Rafael Wysocki).

 - Rework writable trip points handling (Rafael Wysocki).

 - Thermal core code cleanups (Dan Carpenter, Flavio Suligoi).

 - Use thermal zone accessor functions in the int340x Intel thermal
   driver (Rafael Wysocki).

 - Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas
   Pandruvada).

* thermal-core:
  thermal: core: remove unnecessary check in trip_point_hyst_store()
  thermal: core: Remove excess empty line from a comment
  thermal: core: Eliminate writable trip points masks
  thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
  thermal: core: Drop the .set_trip_hyst() thermal zone operation
  thermal: core: Add flags to struct thermal_trip
  thermal: core: Move initial num_trips assignment before memcpy()
  thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
  thermal: intel: Adjust ops handling during thermal zone registration
  thermal: ACPI: Constify acpi_thermal_zone_ops
  thermal: core: Store zone ops in struct thermal_zone_device
  thermal: intel: Discard trip tables after zone registration
  thermal: ACPI: Discard trips table after zone registration
  thermal: core: Store zone trips table in struct thermal_zone_device

* thermal-intel:
  thermal: intel: int340x_thermal: Use thermal zone accessor functions
  thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
2024-03-07 21:05:12 +01:00
Dan Carpenter
59d894a078 thermal: core: remove unnecessary check in trip_point_hyst_store()
This code was shuffled around a bit recently.  We no longer need to
check the value of "ret" because we know it's zero.

Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-06 13:26:25 +01:00
Rafael J. Wysocki
53b94f421d thermal: intel: int340x_thermal: Use thermal zone accessor functions
Make int340x_thermal use the dedicated accessor functions for the
thermal zone device object address and the thermal zone type string.

This is requisite for future thermal core improvements.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
2024-03-05 21:38:22 +01:00
Rafael J. Wysocki
166d017d34 Merge thermal core changes for 6.9 to satisfy a dependency. 2024-03-05 21:37:49 +01:00
Flavio Suligoi
32abd25087 thermal: core: Remove excess empty line from a comment
The first and the third lines of the kerneldoc comment for:

thermal_zone_device_set_polling()

belong to the same sentences, so join them together.

Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-03-05 13:17:33 +01:00
Srinivas Pandruvada
f1f0c44522 thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
Add Lunar Lake-M PCI ID for processor thermal device.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-28 20:46:51 +01:00
Rafael J. Wysocki
4a62d588a8 thermal: core: Eliminate writable trip points masks
All of the thermal_zone_device_register_with_trips() callers pass zero
writable trip points masks to it, so drop the mask argument from that
function and update all of its callers accordingly.

This also removes the artificial trip points per zone limit of 32,
related to using writable trip points masks.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:38 +01:00
Rafael J. Wysocki
83c2d444ed thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the OF thermal code do that.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
68e9c60353 thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the imx thermal code do that.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
cca52f6969 thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
Some Intel thermal drivers need/want the temperature of their trip
points to be set by user space via sysfs and so they pass nonzero
writable trip masks during thermal zone registration for this purpose.

It is now possible to achieve the same result by setting the
THERMAL_TRIP_FLAG_RW_TEMP trip flag directly, so modify the drivers
in question to do that instead of using a nonzero writable trips mask.

No intentional functional impact.

Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
46f5bef8ec thermal: core: Drop the .set_trip_hyst() thermal zone operation
None of the users of the thermal core provides a .set_trip_hyst()
thermal zone operation, so drop that callback from struct
thermal_zone_device_ops and update trip_point_hyst_store()
accordingly.

No functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-27 12:04:01 +01:00
Rafael J. Wysocki
5340f76472 thermal: core: Add flags to struct thermal_trip
In order to allow thermal zone creators to specify the writability of
trip point temperature and hysteresis on a per-trip basis, add a flags
field to struct thermal_trip and define flags to represent the desired
trip properties.

Also make thermal_zone_device_register_with_trips() set the
THERMAL_TRIP_FLAG_RW_TEMP flag for all trips covered by the writable
trips mask passed to it and modify the thermal sysfs code to look at
the trip flags instead of using the writable trips mask directly or
checking the presence of the .set_trip_hyst() zone callback.

Additionally, make trip_point_temp_store() and trip_point_hyst_store()
fail with an error code if the trip passed to one of them has
THERMAL_TRIP_FLAG_RW_TEMP or THERMAL_TRIP_FLAG_RW_HYST,
respectively, clear in its flags.

No intentional functional impact.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-27 12:02:50 +01:00
Nathan Chancellor
da1983355c thermal: core: Move initial num_trips assignment before memcpy()
When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
that supports __counted_by() (such as clang-18 and newer), there is a
panic on boot:

  [    2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
  [    2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
  ...
  [    3.039208] Call trace:
  [    3.041643]  __fortify_report+0x5c/0x74
  [    3.045469]  __fortify_panic+0x18/0x20
  [    3.049209]  thermal_zone_device_register_with_trips+0x4c8/0x4f8

This panic occurs because trips is counted by num_trips but num_trips is
assigned after the call to memcpy(), so the fortify checks think the
buffer size is zero because tz was allocated with kzalloc().

Move the num_trips assignment before the memcpy() to resolve the panic
and ensure that the fortify checks work properly.

Fixes: 9b0a627586 ("thermal: core: Store zone trips table in struct thermal_zone_device")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-27 11:58:47 +01:00
Rafael J. Wysocki
a85739c8c6 thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
The only difference made by CONFIG_THERMAL_WRITABLE_TRIPS is whether or
not the writable trips mask passed during thermal zone registration
will take any effect, but whoever passes a non-zero writable trips mask
to thermal_zone_device_register_with_trips() can be forgiven thinking
that it will always work.

Moreover, some thermal drivers expect user space to set trip temperature
values, so they select CONFIG_THERMAL_WRITABLE_TRIPS, possibly overriding
a manual choice to unset it and going against the design purportedly
allowing system integrators to decide on the writability of trip points
for the given kernel build.  It is also set in one platform's defconfig.

Forthermore, CONFIG_THERMAL_WRITABLE_TRIPS only affects trip temperature,
because trip hysteresis is writable as long as the thermal zone provides
a callback to update it, regardless of the CONFIG_THERMAL_WRITABLE_TRIPS
value.

The above means that the symbol in question is used inconsistently and
its purpose is at least moot, so remove it and always take the writable
trip mask passed to thermal_zone_device_register_with_trips() into
account.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00
Rafael J. Wysocki
62dd17846d thermal: intel: Adjust ops handling during thermal zone registration
Because thermal zone operations are now stored directly in struct
thermal_zone_device, thermal zone creators can discard the operations
structure after the zone registration is complete, or it can be made
read-only.

Accordingly, make int340x_thermal_zone_add() use a local variable to
represent thermal zone operations, so it is freed automatically upon the
function exit, and make the other Intel thermal drivers use const zone
operations structures.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00
Rafael J. Wysocki
698a1eb1f7 thermal: core: Store zone ops in struct thermal_zone_device
The current code requires thermal zone creators to pass pointers to
writable ops structures to thermal_zone_device_register_with_trips()
which needs to modify the target struct thermal_zone_device_ops object
if the "critical" operation in it is NULL.

Moreover, the callers of thermal_zone_device_register_with_trips() are
required to hold on to the struct thermal_zone_device_ops object passed
to it until the given thermal zone is unregistered.

Both of these requirements are quite inconvenient, so modify struct
thermal_zone_device to contain struct thermal_zone_device_ops as field and
make thermal_zone_device_register_with_trips() copy the contents of the
struct thermal_zone_device_ops passed to it via a pointer (which can be
const now) to that field.

Also adjust the code using thermal zone ops accordingly and modify
thermal_of_zone_register() to use a local ops variable during
thermal zone registration so ops do not need to be freed in
thermal_of_zone_unregister() any more.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
2024-02-23 18:24:48 +01:00