Reading thermal sensor on mt7986 devices returns invalid temperature:
bpi-r3 ~ # cat /sys/class/thermal/thermal_zone0/temp
-274000
Fix this by adding missing members in mtk_thermal_data struct which were
used in mtk_thermal_turn_on_buffer after commit 33140e668b10.
Cc: stable@vger.kernel.org
Fixes: 33140e668b10 ("thermal/drivers/mediatek: Control buffer enablement tweaks")
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com>
Reviewed-by: Daniel Golle <daniel@makrotopia.org>
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230907112018.52811-1-linux@fw-web.de
If devm_krealloc() fails, then 'efuse' is leaking.
So free it to avoid a leak.
Fixes: f5f633b18234 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/481d345233862d58c3c305855a93d0dbc2bbae7e.1706431063.git.christophe.jaillet@wanadoo.fr
Merge Enery Model changes for 6.9-rc1:
- Allow the Energy Model to be updated dynamically (Lukasz Luba).
* pm-em: (24 commits)
PM: EM: Fix nr_states warnings in static checks
Documentation: EM: Update with runtime modification design
PM: EM: Add em_dev_compute_costs()
PM: EM: Remove old table
PM: EM: Change debugfs configuration to use runtime EM table data
drivers/thermal/devfreq_cooling: Use new Energy Model interface
drivers/thermal/cpufreq_cooling: Use new Energy Model interface
powercap/dtpm_devfreq: Use new Energy Model interface to get table
powercap/dtpm_cpu: Use new Energy Model interface to get table
PM: EM: Optimize em_cpu_energy() and remove division
PM: EM: Support late CPUs booting and capacity adjustment
PM: EM: Add performance field to struct em_perf_state and optimize
PM: EM: Add em_perf_state_from_pd() to get performance states table
PM: EM: Introduce em_dev_update_perf_domain() for EM updates
PM: EM: Add functions for memory allocations for new EM tables
PM: EM: Use runtime modified EM for CPUs energy estimation in EAS
PM: EM: Introduce runtime modifiable table
PM: EM: Split the allocation and initialization of the EM table
PM: EM: Check if the get_cost() callback is present in em_compute_costs()
PM: EM: Introduce em_compute_costs()
...
Merge thermal core changes and Intel thermal drivers changes for
6.9-rc1:
- Store zone trips table and zone operations directly in struct
thermal_zone_device (Rafael Wysocki).
- Rework writable trip points handling (Rafael Wysocki).
- Thermal core code cleanups (Dan Carpenter, Flavio Suligoi).
- Use thermal zone accessor functions in the int340x Intel thermal
driver (Rafael Wysocki).
- Add Lunar Lake-M PCI ID to the int340x Intel thermal driver (Srinivas
Pandruvada).
* thermal-core:
thermal: core: remove unnecessary check in trip_point_hyst_store()
thermal: core: Remove excess empty line from a comment
thermal: core: Eliminate writable trip points masks
thermal: of: Set THERMAL_TRIP_FLAG_RW_TEMP directly
thermal: imx: Set THERMAL_TRIP_FLAG_RW_TEMP directly
wifi: iwlwifi: mvm: Set THERMAL_TRIP_FLAG_RW_TEMP directly
mlxsw: core_thermal: Set THERMAL_TRIP_FLAG_RW_TEMP directly
thermal: intel: Set THERMAL_TRIP_FLAG_RW_TEMP directly
thermal: core: Drop the .set_trip_hyst() thermal zone operation
thermal: core: Add flags to struct thermal_trip
thermal: core: Move initial num_trips assignment before memcpy()
thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS
thermal: intel: Adjust ops handling during thermal zone registration
thermal: ACPI: Constify acpi_thermal_zone_ops
thermal: core: Store zone ops in struct thermal_zone_device
thermal: intel: Discard trip tables after zone registration
thermal: ACPI: Discard trips table after zone registration
thermal: core: Store zone trips table in struct thermal_zone_device
* thermal-intel:
thermal: intel: int340x_thermal: Use thermal zone accessor functions
thermal: int340x: processor_thermal: Add Lunar Lake-M PCI ID
This code was shuffled around a bit recently. We no longer need to
check the value of "ret" because we know it's zero.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Make int340x_thermal use the dedicated accessor functions for the
thermal zone device object address and the thermal zone type string.
This is requisite for future thermal core improvements.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
The first and the third lines of the kerneldoc comment for:
thermal_zone_device_set_polling()
belong to the same sentences, so join them together.
Signed-off-by: Flavio Suligoi <f.suligoi@asem.it>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Add Lunar Lake-M PCI ID for processor thermal device.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All of the thermal_zone_device_register_with_trips() callers pass zero
writable trip points masks to it, so drop the mask argument from that
function and update all of its callers accordingly.
This also removes the artificial trip points per zone limit of 32,
related to using writable trip points masks.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the OF thermal code do that.
No intentional functional impact.
Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
It is now possible to flag trip points with THERMAL_TRIP_FLAG_RW_TEMP
to allow their temperature to be set from user space via sysfs instead
of using a nonzero writable trips mask during thermal zone registration,
so make the imx thermal code do that.
No intentional functional impact.
Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Some Intel thermal drivers need/want the temperature of their trip
points to be set by user space via sysfs and so they pass nonzero
writable trip masks during thermal zone registration for this purpose.
It is now possible to achieve the same result by setting the
THERMAL_TRIP_FLAG_RW_TEMP trip flag directly, so modify the drivers
in question to do that instead of using a nonzero writable trips mask.
No intentional functional impact.
Note that this change is requisite for dropping the mask argument from
thermal_zone_device_register_with_trips() going forward.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
None of the users of the thermal core provides a .set_trip_hyst()
thermal zone operation, so drop that callback from struct
thermal_zone_device_ops and update trip_point_hyst_store()
accordingly.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In order to allow thermal zone creators to specify the writability of
trip point temperature and hysteresis on a per-trip basis, add a flags
field to struct thermal_trip and define flags to represent the desired
trip properties.
Also make thermal_zone_device_register_with_trips() set the
THERMAL_TRIP_FLAG_RW_TEMP flag for all trips covered by the writable
trips mask passed to it and modify the thermal sysfs code to look at
the trip flags instead of using the writable trips mask directly or
checking the presence of the .set_trip_hyst() zone callback.
Additionally, make trip_point_temp_store() and trip_point_hyst_store()
fail with an error code if the trip passed to one of them has
THERMAL_TRIP_FLAG_RW_TEMP or THERMAL_TRIP_FLAG_RW_HYST,
respectively, clear in its flags.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When booting a CONFIG_FORTIFY_SOURCE=y kernel compiled with a toolchain
that supports __counted_by() (such as clang-18 and newer), there is a
panic on boot:
[ 2.913770] memcpy: detected buffer overflow: 72 byte write of buffer size 0
[ 2.920834] WARNING: CPU: 2 PID: 1 at lib/string_helpers.c:1027 __fortify_report+0x5c/0x74
...
[ 3.039208] Call trace:
[ 3.041643] __fortify_report+0x5c/0x74
[ 3.045469] __fortify_panic+0x18/0x20
[ 3.049209] thermal_zone_device_register_with_trips+0x4c8/0x4f8
This panic occurs because trips is counted by num_trips but num_trips is
assigned after the call to memcpy(), so the fortify checks think the
buffer size is zero because tz was allocated with kzalloc().
Move the num_trips assignment before the memcpy() to resolve the panic
and ensure that the fortify checks work properly.
Fixes: 9b0a62758665 ("thermal: core: Store zone trips table in struct thermal_zone_device")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The only difference made by CONFIG_THERMAL_WRITABLE_TRIPS is whether or
not the writable trips mask passed during thermal zone registration
will take any effect, but whoever passes a non-zero writable trips mask
to thermal_zone_device_register_with_trips() can be forgiven thinking
that it will always work.
Moreover, some thermal drivers expect user space to set trip temperature
values, so they select CONFIG_THERMAL_WRITABLE_TRIPS, possibly overriding
a manual choice to unset it and going against the design purportedly
allowing system integrators to decide on the writability of trip points
for the given kernel build. It is also set in one platform's defconfig.
Forthermore, CONFIG_THERMAL_WRITABLE_TRIPS only affects trip temperature,
because trip hysteresis is writable as long as the thermal zone provides
a callback to update it, regardless of the CONFIG_THERMAL_WRITABLE_TRIPS
value.
The above means that the symbol in question is used inconsistently and
its purpose is at least moot, so remove it and always take the writable
trip mask passed to thermal_zone_device_register_with_trips() into
account.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Because thermal zone operations are now stored directly in struct
thermal_zone_device, thermal zone creators can discard the operations
structure after the zone registration is complete, or it can be made
read-only.
Accordingly, make int340x_thermal_zone_add() use a local variable to
represent thermal zone operations, so it is freed automatically upon the
function exit, and make the other Intel thermal drivers use const zone
operations structures.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The current code requires thermal zone creators to pass pointers to
writable ops structures to thermal_zone_device_register_with_trips()
which needs to modify the target struct thermal_zone_device_ops object
if the "critical" operation in it is NULL.
Moreover, the callers of thermal_zone_device_register_with_trips() are
required to hold on to the struct thermal_zone_device_ops object passed
to it until the given thermal zone is unregistered.
Both of these requirements are quite inconvenient, so modify struct
thermal_zone_device to contain struct thermal_zone_device_ops as field and
make thermal_zone_device_register_with_trips() copy the contents of the
struct thermal_zone_device_ops passed to it via a pointer (which can be
const now) to that field.
Also adjust the code using thermal zone ops accordingly and modify
thermal_of_zone_register() to use a local ops variable during
thermal zone registration so ops do not need to be freed in
thermal_of_zone_unregister() any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Because the thermal core creates and uses its own copy of the trips
table passed to thermal_zone_device_register_with_trips(), it is not
necessary to hold on to a local copy of it any more after the given
thermal zone has been registered.
Accordingly, modify Intel thermal drivers to discard the trips tables
passed to thermal_zone_device_register_with_trips() after thermal zone
registration, for example by storing them in local variables which are
automatically discarded when the zone registration is complete.
Also make some additional code simplifications unlocked by the above
changes.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The current code expects thermal zone creators to pass a pointer to a
writable trips table to thermal_zone_device_register_with_trips() and
that trips table is then used by the thermal core going forward.
Consequently, the callers of thermal_zone_device_register_with_trips()
are required to hold on to the trips table passed to it until the given
thermal zone is unregistered, at which point the trips table can be
freed, but at the same time they are not expected to access that table
directly. This is both error prone and confusing.
To address it, turn the trips table pointer in struct thermal_zone_device
into a flex array (counted by its num_trips field), allocate it during
thermal zone device allocation and copy the contents of the trips table
supplied by the zone creator (which can be const now) into it, which
will allow the callers of thermal_zone_device_register_with_trips() to
drop their trip tables right after the zone registration.
This requires the imx thermal driver to be adjusted to store the new
temperature in its internal trips table in imx_set_trip_temp(), because
it will be separate from the core's trips table now and it has to be
explicitly kept in sync with the latter.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Merge thermal core changes for 6.9:
- Minor fixes for thermal governors (Rafael J. Wysocki, Di Shen).
- Trip point handling fixes for the iwlwifi wireless driver (Rafael J.
Wysocki).
- Code cleanups (Rafael J. Wysocki, AngeloGioacchino Del Regno).
* thermal-tmp:
thermal: gov_power_allocator: Avoid overwriting PID coefficients from setup time
thermal: sysfs: Fix up white space in trip_point_temp_store()
iwlwifi: mvm: Use for_each_thermal_trip() for walking trip points
iwlwifi: mvm: Populate trip table before registering thermal zone
iwlwifi: mvm: Drop unused fw_trips_index[] from iwl_mvm_thermal_device
thermal: core: Change governor name to const char pointer
thermal: gov_bang_bang: Fix possible cooling device state ping-pong
thermal: gov_fair_share: Fix dependency on trip points ordering
The RAPL framework uses CPU hotplug locking to protect the rapl_packages
list and rp->lead_cpu to guarantee that
1. the RAPL package device is not unprobed and freed
2. the cached rp->lead_cpu is always valid
for operations like powercap sysfs accesses.
Current RAPL APIs assume being called from CPU hotplug callbacks which
hold the CPU hotplug lock, but TPMI RAPL driver invokes the APIs in the
driver's .probe() function without acquiring the CPU hotplug lock.
Fix the problem by providing both locked and lockless versions of RAPL
APIs.
Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: 6.5+ <stable@vger.kernel.org> # 6.5+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CPU temperature can be negative in some cases. Thus the negative CPU
temperature should not be considered as a failure.
Fix intel_tcc_get_temp() and its users to support negative CPU
temperature.
Fixes: a3c1f066e1c5 ("thermal/intel: Introduce Intel TCC library")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Cc: 6.3+ <stable@vger.kernel.org> # 6.3+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When the PID coefficients k_* are set via sysfs before the IPA
algorithm is triggered then the coefficients would be overwritten after
IPA throttle() is called. The old configuration values might be
different than the new values estimated by the IPA internal algorithm.
There might be a time delay when this overwriting happens. It
depends on the thermal zone temperature value. The temperature value
needs to cross the first trip point value then IPA algorithms start
operating. Although, the PID coefficients setup time should not be
affected or linked to any later operating phase and values must not be
overwritten.
This patch initializes params->sustainable_power when the governor
binds to thermal zone to avoid overwriting k_*. The basic function won't
be affected, as the k_* still can be estimated if the sustainable_power
is modified.
Signed-off-by: Di Shen <di.shen@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove an excess tab character from an otherwise empty code line.
No functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Energy Model framework support modifications at runtime of the power
values. Use the new EM table which is protected with RCU. Align the
code so that this RCU read section is short.
This change is not expected to alter the general functionality.
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Energy Model framework support modifications at runtime of the power
values. Use the new EM table which is protected with RCU. Align the
code so that this RCU read section is short.
This change is not expected to alter the general functionality.
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The current behavior of thermal_zone_trip_update() in the bang-bang
thermal governor may be problematic for trip points with 0 hysteresis,
because when the zone temperature reaches the trip temperature and
stays there, it will then cause the cooling device go "on" and "off"
alternately, which is not desirable.
Address this by requiring the zone temperature to actually fall below
trip->temperature - trip->hysteresis for the cooling device to go off.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The computation in the fair share governor's get_trip_level() function
currently works under the assumption that the temperature ordering of
trips[] in a thermal zone is ascending, which need not be the case.
However, get_trip_level() can be made work regardless of whether or not
the trips table is ordered by temperature in any way, so change it
accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
After conversion of this driver to use powercap idle_inject core, this
driver doesn't use target_mwait value. So remove dead code.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h.
of_device.h isn't needed, but mod_devicetable.h and property.h were
implicitly included.
Signed-off-by: Rob Herring <robh@kernel.org>
- Add debugfs-based diagnostics support to the thermal core (Daniel
Lezcano, Dan Carpenter).
- Fix a power allocator thermal governor issue preventing it from
resetting cooling devices sometimes (Di Shen).
- Simplify the thermal netlink API and clean up related code (Rafael J.
Wysocki).
- Make the Intel HFI driver support hibernation and deep suspend
properly (Ricardo Neri).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWmb3cSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx1ywP+QGUpQvZaMwJ9YYA1nyUa6UbHSEwQNPc
xscY0gRS4ovWh9EPx7HpMFtvj33jPeZONFsLAFL7AwODY4VKwt8KSinTosmrW5QC
izesNWFwSsI2uRYsKCvigVwU6Frl4TsCG68TwbLeZl97tX5Sdaimv7TiAeU+VxJ6
3IQ+E2d8N+ybHCK59FI8pd1EA0ZrvGFuSN1ogFAUEDfKu74A3NUtPuXWKTlnHnEy
XfcX9axCclQmVSaYHcDeX+y0/s2bZAnSpYeoGbc2fRAXKFgVeu7tsxtj1ei/c/sO
sttHTYjVtvBgXlZRL/jMsKK0VDXqJOOnqpEH/z7q9dn15vQTzQv2Iq6krkVlnXtr
le2LmOxJXtU7VEf4cm0dpmbgl8IFWuZ/47Vb/duEIhmWs5piSmRj7/GIX1DGdABd
Es7GERvA3nYZBI/5bxw49o+4/23+62lAWN+AlmTfWSJqW64cP0osbUHUF9dN5otA
82pERngkva3RLs3Fyh5cW4oO/pcTgHsYX/2li6WwC0GKm16mjIHer2li9VmDamyH
rk0cILxBofN99wvTq0j1HfPOdun6yE5vi+n+YQkB5Ry38r6Osyi91dZsMk5xPQo4
R7ISm23ExpD3bmb6DahsAC/yLlrHnHYpDjWBSbHpaU8dUiReh96AJHdM6nINVoc6
QfgjzOwtbcaM
=TPzj
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These add support for debugfs-based diagnostics to the thermal core,
simplify the thermal netlink API, fix system-wide PM support in the
Intel HFI driver and clean up some code.
Specifics:
- Add debugfs-based diagnostics support to the thermal core (Daniel
Lezcano, Dan Carpenter)
- Fix a power allocator thermal governor issue preventing it from
resetting cooling devices sometimes (Di Shen)
- Simplify the thermal netlink API and clean up related code (Rafael
J. Wysocki)
- Make the Intel HFI driver support hibernation and deep suspend
properly (Ricardo Neri)"
* tag 'thermal-6.8-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
thermal: intel: hfi: Add syscore callbacks for system-wide PM
thermal: gov_power_allocator: avoid inability to reset a cdev
thermal: helpers: Rearrange thermal_cdev_set_cur_state()
thermal: netlink: Rework notify API for cooling devices
thermal: core: Use kstrdup_const() during cooling device registration
thermal/debugfs: Add thermal debugfs information for mitigation episodes
thermal/debugfs: Add thermal cooling device debugfs information
thermal: netlink: Pass thermal zone pointer to notify routines
thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
Merge additional updates for 6.8-rc1 in the thermal core and in the
Intel HFI thermal driver:
- Add debugfs-based diagnostics support to the thermal core (Daniel
Lezcano, Dan Carpenter).
- Fix a power allocator thermal governor issue preventing it from
resetting cooling devices sometimes (Di Shen).
- Simplify the thermal netlink API and clean up related code (Rafael J.
Wysocki).
- Make the Intel HFI driver support hibernation and deep suspend
properly (Ricardo Neri).
* thermal-core:
thermal/debugfs: Unlock on error path in thermal_debug_tz_trip_up()
thermal: gov_power_allocator: avoid inability to reset a cdev
thermal: helpers: Rearrange thermal_cdev_set_cur_state()
thermal: netlink: Rework notify API for cooling devices
thermal: core: Use kstrdup_const() during cooling device registration
thermal/debugfs: Add thermal debugfs information for mitigation episodes
thermal/debugfs: Add thermal cooling device debugfs information
thermal: netlink: Pass thermal zone pointer to notify routines
thermal: netlink: Drop thermal_notify_tz_trip_add/delete()
thermal: netlink: Pass pointers to thermal_notify_tz_trip_up/down()
thermal: netlink: Pass pointers to thermal_notify_tz_trip_change()
* thermal-intel:
thermal: intel: hfi: Add syscore callbacks for system-wide PM
Add a missing mutex_unlock(&thermal_dbg->lock) to this error path.
Fixes: 7ef01f228c9f ("thermal/debugfs: Add thermal debugfs information for mitigation episodes")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The kernel allocates a memory buffer and provides its location to the
hardware, which uses it to update the HFI table. This allocation occurs
during boot and remains constant throughout runtime.
When resuming from hibernation, the restore kernel allocates a second
memory buffer and reprograms the HFI hardware with the new location as
part of a normal boot. The location of the second memory buffer may
differ from the one allocated by the image kernel.
When the restore kernel transfers control to the image kernel, its HFI
buffer becomes invalid, potentially leading to memory corruption if the
hardware writes to it (the hardware continues to use the buffer from the
restore kernel).
It is also possible that the hardware "forgets" the address of the memory
buffer when resuming from "deep" suspend. Memory corruption may also occur
in such a scenario.
To prevent the described memory corruption, disable HFI when preparing to
suspend or hibernate. Enable it when resuming.
Add syscore callbacks to handle the package of the boot CPU (packages of
non-boot CPUs are handled via CPU offline). Syscore ops always run on the
boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend
and hibernation. Syscore ops only run in these cases.
Cc: 6.1+ <stable@vger.kernel.org> # 6.1+
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
[ rjw: Comment adjustment, subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 0952177f2a1f ("thermal/core/power_allocator: Update once
cooling devices when temp is low") adds an update flag to avoid
triggering a thermal event when there is no need, and the thermal
cdev is updated once when the temperature is low.
But when the trips are writable, and switch_on_temp is set to be a
higher value, the cooling device state may not be reset to 0,
because last_temperature is smaller than switch_on_temp.
For example:
First:
switch_on_temp=70 control_temp=85;
Then userspace change the trip_temp:
switch_on_temp=45 control_temp=55 cur_temp=54
Then userspace reset the trip_temp:
switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54
At this time, the cooling device state should be reset to 0.
However, because cur_temp(57) < switch_on_temp(70)
last_temp(54) < switch_on_temp(70) ----> update = false,
update is false, the cooling device state can not be reset.
Using the observation that tz->passive can also be regarded as the
temperature status, set the update flag to the tz->passive value.
When the temperature drops below switch_on for the first time, the
states of cooling devices can be reset once, and tz->passive is updated
to 0. In the next round, because tz->passive is 0, cdev->state will not
be updated.
By using the tz->passive value as the "update" flag, the issue above
can be solved, and the cooling devices can be updated only once when the
temperature is low.
Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low")
Cc: 5.13+ <stable@vger.kernel.org> # 5.13+
Suggested-by: Wei Wang <wvw@google.com>
Signed-off-by: Di Shen <di.shen@unisoc.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Change the code layout in thermal_cdev_set_cur_state() so it returns
early on errors which is more consistent with what happens elsewhere.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
In analogy with some previous thermal netlink API changes, redefine
thermal_notify_cdev_state_update(), thermal_notify_cdev_add() and
thermal_notify_cdev_delete() to take a const cdev pointer as their
first argument and let them extract the requisite information from
there by themselves.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Some *thermal_cooling_device_register() calls pass a string literal as
the 'type' parameter, so kstrdup_const() can be used instead of
kstrdup() to avoid a memory allocation in such cases.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The mitigation episodes are recorded. A mitigation episode happens
when the first trip point is crossed the way up and then the way
down. During this episode other trip points can be crossed also and
are accounted for this mitigation episode. The interesting information
is the average temperature at the trip point, the undershot and the
overshot. The standard deviation of the mitigated temperature will be
added later.
The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:
thermal/
`-- thermal_zones
|-- 0
| `-- mitigations
`-- 1
`-- mitigations
The content of the mitigations file has the following format:
,-Mitigation at 349988258us, duration=130136ms
| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
| 0 | passive | 65000 | 2000 | 130136 | 68227 | 62500 | 75625 |
| 1 | passive | 75000 | 2000 | 104209 | 74857 | 71666 | 77500 |
,-Mitigation at 272451637us, duration=75000ms
| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
| 0 | passive | 65000 | 2000 | 75000 | 68561 | 62500 | 75000 |
| 1 | passive | 75000 | 2000 | 60714 | 74820 | 70555 | 77500 |
,-Mitigation at 238184119us, duration=27316ms
| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
| 0 | passive | 65000 | 2000 | 27316 | 73377 | 62500 | 75000 |
| 1 | passive | 75000 | 2000 | 19468 | 75284 | 69444 | 77500 |
,-Mitigation at 39863713us, duration=136196ms
| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
| 0 | passive | 65000 | 2000 | 136196 | 73922 | 62500 | 75000 |
| 1 | passive | 75000 | 2000 | 91721 | 74386 | 69444 | 78125 |
More information for a better understanding of the thermal behavior
will be added after. The idea is to give detailed statistics
information about the undershots and overshots, the temperature speed,
etc... As all the information in a single file is too much, the idea
would be to create a directory named with the mitigation timestamp
where all data could be added.
Please note this code is immune against trip ordering but not against
a trip temperature change while a mitigation is happening. However,
this situation should be extremely rare, perhaps not happening and we
might question ourselves if something should be done in the core
framework for other components first.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The thermal framework does not have any debug information except a
sysfs stat which is a bit controversial. This one allocates big chunks
of memory for every cooling devices with a high number of states and
could represent on some systems in production several megabytes of
memory for just a portion of it. As the sysfs is limited to a page
size, the output is not exploitable with large data array and gets
truncated.
The patch provides the same information than sysfs except the
transitions are dynamically allocated, thus they won't show more
events than the ones which actually occurred. There is no longer a
size limitation and it opens the field for more debugging information
where the debugfs is designed for, not sysfs.
The thermal debugfs directory structure tries to stay consistent with
the sysfs one but in a very simplified way:
thermal/
-- cooling_devices
|-- 0
| |-- clear
| |-- time_in_state_ms
| |-- total_trans
| `-- trans_table
|-- 1
| |-- clear
| |-- time_in_state_ms
| |-- total_trans
| `-- trans_table
|-- 2
| |-- clear
| |-- time_in_state_ms
| |-- total_trans
| `-- trans_table
|-- 3
| |-- clear
| |-- time_in_state_ms
| |-- total_trans
| `-- trans_table
`-- 4
|-- clear
|-- time_in_state_ms
|-- total_trans
`-- trans_table
The content of the files in the cooling devices directory is the same
as the sysfs one except for the trans_table which has the following
format:
Transition Hits
1->0 246
0->1 246
2->1 632
1->2 632
3->2 98
2->3 98
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
[ rjw: White space fixups, rebase ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
- Add dynamic thresholds for trip point crossing detection to prevent
trip point crossing notifications from being sent at incorrect times
or not at all in some cases (Rafael J. Wysocki).
- Fix synchronization issues related to the resume of thermal zones
during a system-wide resume and allow thermal zones to be resumed
concurrently (Rafael J. Wysocki).
- Modify the thermal zone unregistration to wait for the given zone to
go away completely before returning to the caller and rework the
sysfs interface for trip points on top of that (Rafael J. Wysocki).
- Fix a possible NULL pointer dereference in thermal zone registration
error path (Rafael J. Wysocki).
- Clean up the IPA thermal governor and modify it (with the help of a
new governor callback) to avoid allocating and freeing memory every
time its throttling callback is invoked (Lukasz Luba).
- Make the IPA thermal governor handle thermal instance weight changes
via sysfs correctly (Lukasz Luba).
- Update the thermal netlink code to avoid sending messages if there
are no recipients (Stanislaw Gruszka).
- Convert Mediatek Thermal to the json-schema (Rafał Miłecki).
- Fix thermal DT bindings issue on Loongson (Binbin Zhou).
- Fix returning NULL instead of -ENODEV during thermal probe on
Loogsoon (Binbin Zhou).
- Add thermal DT binding for tsens on the SM8650 platform (Neil
Armstrong).
- Add reboot on the critical trip point crossing option feature (Fabio
Estevam).
- Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal
suspend/resume on AmLogic (Uwe Kleine-König)
- Add D1/T113s THS controller support to the Sun8i thermal control
driver (Maxim Kiselev)
- Fix example in the thermal DT binding for QCom SPMI (Johan Hovold).
- Fix compilation warning in the tmon utility (Florian Eckert).
- Add support for interrupt-based thermal configuration on Exynos along
with a set of related cleanups (Mateusz Majewski).
- Make the Intel HFI thermal driver enable an HFI instance (eg. processor
package) from its first online CPU and disable it when the last CPU in
it goes offline (Ricardo Neri).
- Fix a kernel-doc warning and a spello in the cpuidle_cooling thermal
driver (Randy Dunlap).
- Move the .get_temp() thermal zone callback presence check to the
thermal zone registration code (Daniel Lezcano).
- Use the for_each_trip() macro for trip points table walks in a few
places in the thermal core (Rafael J. Wysocki).
- Make all trip point updates (via sysfs as well as from the platform
firmware) trigger trip change notifications (Rafael J. Wysocki).
- Drop redundant code from the thermal core and make one function in
it take a const pointer argument (Rafael J. Wysocki).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmWb8iUSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxKLkP/iDsuDwmhZAjbAu2iftk/8ad8Trm2VoK
+9eZ5Eqa8lKEcJLb0RxueTnFT4ppvT/hY99HOG4FM+mCnWeH/Z32N697DhqiUg4v
GZUpeOPzxYgsfOOTeuL5XgfrVMgBjJrJunTXmzgAd8lIhTmRbAMVmFVJ18CJO11O
RHgqvYznYFi5cywA9/NkG2xkhFB0VDoiTuIiuMMV+pMjqF0d5ooBMkhmjvPQ5Rp9
FjNJ7hqiTamAsDPdULAFqhIGGhKZWWFbh4+S+JPCwBW8nqvxyJpemsm20vrwctJR
bSXWQkgkDpWEeg9yrEAOO/Uk9yGd3jiLfkvPBKbK0x/YxGZ4hOYHcbF3cOUvmPYP
5K3ZJ61DNrzB/5S3LY54VYrWmTVRdK6Lk3HYNvfAUYFJZMZ5oMYZLCUmo4SswUdy
UUEIY27H7L18eLhP9zCcKo4njdaVG+vXQn/rJIFOpG0k9OElzPs1X8Dp/m9pKQDR
rDUsMXqB34NUVrIEhjAgqvwF5xHooW8gykpuJgxwBetA9w8Pls2A/mzLsDY3wgdQ
htiANGpKTDqBQSn+HrjzYckv9/R+1tDyTJmEDNZwllA1DJfrOlpCRD2VHRpgTZEA
Ldnq0bhyq6RQnousqxhgpYkIAoGaebs9XasRH0YtBG5gIumeWfqeVzmTcM5xdsNB
yf6RdQy8QunS
=QVyh
-----END PGP SIGNATURE-----
Merge tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control updates from Rafael Wysocki:
"These add support for the D1/T113s THS controller to the sun8i driver
and a DT-based mechanism for platforms to indicate a preference to
reboot (instead of shutting down) on crossing a critical trip point,
fix issues, make other improvements (in the IPA governor, the Intel
HFI driver, the exynos driver and the thermal netlink interface among
other places) and clean up code.
One long-standing issue addressed here is that trip point crossing
notifications sent to user space might be unreliable due to the
incorrect handling of trip point hysteresis in the thermal core:
multiple notifications might be sent for the same event or there might
be events without any notification at all.
Specifics:
- Add dynamic thresholds for trip point crossing detection to prevent
trip point crossing notifications from being sent at incorrect
times or not at all in some cases (Rafael J. Wysocki)
- Fix synchronization issues related to the resume of thermal zones
during a system-wide resume and allow thermal zones to be resumed
concurrently (Rafael J. Wysocki)
- Modify the thermal zone unregistration to wait for the given zone
to go away completely before returning to the caller and rework the
sysfs interface for trip points on top of that (Rafael J. Wysocki)
- Fix a possible NULL pointer dereference in thermal zone
registration error path (Rafael J. Wysocki)
- Clean up the IPA thermal governor and modify it (with the help of a
new governor callback) to avoid allocating and freeing memory every
time its throttling callback is invoked (Lukasz Luba)
- Make the IPA thermal governor handle thermal instance weight
changes via sysfs correctly (Lukasz Luba)
- Update the thermal netlink code to avoid sending messages if there
are no recipients (Stanislaw Gruszka)
- Convert Mediatek Thermal to the json-schema (Rafał Miłecki)
- Fix thermal DT bindings issue on Loongson (Binbin Zhou)
- Fix returning NULL instead of -ENODEV during thermal probe on
Loogsoon (Binbin Zhou)
- Add thermal DT binding for tsens on the SM8650 platform (Neil
Armstrong)
- Add reboot on the critical trip point crossing option feature
(Fabio Estevam)
- Use DEFINE_SIMPLE_DEV_PM_OPS do define PM functions for thermal
suspend/resume on AmLogic (Uwe Kleine-König)
- Add D1/T113s THS controller support to the Sun8i thermal control
driver (Maxim Kiselev)
- Fix example in the thermal DT binding for QCom SPMI (Johan Hovold)
- Fix compilation warning in the tmon utility (Florian Eckert)
- Add support for interrupt-based thermal configuration on Exynos
along with a set of related cleanups (Mateusz Majewski)
- Make the Intel HFI thermal driver enable an HFI instance (eg.
processor package) from its first online CPU and disable it when
the last CPU in it goes offline (Ricardo Neri)
- Fix a kernel-doc warning and a spello in the cpuidle_cooling
thermal driver (Randy Dunlap)
- Move the .get_temp() thermal zone callback presence check to the
thermal zone registration code (Daniel Lezcano)
- Use the for_each_trip() macro for trip points table walks in a few
places in the thermal core (Rafael J. Wysocki)
- Make all trip point updates (via sysfs as well as from the platform
firmware) trigger trip change notifications (Rafael J. Wysocki)
- Drop redundant code from the thermal core and make one function in
it take a const pointer argument (Rafael J. Wysocki)"
* tag 'thermal-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
thermal: trip: Constify thermal zone argument of thermal_zone_trip_id()
thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
thermal: intel: hfi: Enable an HFI instance from its first online CPU
thermal: intel: hfi: Refactor enabling code into helper functions
thermal/drivers/exynos: Use set_trips ops
thermal/drivers/exynos: Use BIT wherever possible
thermal/drivers/exynos: Split initialization of TMU and the thermal zone
thermal/drivers/exynos: Stop using the threshold mechanism on Exynos 4210
thermal/drivers/exynos: Simplify regulator (de)initialization
thermal/drivers/exynos: Handle devm_regulator_get_optional return value correctly
thermal/drivers/exynos: Wwitch from workqueue-driven interrupt handling to threaded interrupts
thermal/drivers/exynos: Drop id field
thermal/drivers/exynos: Remove an unnecessary field description
tools/thermal/tmon: Fix compilation warning for wrong format
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Clean up examples
dt-bindings: thermal: qcom-spmi-adc-tm5/hc: Fix example node names
thermal/drivers/sun8i: Add D1/T113s THS controller support
dt-bindings: thermal: sun8i: Add binding for D1/T113s THS controller
thermal: amlogic: Use DEFINE_SIMPLE_DEV_PM_OPS for PM functions
thermal: amlogic: Make amlogic_thermal_disable() return void
...
There are several rountines in the thermal netlink API that take a
thermal zone ID or a thermal zone type as their arguments, but from
their callers perspective it would be more convenient to pass a thermal
zone pointer to them and let them extract the necessary data from the
given thermal zone object by themselves.
Modify the code accordingly.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Because thermal_notify_tz_trip_add/delete() are never used, drop them
entirely along with the related code.
The addition or removal of trip points is not supported by the thermal
core and is unlikely to be supported in the future, so it is also
unlikely that these functions will ever be needed.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Instead of requiring the callers of thermal_notify_tz_trip_up/down() to
provide specific values needed to populate struct param in them, make
them extract those values from objects passed by the callers via const
pointers.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Instead of requiring the caller of thermal_notify_tz_trip_change() to
provide specific values needed to populate struct param in it, make it
extract those values from objects passed to it by the caller via const
pointers.
No intentional functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Merge changes in thermal control drivers for Intel platforms for
6.8-rc1:
- Make the Intel HFI thermal driver enable an HFI instance (eg. processor
package) from its first online CPU and disable it when the last CPU in
it goes offline (Ricardo Neri).
* thermal-intel:
thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline
thermal: intel: hfi: Enable an HFI instance from its first online CPU
thermal: intel: hfi: Refactor enabling code into helper functions