linux-next/drivers/thermal
Rafael J. Wysocki f7c1b0e4ae thermal: core: Back off when polling thermal zones on errors
Commit a8a2617744 ("thermal: core: Call monitor_thermal_zone() if zone
temperature is invalid") introduced a polling mechanism by which the
thermal core attempts to get a valid temperature value for thermal zones
where the .get_temp() callback returns errors to start with (for
example, due to initialization ordering woes).  However, this polling is
carried out periodically ad infinitum and every iteration of it causes
a message to be printed to the kernel log, which means a lot of log noise
on systems where there are thermal zones that never get ready for some
reason.  It is also not really useful to continuously poll thermal zones
that never respond.

To address this, modify the thermal core to increase the delay between
consecutive thermal zone temperature checks after every check that fails
until it reaches a certain maximum value.  At that point, the thermal
zone in question will be disabled, but user space will be able to
reenable it if it believes that the failure is transient.

Also change the code to print messages regarding failed temperature
checks to the kernel log only twice, once when the thermal zone's
.get_temp() callback returns an error for the first time and once when
disabling the given thermal zone.  In addition, a dev_crit() message
will be printed at that point if the given thermal zone contains a
critical trip point to notify the system operator about the situation.
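
The behavior described above can be illustrated with a small user-space
sketch.  It is not the code from this commit: the structure, the function
names, the initial and maximum delays and the 50% growth step are all
assumptions made for illustration, and printf() stands in for the kernel
log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical values for illustration; not taken from the kernel. */
#define RECHECK_DELAY_INIT_MS	250u
#define RECHECK_DELAY_MAX_MS	120000u

/* Hypothetical stand-in for the per-zone state that backoff needs. */
struct zone_sketch {
	unsigned int recheck_delay_ms;	/* delay before the next check */
	bool enabled;			/* false once the zone is given up on */
	bool has_critical_trip;		/* zone contains a critical trip point */
	bool error_logged;		/* first failure already reported */
	unsigned int log_lines;		/* messages emitted so far */
};

static void zone_sketch_init(struct zone_sketch *z, bool has_critical_trip)
{
	assert(z != NULL);
	z->recheck_delay_ms = RECHECK_DELAY_INIT_MS;
	z->enabled = true;
	z->has_critical_trip = has_critical_trip;
	z->error_logged = false;
	z->log_lines = 0;
}

/*
 * Handle one failed temperature check.  Returns the delay (in ms) to
 * wait before the next check, or 0 if the zone is (now) disabled.
 */
static unsigned int zone_sketch_on_error(struct zone_sketch *z, int err)
{
	unsigned int delay;

	assert(z != NULL);
	if (!z->enabled)
		return 0;

	/* Report only the first failure, not every iteration. */
	if (!z->error_logged) {
		printf("zone: temperature check failed (%d)\n", err);
		z->error_logged = true;
		z->log_lines++;
	}

	/* The delay has reached its maximum: give up and disable. */
	if (z->recheck_delay_ms >= RECHECK_DELAY_MAX_MS) {
		printf("zone: temperature unavailable, disabling\n");
		z->log_lines++;
		if (z->has_critical_trip) {
			printf("CRITICAL: zone with critical trip disabled\n");
			z->log_lines++;
		}
		z->enabled = false;
		return 0;
	}

	/* Use the current delay, then grow it by 50% (assumed step). */
	delay = z->recheck_delay_ms;
	z->recheck_delay_ms += z->recheck_delay_ms / 2;
	if (z->recheck_delay_ms > RECHECK_DELAY_MAX_MS)
		z->recheck_delay_ms = RECHECK_DELAY_MAX_MS;

	return delay;
}

/* Models user space re-enabling the zone: backoff starts over. */
static void zone_sketch_reenable(struct zone_sketch *z)
{
	assert(z != NULL);
	z->recheck_delay_ms = RECHECK_DELAY_INIT_MS;
	z->error_logged = false;
	z->enabled = true;
}
```

With these assumed values, successive failures are rechecked after 250,
375, 562, ... ms until the delay reaches the maximum, at which point the
zone is disabled.  Across the whole episode only two messages are
emitted, plus one extra for a zone that has a critical trip point.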

Fixes: a8a2617744 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid")
Link: https://lore.kernel.org/linux-acpi/CAGnHSE=RyPK++UG0-wAtVKgeJxe0uzFYgLxm+RUOKKoQquW=Ow@mail.gmail.com/
Reported-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://patch.msgid.link/2962033.e9J7NaK4W3@rjwysocki.net
2024-07-24 12:40:23 +02:00
broadcom thermal/drivers/broadcom: Simplify with dev_err_probe() 2024-07-15 13:31:40 +02:00
intel Merge branch 'thermal-intel' 2024-07-15 20:44:31 +02:00
mediatek thermal/drivers/mediatek/lvts_thermal: Provide default calibration data 2024-07-15 13:31:39 +02:00
qcom Merge branch 'thermal-core' 2024-07-15 20:43:21 +02:00
renesas thermal/drivers/renesas/rcar: Add dependency on OF 2024-07-15 13:31:39 +02:00
samsung thermal/drivers/exynos: Simplify with dev_err_probe() 2024-07-15 13:31:40 +02:00
st thermal/drivers/sti: Cleanup code related to stih416 2024-07-15 13:31:41 +02:00
tegra thermal: trip: Pass trip pointer to .set_trip_temp() thermal zone callback 2024-07-12 15:14:01 +02:00
ti-soc-thermal thermal: ti-bandgap: Convert to platform remove callback returning void 2023-10-02 14:24:15 +02:00
amlogic_thermal.c thermal/drivers/amlogic: Support A1 SoC family Thermal Sensor controller 2024-04-23 12:40:29 +02:00
armada_thermal.c thermal/drivers/armada: Simplify name sanitization 2024-04-23 12:40:29 +02:00
cpufreq_cooling.c thermal/cpufreq: Remove arch_update_thermal_pressure() 2024-04-24 12:08:00 +02:00
cpuidle_cooling.c thermal: cpuidle_cooling: fix kernel-doc warning and a spello 2023-12-21 12:05:48 +01:00
da9062-thermal.c thermal: core: Eliminate writable trip points masks 2024-02-27 12:04:38 +01:00
db8500_thermal.c thermal/drivers/db8500: Remove redundant of_match_ptr() 2023-08-16 12:09:19 +02:00
devfreq_cooling.c thermal: devfreq_cooling: Fix perf state when calculate dfc res_util 2024-03-27 16:27:39 +01:00
dove_thermal.c thermal: dove: Convert to platform remove callback returning void 2023-09-29 12:34:16 +02:00
gov_bang_bang.c thermal: gov_bang_bang: Drop unnecessary cooling device target state checks 2024-06-11 21:06:44 +02:00
gov_fair_share.c thermal: gov_fair_share: Eliminate unnecessary integer divisions 2024-04-24 10:15:08 +02:00
gov_power_allocator.c thermal: gov_power_allocator: Return early in manage if trip_max is NULL 2024-07-04 13:35:50 +02:00
gov_step_wise.c thermal: gov_step_wise: Go straight to instance->lower when mitigation is over 2024-06-25 14:37:05 +02:00
gov_user_space.c thermal: gov_user_space: Use .trip_crossed() instead of .throttle() 2024-04-24 20:42:10 +02:00
hisi_thermal.c thermal/drivers/hisi: Simplify with dev_err_probe() 2024-07-15 13:31:40 +02:00
imx8mm_thermal.c thermal/drivers/imx8mm_thermal: Fix function pointer declaration by adding identifier name 2023-10-15 23:40:09 +02:00
imx_sc_thermal.c thermal: Explicitly include correct DT includes 2023-07-31 20:03:42 +02:00
imx_thermal.c Merge branch 'thermal-core' 2024-07-15 20:43:21 +02:00
k3_bandgap.c thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgap 2024-04-23 12:40:29 +02:00
k3_j72xx_bandgap.c thermal/drivers/k3_j72xx_bandgap: Implement suspend/resume support 2024-07-15 13:31:39 +02:00
Kconfig thermal/drivers/renesas: Group all renesas thermal drivers together 2024-07-15 13:31:38 +02:00
khadas_mcu_fan.c thermal/core: Make cooling device state change private 2021-01-19 22:31:10 +01:00
kirkwood_thermal.c thermal: kirkwood: Convert to platform remove callback returning void 2023-09-29 12:34:17 +02:00
loongson2_thermal.c thermal/drivers/loongson2: Add Loongson-2K2000 support 2024-04-23 12:40:30 +02:00
Makefile thermal/drivers/renesas: Group all renesas thermal drivers together 2024-07-15 13:31:38 +02:00
max77620_thermal.c thermal/drivers/max77620: Remove duplicate error message 2023-10-15 23:40:10 +02:00
qoriq_thermal.c thermal/drivers/qoriq: Fix getting tmu range 2024-03-11 17:14:46 +01:00
rockchip_thermal.c thermal: rockchip: Convert to platform remove callback returning void 2023-10-02 14:23:30 +02:00
spear_thermal.c thermal: spear: Convert to platform remove callback returning void 2023-10-02 14:24:06 +02:00
sprd_thermal.c thermal: sprd: Convert to platform remove callback returning void 2023-10-02 14:24:08 +02:00
sun8i_thermal.c thermal/drivers/sun8i: Don't fail probe due to zone registration failure 2024-03-11 17:14:46 +01:00
thermal_core.c thermal: core: Back off when polling thermal zones on errors 2024-07-24 12:40:23 +02:00
thermal_core.h thermal: core: Back off when polling thermal zones on errors 2024-07-24 12:40:23 +02:00
thermal_debugfs.c thermal: trip: Use common set of trip type names 2024-06-11 21:04:40 +02:00
thermal_debugfs.h thermal/debugfs: Do not extend mitigation episodes beyond system resume 2024-06-11 21:04:00 +02:00
thermal_helpers.c thermal: core: Allow thermal zones to tell the core to ignore them 2024-07-18 13:35:55 +02:00
thermal_hwmon.c thermal: core: Store zone ops in struct thermal_zone_device 2024-02-23 18:24:48 +01:00
thermal_hwmon.h thermal/hwmon: Use the right device for devm_thermal_add_hwmon_sysfs() 2023-03-03 20:45:02 +01:00
thermal_mmio.c thermal/core: Use the thermal zone 'devdata' accessor in thermal located drivers 2023-03-03 20:45:02 +01:00
thermal_netlink.c Merge branch 'thermal-intel' into thermal 2024-04-15 15:45:32 +02:00
thermal_netlink.h thermal: netlink: Add genetlink bind/unbind notifications 2024-03-27 14:50:26 +01:00
thermal_of.c thermal/of: Assume polling-delay(-passive) 0 when absent 2024-03-11 17:14:46 +01:00
thermal_sysfs.c thermal: trip: Pass trip pointer to .set_trip_temp() thermal zone callback 2024-07-12 15:14:01 +02:00
thermal_trace_ipa.h thermal: core: Make struct thermal_zone_device definition internal 2024-04-08 16:01:20 +02:00
thermal_trace.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
thermal_trip.c thermal: trip: Fold __thermal_zone_get_trip() into its caller 2024-07-12 15:14:56 +02:00
thermal-generic-adc.c thermal/drivers/generic-adc: Simplify with dev_err_probe() 2024-07-15 13:31:41 +02:00
uniphier_thermal.c thermal: uniphier: Use thermal_zone_for_each_trip() for walking trip points 2024-07-04 13:25:31 +02:00