Commit Graph

153 Commits

Author SHA1 Message Date
Ricardo Neri
f0c344c000 hwmon: (coretemp) Extend the bitmask to read temperature to 0xff
The Intel Software Development manual defines the temperature digital
readout as the bits [22:16] of the IA32_[PACKAGE]_THERM_STATUS registers.
Bit 23 is specified as reserved.

In recent processors, however, the temperature digital readout uses bits
[23:16]. In those processors, using the bitmask 0x7f would lead to
incorrect readings if the temperature deviates from TjMax by more than
127 degrees Celsius.

Although not guaranteed, bit 23 is likely to be 0 in processors from a few
generations ago. The temperature reading would still be correct in those
processors when using a 0xff bitmask.

Model-specific provisions can be made for older processors in which bit 23
is not 0 should the need arise.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/r/20240425171311.19519-4-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-04-28 10:08:43 -07:00
Linus Torvalds
15223fdbdf hwmon updates for v6.9
* New drivers for
 
   - Amphenol ChipCap 2
 
   - ASPEED g6 PWM/Fan tach
 
   - Astera Labs PT5161L retimer
 
   - ASUS ROG RYUJIN II 360 AIO cooler
 
   - LTC4282
 
   - Microsoft Surface devices
 
   - MPS MPQ8785 Synchronous Step-Down Converter
 
   - NZXT Kraken X and Z series AIO CPU coolers
 
 * Additional chip support in existing drivers
 
   - Ayaneo Air Plus 7320u (oxp-sensors)
 
   - INA260 (ina2xx)
 
   - XPS 9315 (dell-smm)
 
   - MSI customer ID (nct6683)
 
 * Devicetree bindings updates
 
   - Common schema for hardware monitoring devices
 
   - Common schema for fans
 
   - Update chip descriptions to use common schema
 
   - Document regulator properties in several drivers
 
   - Explicit bindings for infineon buck converters
 
 * Other improvements
 
   - Replaced rbtree with maple tree register cache in several drivers
 
   - Added support for humidity min/max alarm and volatage fault attributes
     to hwmon core
 
   - Dropped non-functional I2C_CLASS_HWMON support for drivers w/o detect()
 
   - Dropped obsolete and redundant entried from MAINTAINERS
 
   - Cleaned up axi-fan-control and coretemp drivers
 
   - Minor fixes and improvements in several other drivers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAmXvI8QACgkQyx8mb86f
 mYFWTQ/9Hz4QfgIueAEhWGy0XORt7nSCeexkWIR341iyRk4LQA0UjDQg3+Ub5Hi1
 7IBGDNi124S1I/W8fJl1KFgXjbirhuzpHq4DF60Ty+egTZ9IZpu3uySR4mubixYm
 J3el7SIJBvs3SgMFdS/HCtJqeU5HLk+1NjfWDmnq0Z27GHzEy/Nglj2TTO1CjMz/
 tZ/LOWkdG5tMbwI8SZ/mBNMXMpYp/jnUZbrMxgZ/y+07R3jP7i1GWRjq5ZGuWaP6
 SQEs4vfss/y6WUSZZenIIigRIAiAnsNIrjUjrMKPdf0EkjB+0ljn/jLXpAsUU6fL
 07Uy+AwQb89PPWIKHdldn7/MYaR3zU+LwKwPbjULuvpo6Cj87WcIP/x7QqL//Ise
 Ix2Buy/oWoVHKG7Gtf+mF+Ott5MeFgj6pVsCN4IAYYdyai0GPM3RpFAcrIXFCjsE
 i3M5aRC46Yy8Ba6ov3Jmlh83kc9LauJrlCxIxIXTlUJIZiW7a5w083QDSaw3qQdB
 hukwfC8wOzpEsQngkBQyRSpF468lASzc4lp++tPLS/W0zxBrgrnHvgXTHnN8IxvQ
 ocuD5tVMg9gE2xT88t8BHTcw2uv03U5RoXY+nucbxA+Y/aT2t+jZhX9cPbq4+Rhe
 v7XDGMxcBYgtfwx6JT97DKqW9qLc01k8wxonCOrUop6B/+MdRbw=
 =BTB3
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon updates from Guenter Roeck:
 "New drivers:
   - Amphenol ChipCap 2
   - ASPEED g6 PWM/Fan tach
   - Astera Labs PT5161L retimer
   - ASUS ROG RYUJIN II 360 AIO cooler
   - LTC4282
   - Microsoft Surface devices
   - MPS MPQ8785 Synchronous Step-Down Converter
   - NZXT Kraken X and Z series AIO CPU coolers

  Additional chip support in existing drivers:
   - Ayaneo Air Plus 7320u (oxp-sensors)
   - INA260 (ina2xx)
   - XPS 9315 (dell-smm)
   - MSI customer ID (nct6683)

  Devicetree bindings updates:
   - Common schema for hardware monitoring devices
   - Common schema for fans
   - Update chip descriptions to use common schema
   - Document regulator properties in several drivers
   - Explicit bindings for infineon buck converters

  Other improvements:
   - Replaced rbtree with maple tree register cache in several drivers
   - Added support for humidity min/max alarm and volatage fault
     attributes to hwmon core
   - Dropped non-functional I2C_CLASS_HWMON support for drivers w/o
     detect()
   - Dropped obsolete and redundant entried from MAINTAINERS
   - Cleaned up axi-fan-control and coretemp drivers
   - Minor fixes and improvements in several other drivers"

* tag 'hwmon-for-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (70 commits)
  hwmon: (dell-smm) Add XPS 9315 to fan control whitelist
  hwmon: (aspeed-g6-pwm-tacho): Support for ASPEED g6 PWM/Fan tach
  dt-bindings: hwmon: Support Aspeed g6 PWM TACH Control
  dt-bindings: hwmon: fan: Add fan binding to schema
  dt-bindings: hwmon: tda38640: Add interrupt & regulator properties
  hwmon: (amc6821) add of_match table
  dt-bindings: hwmon: lm75: use common hwmon schema
  hwmon: (sis5595) drop unused DIV_TO_REG function
  dt-bindings: hwmon: reference common hwmon schema
  dt-bindings: hwmon: lltc,ltc4286: use common hwmon schema
  dt-bindings: hwmon: adi,adm1275: use common hwmon schema
  dt-bindings: hwmon: ti,ina2xx: use common hwmon schema
  dt-bindings: hwmon: add common properties
  hwmon: (pmbus/ir38064) Use PMBUS_REGULATOR_ONE to declare regulator
  hwmon: (pmbus/lm25066) Use PMBUS_REGULATOR_ONE to declare regulator
  hwmon: (pmbus/tda38640) Use PMBUS_REGULATOR_ONE to declare regulator
  regulator: dt-bindings: promote infineon buck converters to their own binding
  dt-bindings: hwmon/pmbus: ti,lm25066: document regulators
  dt-bindings: hwmon: nuvoton,nct6775: Add compatible value for NCT6799
  MAINTAINERS: Drop redundant hwmon entries
  ...
2024-03-13 11:26:58 -07:00
Zhang Rui
1a793caf6f hwmon: (coretemp) Use dynamic allocated memory for core temp_data
The total memory needed for saving per core temperature data depends on
the number of cores in a package. Using static allocated memory wastes
memories on systems with low per package core count.

Improve the code to use dynamic allocated memory so that it can be
improved further when per package core count information becomes
available.

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-12-rui.zhang@intel.com
[groeck: Fixed continuation line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
18b24a5f9c hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
temp_data->index saves the index in pdata->core_data[]. It is not used
by package temp_data.

Use temp_data->index as the indicator of package temp_data and remove
redundant temp_data->is_pkg_data.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-11-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
326241f71f hwmon: (coretemp) Split package temp_data and core temp_data
Saving package temp_data and core temp_data in one array with different
offsets is fragile.

Split them and clean up crabbed maths and macros. This also fixes a
problem that pdata->core_data[0] was never used.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-10-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
b0b01414a2 hwmon: (coretemp) Abstract core_temp helpers
coretemp driver has an obscure and fragile logic for handling package
and core temperature data.

Place the logic in newly introduced helpers for further optimizations.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-9-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
87eb801925 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
pdata->cpu_map[] saves the mapping between cpu core id and the index in
pdata->core_data[]. This is used to find the temp_data structure using
cpu_core_id, by traversing the pdata->cpu_map[] array. But the same goal
can be achieved by traversing the pdata->core_temp[] array directly.

Remove redundant pdata->cpu_map[].

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-8-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
18d8f55833 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
Replace sensor_device_attribute with device_attribute because
sensor_device_attribute->index is no longer used.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-7-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
25f8e01baa hwmon: (coretemp) Remove unnecessary dependency of array index
When sensor_device_attribute pointer is available, use container_of() to
get the temp_data address.

This removes the unnecessary dependency of cached index in
pdata->core_data[].

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-6-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Zhang Rui
c8c2074020 hwmon: (coretemp) Introduce enum for attr index
Introduce enum coretemp_attr_index to better describe the index of each
sensor attribute.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-5-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-25 12:37:37 -08:00
Thomas Gleixner
bd745d1c41 x86/cpu/topology: Rename topology_max_die_per_package()
The plural of die is dies.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Michael Kelley <mhklinux@outlook.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Link: https://lore.kernel.org/r/20240213210253.065874205@linutronix.de
2024-02-15 22:07:45 +01:00
Zhang Rui
34cf8c657c hwmon: (coretemp) Enlarge per package core count limit
Currently, coretemp driver supports only 128 cores per package.
This loses some core temperature information on systems that have more
than 128 cores per package.
 [   58.685033] coretemp coretemp.0: Adding Core 128 failed
 [   58.692009] coretemp coretemp.0: Adding Core 129 failed
 ...

Enlarge the limitation to 512 because there are platforms with more than
256 cores per package.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-4-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-04 06:43:45 -08:00
Zhang Rui
fdaf0c8629 hwmon: (coretemp) Fix bogus core_id to attr name mapping
Before commit 7108b80a54 ("hwmon/coretemp: Handle large core ID
value"), there is a fixed mapping between
1. cpu_core_id
2. the index in pdata->core_data[] array
3. the sysfs attr name, aka "tempX_"
The later two always equal cpu_core_id + 2.

After the commit, pdata->core_data[] index is got from ida so that it
can handle sparse core ids and support more cores within a package.

However, the commit erroneously maps the sysfs attr name to
pdata->core_data[] index instead of cpu_core_id + 2.

As a result, the code is not aligned with the comments, and brings user
visible changes in hwmon sysfs on systems with sparse core id.

For example, before commit 7108b80a54 ("hwmon/coretemp: Handle large
core ID value"),
/sys/class/hwmon/hwmon2/temp2_label:Core 0
/sys/class/hwmon/hwmon2/temp3_label:Core 1
/sys/class/hwmon/hwmon2/temp4_label:Core 2
/sys/class/hwmon/hwmon2/temp5_label:Core 3
/sys/class/hwmon/hwmon2/temp6_label:Core 4
/sys/class/hwmon/hwmon3/temp10_label:Core 8
/sys/class/hwmon/hwmon3/temp11_label:Core 9
after commit,
/sys/class/hwmon/hwmon2/temp2_label:Core 0
/sys/class/hwmon/hwmon2/temp3_label:Core 1
/sys/class/hwmon/hwmon2/temp4_label:Core 2
/sys/class/hwmon/hwmon2/temp5_label:Core 3
/sys/class/hwmon/hwmon2/temp6_label:Core 4
/sys/class/hwmon/hwmon2/temp7_label:Core 8
/sys/class/hwmon/hwmon2/temp8_label:Core 9

Restore the previous behavior and rework the code, comments and variable
names to avoid future confusions.

Fixes: 7108b80a54 ("hwmon/coretemp: Handle large core ID value")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20240202092144.71180-3-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-04 06:43:45 -08:00
Zhang Rui
4e440abc89 hwmon: (coretemp) Fix out-of-bounds memory access
Fix a bug that pdata->cpu_map[] is set before out-of-bounds check.
The problem might be triggered on systems with more than 128 cores per
package.

Fixes: 7108b80a54 ("hwmon/coretemp: Handle large core ID value")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20240202092144.71180-2-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-02-04 06:43:44 -08:00
Zhang Rui
bbfff736d3 hwmon: (coretemp) Fix potentially truncated sysfs attribute name
When build with W=1 and "-Werror=format-truncation", below error is
observed in coretemp driver,

   drivers/hwmon/coretemp.c: In function 'create_core_data':
>> drivers/hwmon/coretemp.c:393:34: error: '%s' directive output may be truncated writing likely 5 or more bytes into a region of size between 3 and 13 [-Werror=format-truncation=]
     393 |                          "temp%d_%s", attr_no, suffixes[i]);
         |                                  ^~
   drivers/hwmon/coretemp.c:393:26: note: assuming directive output of 5 bytes
     393 |                          "temp%d_%s", attr_no, suffixes[i]);
         |                          ^~~~~~~~~~~
   drivers/hwmon/coretemp.c:392:17: note: 'snprintf' output 7 or more bytes (assuming 22) into a destination of size 19
     392 |                 snprintf(tdata->attr_name[i], CORETEMP_NAME_LENGTH,
         |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     393 |                          "temp%d_%s", attr_no, suffixes[i]);
         |                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   cc1: all warnings being treated as errors

Given that
1. '%d' could take 10 charactors,
2. '%s' could take 10 charactors ("crit_alarm"),
3. "temp", "_" and the NULL terminator take 6 charactors,
fix the problem by increasing CORETEMP_NAME_LENGTH to 28.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Fixes: 7108b80a54 ("hwmon/coretemp: Handle large core ID value")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310200443.iD3tUbbK-lkp@intel.com/
Link: https://lore.kernel.org/r/20231025122316.836400-1-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-10-25 11:57:54 -07:00
Zhang Rui
a2930f6dc9 hwmon: (coretemp) Delete an obsolete comment
The refinement of tjmax value retrieved from MSR_IA32_TEMPERATURE_TARGET
has been changed for several times.

Now, the raw value from MSR is used without refinement. Thus remove the
obsolete comment.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230330103346.6044-2-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-04-19 07:08:39 -07:00
Zhang Rui
6c2b659913 hwmon: (coretemp) Delete tjmax debug message
After commit c0c67f8761 ("hwmon: (coretemp) Add support for dynamic
tjmax"), tjmax value is retrieved from MSR every time the temperature is
read.
This means that, with debug message enabled, the tjmax debug message is
printed out for every single temperature read for any CPU. This spams
the syslog.

Ideally, as tjmax is package scope unique, the debug message should show
once when tjmax is changed for one package. But this requires inventing
some new per-package data in the coretemp driver, and this is overkill.

To keep the code simple, delete the tjmax debug message.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20230330103346.6044-1-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-04-19 07:08:39 -07:00
Robin Murphy
6d03bbff45 hwmon: (coretemp) Simplify platform device handling
Coretemp's platform driver is unconventional. All the real work is done
globally by the initcall and CPU hotplug notifiers, while the "driver"
effectively just wraps an allocation and the registration of the hwmon
interface in a long-winded round-trip through the driver core.  The whole
logic of dynamically creating and destroying platform devices to bring
the interfaces up and down is error prone, since it assumes
platform_device_add() will synchronously bind the driver and set drvdata
before it returns, thus results in a NULL dereference if drivers_autoprobe
is turned off for the platform bus. Furthermore, the unusual approach of
doing that from within a CPU hotplug notifier, already commented in the
code that it deadlocks suspend, also causes lockdep issues for other
drivers or subsystems which may want to legitimately register a CPU
hotplug notifier from a platform bus notifier.

All of these issues can be solved by ripping this unusual behaviour out
completely, simply tying the platform devices to the lifetime of the
module itself, and directly managing the hwmon interfaces from the
hotplug notifiers. There is a slight user-visible change in that
/sys/bus/platform/drivers/coretemp will no longer appear, and
/sys/devices/platform/coretemp.n will remain present if package n is
hotplugged off, but hwmon users should really only be looking for the
presence of the hwmon interfaces, whose behaviour remains unchanged.

Link: https://lore.kernel.org/lkml/20220922101036.87457-1-janusz.krzysztofik@linux.intel.com/
Link: https://gitlab.freedesktop.org/drm/intel/issues/6641
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://lore.kernel.org/r/20230103114620.15319-1-janusz.krzysztofik@linux.intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-02-03 07:30:09 -08:00
Marcelo Tosatti
0f8b916bc5 hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
The coretemp driver uses rdmsr_on_cpu calls to read
MSR_IA32_PACKAGE_THERM_STATUS/MSR_IA32_THERM_STATUS registers,
which contain information about current core temperature.

For certain low latency applications, the RDMSR interruption exceeds
the applications requirements.

So do not create core files in sysfs, for CPUs which have
isolation and nohz_full enabled.

Temperature information from the housekeeping cores should be
sufficient to infer die temperature.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Link: https://lore.kernel.org/r/Y5zT6B1mY9/pnwJV@tpad
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2023-02-03 07:30:09 -08:00
Zhang Rui
fae30e3c20 hwmon: (coretemp) Add support for dynamic ttarget
Tjmax value retrieved from MSR_IA32_TEMPERATURE_TARGET can be changed at
runtime when the Intel SST-PP (Intel Speed Select Technology -
Performance Profile) level is changed. As a result, the ttarget value
also becomes dyamic.

Improve the code to always get updated ttarget value.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113153145.32696-4-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-04 16:45:03 -08:00
Zhang Rui
c0c67f8761 hwmon: (coretemp) Add support for dynamic tjmax
Tjmax value retrieved from MSR_IA32_TEMPERATURE_TARGET can be changed at
runtime when the Intel SST-PP (Intel Speed Select Technology -
Performance Profile) level is changed.

Improve the code to always use updated tjmax when it can be retrieved
from MSR_IA32_TEMPERATURE_TARGET.

When tjmax can not be retrieved from MSR_IA32_TEMPERATURE_TARGET, still
follow the previous logic and always use a static tjmax value.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113153145.32696-3-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-04 16:45:03 -08:00
Zhang Rui
2bc0e6d07e hwmon: (coretemp) rearrange tjmax handing code
Rearrange the tjmax handling code so that it can be used directly in
the sysfs attribute callbacks without forward declarations.

No functional change in this patch.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221113153145.32696-2-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-04 16:45:03 -08:00
Zhang Rui
5c0e64dde8 hwmon: (coretemp) Remove obsolete temp_data->valid
Checking for the valid bit of IA32_THERM_STATUS is removed in commit
bf6ea084eb ("hwmon: (coretemp) Do not return -EAGAIN for low
temperatures"), and temp_data->valid is set and never cleared when the
temperature has been read once.

Remove the obsolete temp_data->valid field.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20221108075051.5139-2-rui.zhang@intel.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-04 16:45:02 -08:00
Yang Yingliang
7dec14537c hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()
As comment of pci_get_domain_bus_and_slot() says, it returns
a pci device with refcount increment, when finish using it,
the caller must decrement the reference count by calling
pci_dev_put(). So call it after using to avoid refcount leak.

Fixes: 14513ee696 ("hwmon: (coretemp) Use PCI host bridge ID to identify CPU if necessary")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20221118093303.214163-1-yangyingliang@huawei.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-01 09:20:55 -08:00
Phil Auld
a89ff5f5cc hwmon: (coretemp) Check for null before removing sysfs attrs
If coretemp_add_core() gets an error then pdata->core_data[indx]
is already NULL and has been kfreed. Don't pass that to
sysfs_remove_group() as that will crash in sysfs_remove_group().

[Shortened for readability]
[91854.020159] sysfs: cannot create duplicate filename '/devices/platform/coretemp.0/hwmon/hwmon2/temp20_label'
<cpu offline>
[91855.126115] BUG: kernel NULL pointer dereference, address: 0000000000000188
[91855.165103] #PF: supervisor read access in kernel mode
[91855.194506] #PF: error_code(0x0000) - not-present page
[91855.224445] PGD 0 P4D 0
[91855.238508] Oops: 0000 [#1] PREEMPT SMP PTI
...
[91855.342716] RIP: 0010:sysfs_remove_group+0xc/0x80
...
[91855.796571] Call Trace:
[91855.810524]  coretemp_cpu_offline+0x12b/0x1dd [coretemp]
[91855.841738]  ? coretemp_cpu_online+0x180/0x180 [coretemp]
[91855.871107]  cpuhp_invoke_callback+0x105/0x4b0
[91855.893432]  cpuhp_thread_fun+0x8e/0x150
...

Fix this by checking for NULL first.

Signed-off-by: Phil Auld <pauld@redhat.com>
Cc: linux-hwmon@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20221117162313.3164803-1-pauld@redhat.com
Fixes: 199e0de7f5 ("hwmon: (coretemp) Merge pkgtemp with coretemp")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-12-01 09:20:24 -08:00
Zhang Rui
7108b80a54 hwmon/coretemp: Handle large core ID value
The coretemp driver supports up to a hard-coded limit of 128 cores.

Today, the driver can not support a core with an ID above that limit.
Yet, the encoding of core ID's is arbitrary (BIOS APIC-ID) and so they
may be sparse and they may be large.

Update the driver to map arbitrary core ID numbers into appropriate
array indexes so that 128 cores can be supported, no matter the encoding
of core ID's.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20221014090147.1836-3-rui.zhang@intel.com
2022-10-17 11:58:52 -07:00
Paul Fertser
952a11ca32 hwmon: cleanup non-bool "valid" data fields
We have bool so use it consistently in all the drivers.

The following Coccinelle script was used:

@@
identifier T;
type t = { char, int };
@@
struct T {
...
-	t valid;
+	bool valid;
...
}

@@
identifier v;
@@
(
- v->valid = 0
+ v->valid = false
|
- v->valid = 1
+ v->valid = true
)

followed by sed to fixup the comments:
sed '/bool valid;/{s/!=0/true/;s/zero/false/}'

Few whitespace changes were fixed manually. All modified drivers were
compile-tested.

Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Link: https://lore.kernel.org/r/20210924195202.27917-1-fercerpav@gmail.com
[groeck: Fixed up 'u8 valid' to 'boool valid' in atxp1.c]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2021-10-12 07:22:41 -07:00
Thomas Gleixner
5cfc7ac7c1 hwmon: Convert to new X86 CPU match macros
The new macro set has a consistent namespace and uses C99 initializers
instead of the grufty C89 ones.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lkml.kernel.org/r/20200320131509.859324598@linutronix.de
2020-03-24 21:33:36 +01:00
Wenwen Wang
e027a2dea5 hwmon (coretemp) Fix a memory leak bug
In coretemp_init(), 'zone_devices' is allocated through kcalloc().
However, it is not deallocated in the following execution if
platform_driver_register() fails, leading to a memory leak. To fix this
issue, introduce the 'outzone' label to free 'zone_devices' before
returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Link: https://lore.kernel.org/r/1566248402-6538-1-git-send-email-wenwen@cs.uga.edu
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2019-08-31 08:04:57 -07:00
Linus Torvalds
222a21d295 Merge branch 'x86-topology-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 topology updates from Ingo Molnar:
 "Implement multi-die topology support on Intel CPUs and expose the die
  topology to user-space tooling, by Len Brown, Kan Liang and Zhang Rui.

  These changes should have no effect on the kernel's existing
  understanding of topologies, i.e. there should be no behavioral impact
  on cache, NUMA, scheduler, perf and other topologies and overall
  system performance"

* 'x86-topology-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/rapl: Cosmetic rename internal variables in response to multi-die/pkg support
  perf/x86/intel/uncore: Cosmetic renames in response to multi-die/pkg support
  hwmon/coretemp: Cosmetic: Rename internal variables to zones from packages
  thermal/x86_pkg_temp_thermal: Cosmetic: Rename internal variables to zones from packages
  perf/x86/intel/cstate: Support multi-die/package
  perf/x86/intel/rapl: Support multi-die/package
  perf/x86/intel/uncore: Support multi-die/package
  topology: Create core_cpus and die_cpus sysfs attributes
  topology: Create package_cpus sysfs attribute
  hwmon/coretemp: Support multi-die/package
  powercap/intel_rapl: Update RAPL domain name and debug messages
  thermal/x86_pkg_temp_thermal: Support multi-die/package
  powercap/intel_rapl: Support multi-die/package
  powercap/intel_rapl: Simplify rapl_find_package()
  x86/topology: Define topology_logical_die_id()
  x86/topology: Define topology_die_id()
  cpu/topology: Export die_id
  x86/topology: Create topology_max_die_per_package()
  x86/topology: Add CPUID.1F multi-die/package support
2019-07-08 18:28:44 -07:00
Thomas Gleixner
935912c538 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 164
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation version 2 of the license this program
  is distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details you should have received a copy of the gnu general
  public license along with this program if not write to the free
  software foundation inc 51 franklin street fifth floor boston ma
  02110 1301 usa

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 12 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Richard Fontana <rfontana@redhat.com>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.745497013@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-30 11:26:38 -07:00
Len Brown
835896a59b hwmon/coretemp: Cosmetic: Rename internal variables to zones from packages
Syntax update only -- no logical or functional change.

In response to the new multi-die/package changes, update variable names to
use the more generic thermal "zone" terminology, instead of "package", as
the zones can refer to either packages or die.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: https://lkml.kernel.org/r/facecfd3525d55c2051f63a7ec709aeb03cc1dc1.1557769318.git.len.brown@intel.com
2019-05-23 10:08:36 +02:00
Zhang Rui
cfcd82e632 hwmon/coretemp: Support multi-die/package
Package temperature sensors are actually implemented in hardware per-die.

Update coretemp to be "die-aware", so it can expose mulitple sensors per
package, instead of just one.  No change to single-die/package systems.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: linux-pm@vger.kernel.org
Cc: linux-hwmon@vger.kernel.org
Link: https://lkml.kernel.org/r/ec2868f35113a01ff72d9041e0b97fc6a1c7df84.1557769318.git.len.brown@intel.com
2019-05-23 10:08:33 +02:00
Guenter Roeck
0cd709d0dd hwmon: (coretemp) Replace S_<PERMS> with octal values
Replace S_<PERMS> with octal values.

The conversion was done automatically with coccinelle. The semantic patches
and the scripts used to generate this commit log are available at
https://github.com/groeck/coccinelle-patches/hwmon/.

This patch does not introduce functional changes. It was verified by
compiling the old and new files and comparing text and data sizes.

Cc: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-12-16 15:13:41 -08:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Linus Torvalds
d4667ca142 Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
 "Here's the latest set of Spectre and PTI related fixes and updates:

  Spectre:
   - Add entry code register clearing to reduce the Spectre attack
     surface
   - Update the Spectre microcode blacklist
   - Inline the KVM Spectre helpers to get close to v4.14 performance
     again.
   - Fix indirect_branch_prediction_barrier()
   - Fix/improve Spectre related kernel messages
   - Fix array_index_nospec_mask() asm constraint
   - KVM: fix two MSR handling bugs

  PTI:
   - Fix a paranoid entry PTI CR3 handling bug
   - Fix comments

  objtool:
   - Fix paranoid_entry() frame pointer warning
   - Annotate WARN()-related UD2 as reachable
   - Various fixes
   - Add Add Peter Zijlstra as objtool co-maintainer

  Misc:
   - Various x86 entry code self-test fixes
   - Improve/simplify entry code stack frame generation and handling
     after recent heavy-handed PTI and Spectre changes. (There's two
     more WIP improvements expected here.)
   - Type fix for cache entries

  There's also some low risk non-fix changes I've included in this
  branch to reduce backporting conflicts:

   - rename a confusing x86_cpu field name
   - de-obfuscate the naming of single-TLB flushing primitives"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
  x86/entry/64: Fix CR3 restore in paranoid_exit()
  x86/cpu: Change type of x86_cache_size variable to unsigned int
  x86/spectre: Fix an error message
  x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
  selftests/x86/mpx: Fix incorrect bounds with old _sigfault
  x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
  x86/speculation: Add <asm/msr-index.h> dependency
  nospec: Move array_index_nospec() parameter checking into separate macro
  x86/speculation: Fix up array_index_nospec_mask() asm constraint
  x86/debug: Use UD2 for WARN()
  x86/debug, objtool: Annotate WARN()-related UD2 as reachable
  objtool: Fix segfault in ignore_unreachable_insn()
  selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
  selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
  selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
  selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
  selftests/x86/pkeys: Remove unused functions
  selftests/x86: Clean up and document sscanf() usage
  selftests/x86: Fix vDSO selftest segfault for vsyscall=none
  x86/entry/64: Remove the unused 'icebp' macro
  ...
2018-02-14 17:02:15 -08:00
Jia Zhang
b399151cb4 x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
x86_mask is a confusing name which is hard to associate with the
processor's stepping.

Additionally, correct an indent issue in lib/cpu.c.

Signed-off-by: Jia Zhang <qianyue.zj@alibaba-inc.com>
[ Updated it to more recent kernels. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@alien8.de
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-02-15 01:15:52 +01:00
Sinan Kaya
b9ccff233e hwmon: (coretemp) deprecate pci_get_bus_and_slot()
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Use pci_get_domain_bus_and_slot() with a domain number of 0 where we can't
extract the domain number. Other places, use the actual domain number from
the device.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2018-01-02 15:05:34 -08:00
Thomas Gleixner
90b4f30b6d hwmon: (coretemp) Handle frozen hotplug state correctly
The recent conversion to the hotplug state machine missed that the original
hotplug notifiers did not execute in the frozen state, which is used on
suspend on resume.

This does not matter on single socket machines, but on multi socket systems
this breaks when the device for a non-boot socket is removed when the last
CPU of that socket is brought offline. The device removal locks up the
machine hard w/o any debug output.

Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true.

Thanks to Tommi for providing debug information patiently while I failed to
spot the obvious.

Fixes: e00ca5df37 ("hwmon: (coretemp) Convert to hotplug state machine")
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2017-05-14 07:49:32 -07:00
Thomas Gleixner
7126684605 hwmon: (coretemp) Simplify package management
Keeping track of the per package platform devices requires an extra object,
which is held in a linked list.

The maximum number of packages is known at init() time. So the extra object
and linked list management can be replaced by an array of platform device
pointers in which the per package devices pointers can be stored. Lookup
becomes a simple array lookup instead of a list walk.

The mutex protecting the list can be removed as well because the array is
only accessed from cpu hotplug callbacks which are already serialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:13 -08:00
Thomas Gleixner
2195c31b12 hwmon: (coretemp) Use proper error codes in cpu online callback
The cpu online callback returns success unconditionally even when the
device has no support, micro code mismatches or device allocation fails.
Only if CPU_HOTPLUG is disabled, the init function checks whether the
device list is empty and removes the driver.

This does not make sense. If CPU HOTPLUG is enabled then there is no point
to keep the driver around when it failed to initialize on the already
online cpus. The chance that not yet online CPUs will provide a functional
interface later is very close to zero.

Add proper error return codes, so the setup of the cpu hotplug states fails
when the device cannot be initialized and remove all the magic cruft.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:11 -08:00
Thomas Gleixner
e00ca5df37 hwmon: (coretemp) Convert to hotplug state machine
Install the callbacks via the state machine. Setup and teardown are handled
by the hotplug core.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: linux-hwmon@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jean Delvare <jdelvare@suse.com>
Cc: rt@linuxtronix.de
Cc: Guenter Roeck <linux@roeck-us.net>
Link: http://lkml.kernel.org/r/20161117183541.8588-5-bigeasy@linutronix.de
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:10 -08:00
Thomas Gleixner
4b138cf73f hwmon: (coretemp) Avoid redundant lookups
No point in looking up the same thing over and over.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:09 -08:00
Thomas Gleixner
e1b370b640 hwmon: (coretemp) Simplify sibling management
The coretemp driver provides a sysfs interface per physical core. If
hyperthreading is enabled and one of the siblings goes offline the sysfs
interface is removed and then immeditately created again for the
sibling. The only difference of them is the target cpu for the
rdmsr_on_cpu() in the sysfs show functions.

It's way simpler to keep a cpumask of cpus which are active in a package
and only remove the interface when the last sibling goes offline. Otherwise
just move the target cpu for the sysfs show functions to the still online
sibling.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:07 -08:00
Thomas Gleixner
723f573433 hwmon: (coretemp) Fixup target cpu for package when cpu is offlined
When a CPU is offlined nothing checks whether it is the target CPU for the
package temperature sysfs interface.

As a consequence all future readouts of the package temperature return
crap:

90000

which is Tjmax of that package.

Check whether the outgoing CPU is the target for the package and assign it
to some other still online CPU in the package. Protect the change against
the rdmsr_on_cpu() in show_crit_alarm().

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2016-12-09 21:54:06 -08:00
Lukasz Odzioba
cc904f9cf2 hwmon: (coretemp) Increase limit of maximum core ID from 32 to 128.
A new limit selected arbitrarily as power of two greater than
required minimum for Xeon Phi processor (72 for Knights Landing).

Currently driver is not able to handle cores with core ID greater than 32.
Such attempt ends up with the following error in dmesg:
coretemp coretemp.0: Adding Core XXX failed

Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-10-14 07:57:14 -07:00
Bartosz Golaszewski
19a34eea4f coretemp: Replace cpu_sibling_mask() with topology_sibling_cpumask()
The former duplicates the functionality of the latter but is
neither documented nor arch-independent.

Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Cc: Benoit Cousson <bcousson@baylibre.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Jean Delvare <jdelvare@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Drokin <oleg.drokin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/1432645896-12588-4-git-send-email-bgolaszewski@baylibre.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-27 15:22:15 +02:00
Rasmus Villemoes
1055b5f904 hwmon: (coretemp) Allow format checking
By extracting the only part that differs we can allow static checking
of the format string, and possibly save a little .rodata.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
[Guenter Roeck: continuation line alignment]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2015-03-09 09:59:35 -07:00
Wolfram Sang
2a1ed07718 hwmon: drop owner assignment from platform_drivers
A platform_driver does not need to set an owner, it will be populated by the
driver core.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2014-10-20 16:20:36 +02:00
Guenter Roeck
c0940e95f7 Revert "hwmon: (coretemp) Refine TjMax detection"
This reverts commit 9fb6c9c73b.

Tjmax on some Intel CPUs is below 85 degrees C. One known example is
L5630 with Tjmax of 71 degrees C. There are other Xeon processors with
Tjmax of 70 or 80 degrees C. Also, the Intel IA32 System Programming
document states that the temperature target is in bits 23:16 of MSR 0x1a2
(MSR_TEMPERATURE_TARGET), which is 8 bits, not 7.

So even if turbostat uses similar checks to validate Tjmax, there is no
evidence that the checks are actually required. On the contrary, the
checks are known to cause problems and therefore need to be removed.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=75071.

Fixes: 9fb6c9c hwmon: (coretemp) Refine TjMax detection
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2014-05-01 04:07:52 -07:00