1051 Commits

Author SHA1 Message Date
Linus Torvalds
1746db26f8 pci-v6.13-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmdE14wUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxMPRAAslaEhHZ06cU/I+BA0UrMJBbzOw+/
 XM2XUojxWaNMYSBPVXbtSBrfFMnox4G3hFBPK0T0HiWoc7wGx/TUVJk65ioqM8ug
 gS/U3NjSlqlnH8NHxKrb/2t0tlMvSll9WwumOD9pMFeMGFOS3fAgUk+fBqXFYsI/
 RsVRMavW9BucZ0yMHpgr0KGLPSt3HK/E1h0NLO+TN6dpFcoIq3XimKFyk1QQQgiR
 V3W21JMwjw+lDnUAsijU+RBYi5Fj6Rpqig/biRnzagVE6PJOci3ZJEBE7dGqm4LM
 UlgG6Ql/eK+bb3fPhcXxVmscj5XlEfbesX5PUzTmuj79Wq5l9hpy+0c654G79y8b
 rGiEVGM0NxmRdbuhWQUM2EsffqFlkFu7MN3gH0tP0Z0t3VTXfBcGrQJfqCcSCZG3
 5IwGdEE2kmGb5c3RApZrm+HCXdxhb3Nwc3P8c27eXDT4eqHWDJag4hzLETNBdIrn
 Rsbgry6zzAVA6lLT0uasUlWerq/I6OrueJvnEKRGKDtbw/JL6PLveR1Rvsc//cQD
 Tu4FcG81bldQTUOdHEgFyJgmSu77Gvfs5RZBV0cEtcCBc33uGJne08kOdGD4BwWJ
 dqN3wJFh5yX4jlMGmBDw0KmFIwKstfUCIoDE4Kjtal02CURhz5ZCDVGNPnSUKN0C
 hflVX0//cRkHc5g=
 =2Otz
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Make pci_stop_dev() and pci_destroy_dev() safe so concurrent
     callers can't stop a device multiple times, even as we migrate from
     the global pci_rescan_remove_lock to finer-grained locking (Keith
     Busch)

   - Improve pci_walk_bus() implementation by making it recursive and
     moving locking up to avoid need for a 'locked' parameter (Keith
     Busch)

   - Unexport pci_walk_bus_locked(), which is only used internally by
     the PCI core (Keith Busch)

   - Detect some Thunderbolt chips that are built-in and hence
     'trustworthy' by a heuristic since the 'ExternalFacingPort' and
     'usb4-host-interface' ACPI properties are not quite enough (Esther
     Shimanovich)

  Resource management:

   - Use PCI bus addresses (not CPU addresses) in 'ranges' properties
     when building dynamic DT nodes so systems where PCI and CPU
     addresses differ work correctly (Andrea della Porta)

   - Tidy resource sizing and assignment with helpers to reduce
     redundancy (Ilpo Järvinen)

   - Improve pdev_sort_resources() 'bogus alignment' warning to be more
     specific (Ilpo Järvinen)

  Driver binding:

   - Convert driver .remove_new() callbacks to .remove() again to finish
     the conversion from returning 'int' to being 'void' (Sergio
     Paracuellos)

   - Export pcim_request_all_regions(), a managed interface to request
     all BARs (Philipp Stanner)

   - Replace pcim_iomap_regions_request_all() with
     pcim_request_all_regions(), and pcim_iomap_table()[n] with
     pcim_iomap(n), in the following drivers: ahci, crypto qat, crypto
     octeontx2, intel_th, iwlwifi, ntb idt, serial rp2, ALSA korg1212
     (Philipp Stanner)

   - Remove the now unused pcim_iomap_regions_request_all() (Philipp
     Stanner)

   - Export pcim_iounmap_region(), a managed interface to unmap and
     release a PCI BAR (Philipp Stanner)

   - Replace pcim_iomap_regions(mask) with pcim_iomap_region(n), and
     pcim_iounmap_regions(mask) with pcim_iounmap_region(n), in the
     following drivers: fpga dfl-pci, block mtip32xx, gpio-merrifield,
     cavium (Philipp Stanner)

  Error handling:

   - Add sysfs 'reset_subordinate' to reset the entire hierarchy below a
     bridge; previously Secondary Bus Reset could only be used when
     there was a single device below a bridge (Keith Busch)

   - Warn if we reset a running device where the driver didn't register
     pci_error_handlers notification callbacks (Keith Busch)

  ASPM:

   - Disable ASPM L1 before touching L1 PM Substates to follow the spec
     closer and avoid a CPU load timeout on some platforms (Ajay
     Agarwal)

   - Set devices below Intel VMD to D0 before enabling ASPM L1 Substates
     as required per spec for all L1 Substates changes (Jian-Hong Pan)

  Power management:

   - Enable starfive controller runtime PM before probing host bridge
     (Mayank Rana)

   - Enable runtime power management for host bridges (Krishna chaitanya
     chundru)

  Power control:

   - Use of_platform_device_create() instead of of_platform_populate()
     to create pwrctl platform devices so we can control it based on the
     child nodes (Manivannan Sadhasivam)

   - Create pwrctrl platform devices only if there's a relevant power
     supply property (Manivannan Sadhasivam)

   - Add device link from the pwrctl supplier to the PCI dev to ensure
     pwrctl drivers are probed before the PCI dev driver; this avoids a
     race where pwrctl could change device power state while the PCI
     driver was active (Manivannan Sadhasivam)

   - Find pwrctl device for removal with of_find_device_by_node()
     instead of searching all children of the parent (Manivannan
     Sadhasivam)

   - Rename 'pwrctl' to 'pwrctrl' to match new bandwidth controller
     ('bwctrl') and hotplug files (Bjorn Helgaas)

  Bandwidth control:

   - Add read/modify/write locking for Link Control 2, which is used to
     manage Link speed (Ilpo Järvinen)

   - Extract Link Bandwidth Management Status check into
     pcie_lbms_seen(), where it can be shared between the bandwidth
     controller and quirks that use it to help retrain failed links
     (Ilpo Järvinen)

   - Re-add Link Bandwidth notification support with updates to address
     the reasons it was previously reverted (Alexandru Gagniuc, Ilpo
     Järvinen)

   - Add pcie_set_target_speed() and related functionality so drivers
     can manage PCIe Link speed based on thermal or other constraints
     (Ilpo Järvinen)

   - Add a thermal cooling driver to throttle PCIe Links via the
     existing thermal management framework (Ilpo Järvinen)

   - Add a userspace selftest for the PCIe bandwidth controller (Ilpo
     Järvinen)

  PCI device hotplug:

   - Add hotplug controller driver for Marvell OCTEON multi-function
     device where function 0 has a management console interface to
     enable/disable and provision various personalities for the other
     functions (Shijith Thotton)

   - Retain a reference to the pci_bus for the lifetime of a pci_slot to
     avoid a use-after-free when the thunderbolt driver resets USB4 host
     routers on boot, causing hotplug remove/add of downstream docks or
     other devices (Lukas Wunner)

   - Remove unused cpcihp struct cpci_hp_controller_ops.hardware_test
     (Guilherme Giacomo Simoes)

   - Remove unused cpqphp struct ctrl_dbg.ctrl (Christophe JAILLET)

   - Use pci_bus_read_dev_vendor_id() instead of hand-coded presence
     detection in cpqphp (Ilpo Järvinen)

   - Simplify cpqphp enumeration, which is already simple-minded and
     doesn't handle devices below hot-added bridges (Ilpo Järvinen)

  Virtualization:

   - Add ACS quirk for Wangxun FF5xxx NICs, which don't advertise an ACS
     capability but do isolate functions as though PCI_ACS_RR and
     PCI_ACS_CR were set, so the functions can be in independent IOMMU
     groups (Mengyuan Lou)

  TLP Processing Hints (TPH):

   - Add and document TLP Processing Hints (TPH) support so drivers can
     enable and disable TPH and the kernel can save/restore TPH
     configuration (Wei Huang)

   - Add TPH Steering Tag support so drivers can retrieve Steering Tag
     values associated with specific CPUs via an ACPI _DSM to improve
     performance by directing DMA writes closer to their consumers (Wei
     Huang)

  Data Object Exchange (DOE):

   - Wait up to 1 second for DOE Busy bit to clear before writing a
     request to the mailbox to avoid failures if the mailbox is still
     busy from a previous transfer (Gregory Price)

  Endpoint framework:

   - Skip attempts to allocate from endpoint controller memory window if
     the requested size is larger than the window (Damien Le Moal)

   - Add and document pci_epc_mem_map() and pci_epc_mem_unmap() to
     handle controller-specific size and alignment constraints, and add
     test cases to the endpoint test driver (Damien Le Moal)

   - Implement dwc pci_epc_ops.align_addr() so pci_epc_mem_map() can
     observe DWC-specific alignment requirements (Damien Le Moal)

   - Synchronously cancel command handler work in endpoint test before
     cleaning up DMA and BARs (Damien Le Moal)

   - Respect endpoint page size in dw_pcie_ep_align_addr() (Niklas
     Cassel)

   - Use dw_pcie_ep_align_addr() in dw_pcie_ep_raise_msi_irq() and
     dw_pcie_ep_raise_msix_irq() instead of open coding the equivalent
     (Niklas Cassel)

   - Avoid NULL dereference if Modem Host Interface Endpoint lacks
     'mmio' DT property (Zhongqiu Han)

   - Release PCI domain ID of Endpoint controller parent (not controller
     itself) and before unregistering the controller, to avoid
     use-after-free (Zijun Hu)

   - Clear secondary (not primary) EPC in pci_epc_remove_epf() when
     removing the secondary controller associated with an NTB (Zijun Hu)

  Cadence PCIe controller driver:

   - Lower severity of 'phy-names' message (Bartosz Wawrzyniak)

  Freescale i.MX6 PCIe controller driver:

   - Fix suspend/resume support on i.MX6QDL, which has a hardware
     erratum that prevents use of L2 (Stefan Eichenberger)

  Intel VMD host bridge driver:

   - Add 0xb60b and 0xb06f Device IDs for client SKUs (Nirmal Patel)

  MediaTek PCIe Gen3 controller driver:

   - Update mediatek-gen3 DT binding to require the exact number of
     clocks for each SoC (Fei Shao)

   - Add support for DT 'max-link-speed' and 'num-lanes' properties to
     restrict the link speed and width (AngeloGioacchino Del Regno)

  Microchip PolarFlare PCIe controller driver:

   - Add DT and driver support for using either of the two PolarFire
     Root Ports (Conor Dooley)

  NVIDIA Tegra194 PCIe controller driver:

   - Move endpoint controller cleanups that depend on refclk from the
     host to the notifier that tells us the host has deasserted PERST#,
     when refclk should be valid (Manivannan Sadhasivam)

  Qualcomm PCIe controller driver:

   - Add qcom SAR2130P DT binding with an additional clock (Dmitry
     Baryshkov)

   - Enable MSI interrupts if 'global' IRQ is supported, since a
     previous commit unintentionally masked them (Manivannan Sadhasivam)

   - Move endpoint controller cleanups that depend on refclk from the
     host to the notifier that tells us the host has deasserted PERST#,
     when refclk should be valid (Manivannan Sadhasivam)

   - Add DT binding and driver support for IPQ9574, with Synopsys IP
     v5.80a and Qcom IP 1.27.0 (devi priya)

   - Move the OPP "operating-points-v2" table from the
     qcom,pcie-sm8450.yaml DT binding to qcom,pcie-common.yaml, where it
     can be used by other Qcom platforms (Qiang Yu)

   - Add 'global' SPI interrupt for events like link-up, link-down to
     qcom,pcie-x1e80100 DT binding so we can start enumeration when the
     link comes up (Qiang Yu)

   - Disable ASPM L0s for qcom,pcie-x1e80100 since the PHY is not tuned
     to support this (Qiang Yu)

   - Add ops_1_21_0 for SC8280X family SoC, which doesn't use the
     'iommu-map' DT property and doesn't need BDF-to-SID translation
     (Qiang Yu)

  Rockchip PCIe controller driver:

   - Define ROCKCHIP_PCIE_AT_SIZE_ALIGN to replace magic 256 endpoint
     .align value (Damien Le Moal)

   - When unmapping an endpoint window, compute the region index instead
     of searching for it, and verify that the address was mapped (Damien
     Le Moal)

   - When mapping an endpoint window, verify that the address hasn't
     been mapped already (Damien Le Moal)

   - Implement pci_epc_ops.align_addr() for rockchip-ep (Damien Le Moal)

   - Fix MSI IRQ data mapping to observe the alignment constraint, which
     fixes intermittent page faults in memcpy_toio() and memcpy_fromio()
     (Damien Le Moal)

   - Rename rockchip_pcie_parse_ep_dt() to
     rockchip_pcie_ep_get_resources() for consistency with similar DT
     interfaces (Damien Le Moal)

   - Skip the unnecessary link train in rockchip_pcie_ep_probe() and do
     it only in the endpoint start operation (Damien Le Moal)

   - Implement pci_epc_ops.stop_link() to disable link training and
     controller configuration (Damien Le Moal)

   - Attempt link training at 5 GT/s when both partners support it
     (Damien Le Moal)

   - Add a handler for PERST# signal so we can detect host-initiated
     resets and start link training after PERST# is deasserted (Damien
     Le Moal)

  Synopsys DesignWare PCIe controller driver:

   - Clear outbound address on unmap so dw_pcie_find_index() won't match
     an ATU index that was already unmapped (Damien Le Moal)

   - Use of_property_present() instead of of_property_read_bool() when
     testing for presence of non-boolean DT properties (Rob Herring)

   - Advertise 1MB size if endpoint supports Resizable BARs, which was
     inadvertently lost in v6.11 (Niklas Cassel)

  TI J721E PCIe driver:

   - Add PCIe support for J722S SoC (Siddharth Vadapalli)

   - Delay PCIE_T_PVPERL_MS (100 ms), not just PCIE_T_PERST_CLK_US (100
     us), before deasserting PERST# to ensure power and refclk are
     stable (Siddharth Vadapalli)

  TI Keystone PCIe controller driver:

   - Set the 'ti,keystone-pcie' mode so v3.65a devices work in Root
     Complex mode (Kishon Vijay Abraham I)

   - Try to avoid unrecoverable SError for attempts to issue config
     transactions when the link is down; this is racy but the best we
     can do (Kishon Vijay Abraham I)

  Miscellaneous:

   - Reorganize kerneldoc parameter names to match order in function
     signature (Julia Lawall)

   - Fix sysfs reset_method_store() memory leak (Todd Kjos)

   - Simplify pci_create_slot() (Ilpo Järvinen)

   - Fix incorrect printf format specifiers in pcitest (Luo Yifan)"

* tag 'pci-v6.13-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (127 commits)
  PCI: rockchip-ep: Handle PERST# signal in EP mode
  PCI: rockchip-ep: Improve link training
  PCI: rockship-ep: Implement the pci_epc_ops::stop_link() operation
  PCI: rockchip-ep: Refactor endpoint link training enable
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() MSI-X hiding
  PCI: rockchip-ep: Refactor rockchip_pcie_ep_probe() memory allocations
  PCI: rockchip-ep: Rename rockchip_pcie_parse_ep_dt()
  PCI: rockchip-ep: Fix MSI IRQ data mapping
  PCI: rockchip-ep: Implement the pci_epc_ops::align_addr() operation
  PCI: rockchip-ep: Improve rockchip_pcie_ep_map_addr()
  PCI: rockchip-ep: Improve rockchip_pcie_ep_unmap_addr()
  PCI: rockchip-ep: Use a macro to define EP controller .align feature
  PCI: rockchip-ep: Fix address translation unit programming
  PCI/pwrctrl: Rename pwrctrl functions and structures
  PCI/pwrctrl: Rename pwrctl files to pwrctrl
  PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent
  PCI/pwrctl: Ensure that pwrctl drivers are probed before PCI client drivers
  PCI/pwrctl: Create pwrctl device only if at least one power supply is present
  PCI/pwrctl: Use of_platform_device_create() to create pwrctl devices
  tools: PCI: Fix incorrect printf format specifiers
  ...
2024-11-26 18:05:44 -08:00
Bjorn Helgaas
2d56427989 Merge branch 'pci/misc'
- Reorganize kerneldoc parameter names to match order in function signature
  (Julia Lawall)

- Remove kerneldoc return value descriptions from hotplug registration
  interfaces that don't return anything (Ilpo Järvinen)

- Fix sysfs reset_method_store() memory leak (Todd Kjos)

- Simplify pci_create_slot() (Ilpo Järvinen)

- Fix incorrect printf format specifiers in pcitest (Luo Yifan)

* pci/misc:
  tools: PCI: Fix incorrect printf format specifiers
  PCI: Simplify pci_create_slot() logic
  PCI: Fix reset_method_store() memory leak
  PCI: hotplug: Remove "Returns" kerneldoc from void functions
  PCI: hotplug: Reorganize kerneldoc parameter names
2024-11-25 13:41:00 -06:00
Bjorn Helgaas
ab02bafcec Merge branch 'pci/tph'
- Add and document TLP Processing Hints (TPH) support so drivers can enable
  and disable TPH and the kernel can save/restore TPH configuration (Wei
  Huang)

- Add TPH Steering Tag support so drivers can retrieve Steering Tag values
  associated with specific CPUs via an ACPI _DSM to direct DMA writes
  closer to their consumers (Wei Huang)

* pci/tph:
  PCI/TPH: Add TPH documentation
  PCI/TPH: Add Steering Tag support
  PCI: Add TLP Processing Hints (TPH) support
2024-11-25 13:40:55 -06:00
Bjorn Helgaas
c03d361c20 Merge branch 'pci/resource'
- Add resource_set_size() to set resource size when start has already been
  set (Ilpo Järvinen)

- Add resource_set_range() helper to set both resource start and size (Ilpo
  Järvinen)

- Use IS_ALIGNED() and resource_size() in quirk_s3_64M() instead of
  open-coding them (Ilpo Järvinen)

- Add ALIGN_DOWN_IF_NONZERO() to avoid code duplication when distributing
  resources across devices (Ilpo Järvinen)

- Improve pdev_sort_resources() warning message to be more specific (Ilpo
  Järvinen)

* pci/resource:
  PCI: Improve pdev_sort_resources() warning message
  PCI: Add ALIGN_DOWN_IF_NONZERO() helper
  PCI: Use align and resource helpers, and SZ_* in quirk_s3_64M()
  PCI: Use resource_set_{range,size}() helpers
  resource: Add resource set range and size helpers
2024-11-25 13:40:55 -06:00
Bjorn Helgaas
d985e2beb9 Merge branch 'pci/reset'
- Add sysfs 'reset_subordinate' to reset hierarchy below bridge (Keith
  Busch)

- Warn if we reset a running device where driver didn't register
  pci_error_handlers notification callbacks (Keith Busch)

* pci/reset:
  PCI: Warn if a running device is unaware of reset
  PCI: Add 'reset_subordinate' to reset hierarchy below bridge
2024-11-25 13:40:54 -06:00
Linus Torvalds
e6de688e93 Devicetree updates for v6.13:
Bindings:
 
 - Enable dtc "interrupt_provider" warnings for binding examples.
   Fix the warnings in fsl,mu-msi and ti,sci-inta due to this.
 
 - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton,  and
   altr,fpga-passive-serial to DT schema format
 
 - Add some documentation on the different forms of YAML text blocks
   which are a constant source of review comments
 
 - Fix some schema errors in constraints for arrays
 
 - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462
 
 DT core:
 
 - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
 
 - Add some warnings on deprecated address handling
 
 - Rework early_init_dt_scan() so the arch can pass in the phys address
   of the DTB as __pa() is not always valid to use. This fixes a warning
   for arm64 with kexec.
 
 - Add and use some new DT graph iterators for iterating over ports and
   endpoints
 
 - Rework reserved-memory handling to be sized dynamically for fixed
   regions
 
 - Optimize of_modalias() to avoid a strlen() call
 
 - Constify struct device_node and property pointers where ever possible
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmc7qaoACgkQ+vtdtY28
 YcN54g/+Ifz4hQTSWV+VBhihovMMPiQUdxZ+MfJfPnPcZ7NJzaTf+zqhZyS4wQou
 v0pdtyR0B1fCM/EvKaYD+1aTTAQFEIT5Dqac+9ePwqaYqSk+yCTxyzW9m+P3rTPV
 THo8SGRss7T+Rs+2WaUGxphTJItMGIRdbBvoqK+82EdKFXXKw2BSD8tlJTWwbTam
 9xkrpUzw7f4FvVY8vVhRyOd5i8/v+FH8D65DMIT6ME9zRn4MzKVzCg6udgYeCBld
 C2XbV+wnyewtjrN2IX+2uQ2mheb7yJu3AEI3iFR5x/sRrsSLpisxrUl38xOOpxrM
 XxYtHgE3omjagQ+y+L2PMthlKvhFrXVXIvhUH8xxje5z1Vyq3VMfiABkHlMpAnys
 5LY4xEhvqDkPNo65UmjMiHxGW/xtcKsmAZBOp+HLerZfCJIFvl380fi8mNg/Sjvz
 7ExCSpzCPsHASZg7QCTplU3BUtg+067Ch/k8Hsn/Og73Pqm3xH4IezQZKwweN9ZT
 LC6OQBI7C3Yt1hom9qgUcA4H4/aaPxTVV7i0DGuAKh8Lon6SaoX2yFpweUBgbsL/
 c9DIW4vbYBIGASxxUbHlNMKvPCKACKmpFXhsnH5Waj+VWSOwsJ8bjGpH8PfMKdFW
 dyJB/r94GqCGpCW7+FC1qGmXiQJGkCo89pKBVjSf4Kj45ht/76o=
 =NCYS
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree updates from Rob Herring:
 "Bindings:

   - Enable dtc "interrupt_provider" warnings for binding examples. Fix
     the warnings in fsl,mu-msi and ti,sci-inta due to this.

   - Convert zii,rave-sp-wdt, zii,rave-sp-pwrbutton, and
     altr,fpga-passive-serial to DT schema format

   - Add some documentation on the different forms of YAML text blocks
     which are a constant source of review comments

   - Fix some schema errors in constraints for arrays

   - Add compatibles for qcom,sar2130p-pdc and onnn,adt7462

  DT core:

   - Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n

   - Add some warnings on deprecated address handling

   - Rework early_init_dt_scan() so the arch can pass in the phys
     address of the DTB as __pa() is not always valid to use. This fixes
     a warning for arm64 with kexec.

   - Add and use some new DT graph iterators for iterating over ports
     and endpoints

   - Rework reserved-memory handling to be sized dynamically for fixed
     regions

   - Optimize of_modalias() to avoid a strlen() call

   - Constify struct device_node and property pointers where ever
     possible"

* tag 'devicetree-for-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (36 commits)
  of: Allow overlay kunit tests to run CONFIG_OF_OVERLAY=n
  dt-bindings: interrupt-controller: qcom,pdc: Add SAR2130P compatible
  of/address: Rework bus matching to avoid warnings
  of: WARN on deprecated #address-cells/#size-cells handling
  of/fdt: Don't use default address cell sizes for address translation
  dt-bindings: Enable dtc "interrupt_provider" warnings
  of/fdt: add dt_phys arg to early_init_dt_scan and early_init_dt_verify
  dt-bindings: cache: qcom,llcc: Fix X1E80100 reg entries
  dt-bindings: watchdog: convert zii,rave-sp-wdt.txt to yaml format
  dt-bindings: input: convert zii,rave-sp-pwrbutton.txt to yaml
  media: xilinx-tpg: use new of_graph functions
  fbdev: omapfb: use new of_graph functions
  gpu: drm: omapdrm: use new of_graph functions
  ASoC: audio-graph-card2: use new of_graph functions
  ASoC: audio-graph-card: use new of_graph functions
  ASoC: test-component: use new of_graph functions
  of: property: use new of_graph functions
  of: property: add of_graph_get_next_port_endpoint()
  of: property: add of_graph_get_next_port()
  of: module: remove strlen() call in of_modalias()
  ...
2024-11-20 13:19:25 -08:00
Ilpo Järvinen
665745f274 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
This mostly reverts the commit b4c7d2076b4e ("PCI/LINK: Remove bandwidth
notification"). An upcoming commit extends this driver building PCIe
bandwidth controller on top of it.

PCIe bandwidth notifications were first added in the commit e8303bb7a75c
("PCI/LINK: Report degraded links via link bandwidth notification") but
later had to be removed. The significant changes compared with the old
bandwidth notification driver include:

1) Don't print the notifications into kernel log, just keep the Link
   Speed cached in struct pci_bus updated. While somewhat unfortunate,
   the log spam was the source of complaints that eventually lead to
   the removal of the bandwidth notifications driver (see the links
   below for further information).

2) Besides the Link Bandwidth Management Interrupt, also enable Link
   Autonomous Bandwidth Interrupt to cover the other source of bandwidth
   changes.

3) Handle Link Speed updates robustly. Refresh the cached Link Speed
   when enabling Bandwidth Notification Interrupts, and solve the race
   between Link Speed read and LBMS/LABS update in
   pcie_bwnotif_irq_thread().

4) Use concurrency safe LNKCTL RMW operations.

5) The driver is now called PCIe bwctrl (bandwidth controller) instead
   of just bandwidth notifications because of increased scope and
   functionality within the driver.

6) Coexist with the Target Link Speed quirk in pcie_failed_link_retrain().
   Provide LBMS counting API for it.

7) Tweaks to variable/functions names for consistency and length reasons.

Bandwidth Notifications enable the cur_bus_speed in the struct pci_bus to
keep track PCIe Link Speed changes.

[bhelgaas: This is based on previous work by Alexandru Gagniuc
<mr.nuke.me@gmail.com>; see e8303bb7a75c ("PCI/LINK: Report degraded links
via link bandwidth notification")]

Link: https://lore.kernel.org/r/20241018144755.7875-7-ilpo.jarvinen@linux.intel.com
Link: https://lore.kernel.org/all/20190429185611.121751-1-helgaas@kernel.org/
Link: https://lore.kernel.org/linux-pci/20190501142942.26972-1-keith.busch@intel.com/
Link: https://lore.kernel.org/linux-pci/20200115221008.GA191037@google.com/
Suggested-by: Lukas Wunner <lukas@wunner.de> # Building bwctrl on top of bwnotif
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash fix to drop IRQF_ONESHOT and convert to hardirq handler:
https://lore.kernel.org/r/20241115165717.15233-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-16 10:09:04 -06:00
Keith Busch
a3151e6daa PCI: Warn if a running device is unaware of reset
If a reset is issued to a running device with a driver that didn't register
the notification callbacks, the driver may be unaware of this event and
have an inconsistent view of the device's state. Log a warning of this
event because there's nothing else indicating the event occured, which
could be confusing when debugging such situations.

Link: https://lore.kernel.org/r/20241025222755.3756162-2-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
2024-11-13 16:48:49 -06:00
Keith Busch
2fa046449a PCI: Add 'reset_subordinate' to reset hierarchy below bridge
The "bus" and "cxl_bus" reset methods reset a device by asserting Secondary
Bus Reset on the bridge leading to the device.  These only work if the
device is the only device below the bridge.

Add a sysfs 'reset_subordinate' attribute on bridges that can assert
Secondary Bus Reset regardless of how many devices are below the bridge.

This resets all the devices below a bridge in a single command, including
the locking and config space save/restore that reset methods normally do.

This may be the only way to reset devices that don't support other reset
methods (ACPI, FLR, PM reset, etc).

Link: https://lore.kernel.org/r/20241025222755.3756162-1-kbusch@meta.com
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: commit log, add capable(CAP_SYS_ADMIN) check]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Amey Narkhede <ameynarkhede03@gmail.com>
2024-11-13 16:43:34 -06:00
Ilpo Järvinen
d2bd39c045 PCI: Store all PCIe Supported Link Speeds
The PCIe bandwidth controller added by a subsequent commit will require
selecting PCIe Link Speeds that are lower than the Maximum Link Speed.

The struct pci_bus only stores max_bus_speed. Even if PCIe r6.1 sec 8.2.1
currently disallows gaps in supported Link Speeds, the Implementation Note
in PCIe r6.1 sec 7.5.3.18, recommends determining supported Link Speeds
using the Supported Link Speeds Vector in the Link Capabilities 2 Register
(when available) to "avoid software being confused if a future
specification defines Links that do not require support for all slower
speeds."

Reuse code in pcie_get_speed_cap() to add pcie_get_supported_speeds() to
query the Supported Link Speeds Vector of a PCIe device. The value is taken
directly from the Supported Link Speeds Vector or synthesized from the Max
Link Speed in the Link Capabilities Register when the Link Capabilities 2
Register is not available.

The Supported Link Speeds Vector in the Link Capabilities Register 2
corresponds to the bus below on Root Ports and Downstream Ports, whereas it
corresponds to the bus above on Upstream Ports and Endpoints (PCIe r6.1 sec
7.5.3.18):

  Supported Link Speeds Vector - This field indicates the supported Link
  speed(s) of the associated Port.

Add supported_speeds into the struct pci_dev that caches the
Supported Link Speeds Vector.

supported_speeds contains a set of Link Speeds only in the case where PCIe
Link Speed can be determined. Root Complex Integrated Endpoints do not have
a well-defined Link Speed because they do not implement either of the Link
Capabilities Registers, which is allowed by PCIe r6.1 sec 7.5.3 (the same
limitation applies to determining cur_bus_speed and max_bus_speed that are
PCI_SPEED_UNKNOWN in such case). This is of no concern from PCIe bandwidth
controller point of view because such devices are not attached into a PCIe
Root Port that could be controlled.

The supported_speeds field keeps the extra reserved zero at the least
significant bit to match the Link Capabilities 2 Register layout.

An attempt was made to store supported_speeds field into the struct pci_bus
as an intersection of both ends of the Link, however, the subordinate
struct pci_bus is not available early enough. The Target Speed quirk (in
pcie_failed_link_retrain()) can run either during initial scan or later,
requiring it to use the API provided by the PCIe bandwidth controller to
set the Target Link Speed in order to co-exist with the bandwidth
controller. When the Target Speed quirk is calling the bandwidth controller
during initial scan, the struct pci_bus is not yet initialized. As such,
storing supported_speeds into the struct pci_bus is not viable.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/20241018144755.7875-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: move pcie_get_supported_speeds() decl to drivers/pci/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-11 14:19:30 -06:00
Jason Gunthorpe
f3c3ccc4fe PCI: Fix pci_enable_acs() support for the ACS quirks
There are ACS quirks that hijack the normal ACS processing and deliver to
to special quirk code. The enable path needs to call
pci_dev_specific_enable_acs() and then pci_dev_specific_acs_enabled() will
report the hidden ACS state controlled by the quirk.

The recent rework got this out of order and we should try to call
pci_dev_specific_enable_acs() regardless of any actual ACS support in the
device.

As before command line parameters that effect standard PCI ACS don't
interact with the quirk versions, including the new config_acs= option.

Link: https://lore.kernel.org/r/0-v1-f96b686c625b+124-pci_acs_quirk_fix_jgg@nvidia.com
Fixes: 47c8846a49ba ("PCI: Extend ACS configurability")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/e89107da-ac99-4d3a-9527-a4df9986e120@kernel.org
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1229019
Tested-by: Steffen Dirkwinkel <me@steffen.cc>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-10-29 11:35:52 -05:00
Rob Herring (Arm)
f68303cf1c PCI: Constify pci_register_io_range() fwnode_handle
pci_register_io_range() does not modify the passed in fwnode_handle, so
make it const.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20241010-dt-const-v1-1-87a51f558425@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-15 08:58:35 -05:00
Ilpo Järvinen
783602c920 PCI: Use resource_set_{range,size}() helpers
Convert open-coded resource size calculations to use
resource_set_{range,size}() helpers.

While at it, use SZ_* for size parameter where appropriate which makes the
intent of code more obvious.

Also, cast sizes to resource_size_t, not u64.

Link: https://lore.kernel.org/r/20240614100606.15830-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-10-10 17:44:57 -05:00
Todd Kjos
2985b1844f PCI: Fix reset_method_store() memory leak
In reset_method_store(), a string is allocated via kstrndup() and assigned
to the local "options". options is then used in with strsep() to find
spaces:

  while ((name = strsep(&options, " ")) != NULL) {

If there are no remaining spaces, then options is set to NULL by strsep(),
so the subsequent kfree(options) doesn't free the memory allocated via
kstrndup().

Fix by using a separate tmp_options to iterate with strsep() so options is
preserved.

Link: https://lore.kernel.org/r/20241001231147.3583649-1-tkjos@google.com
Fixes: d88f521da3ef ("PCI: Allow userspace to query and set device reset mechanism")
Signed-off-by: Todd Kjos <tkjos@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-10-02 16:36:35 -05:00
Wei Huang
f69767a1ad PCI: Add TLP Processing Hints (TPH) support
Add support for PCIe TLP Processing Hints (TPH) support (see PCIe r6.2,
sec 6.17).

Add TPH register definitions in pci_regs.h, including the TPH Requester
capability register, TPH Requester control register, TPH Completer
capability, and the ST fields of MSI-X entry.

Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
toggle TPH support and configure specific ST mode as needed. Also add a new
kernel parameter, "pci=notph", allowing users to disable TPH support across
the entire system.

Link: https://lore.kernel.org/r/20241002165954.128085-2-wei.huang2@amd.com
Co-developed-by: Jing Liu <jing2.liu@intel.com>
Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2024-10-02 16:20:01 -05:00
Linus Torvalds
3a37872316 pci-v6.12-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmbseugUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxdwxAAvdvDyTuiPo2R8pQtvKg4YL2IUnK5
 UR28mBxZDK5DFhLtD/QzmVVG/eaLY6bJHthHgJgTApzekkqU0h9dcRI0eegXrvcz
 I3HRsZK2yatUky9l8O148OLzF897r7vXL3QtGe6qjKU+9D83IEeooLKgBca+GoBC
 bRLvG/fYRzdjOe8UHFqCoeMIg3IOY7CNifvFOihAGpJpxfZQktj6hSKu6q7BL1Rx
 NRgYlxh0eLcb7vAJqz6RZpQ8PRCwhAjlDuu0BOkES8/6EwisD1xUh3qdDxfVgNA6
 FpcAb/53yr46cs4tM9ZTwluka86AskuXj3jwSKf7nE3zqr4nM9OD3sGOSYzK8UdE
 EDBKj+9iEpYRC6rJMk5gNH2AZkR1OEpNUisR6+kEn81A9yNNoTmkHdHUOWo8TuxD
 btc0sTM+eWApvTiZwgL4VjMZulQllV51K8tcfvODRhlMkbOPNWGWdmpWqEbUS2HU
 i7+zzQC3DC5iPlAKgRSeYB0aad6la6brqPW16sGhGovNhgwbzakDLCUJJGn/LNuO
 wd0UNpJTnHlfChbvNh2bBxiMOo0cab1tJ5Jp97STQYhLg2nW93s/dAfdpSAsYO4S
 5YzjSADWeyeuDsHE1RdUdDvYAPMb1VZBUd2OSHis5zw7kmh25c9KYXEkDJ25q/ju
 sVXK4oMNW/Gnd5M=
 =L3s9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Wait for device readiness after reset by polling Vendor ID and
     looking for Configuration RRS instead of polling the Command
     register and looking for non-error completions, to avoid hardware
     retries done for RRS on non-Vendor ID reads (Bjorn Helgaas)

   - Rename CRS Completion Status to RRS ('Request Retry Status') to
     match PCIe r6.0 spec usage (Bjorn Helgaas)

   - Clear LBMS bit after a manual link retrain so we don't try to
     retrain a link when there's no downstream device anymore (Maciej W.
     Rozycki)

   - Revert to the original link speed after retraining fails instead of
     leaving it restricted to 2.5GT/s, so a future device has a chance
     to use higher speeds (Maciej W. Rozycki)

   - Wait for each level of downstream bus, not just the first, to
     become accessible before restoring devices on that bus (Ilpo
     Järvinen)

   - Add ARCH_PCI_DEV_GROUPS so s390 can add its own attribute_groups
     without having to stomp on the core's pdev->dev.groups (Lukas
     Wunner)

  Driver binding:

   - Export pcim_request_region(), a managed counterpart of
     pci_request_region(), for use by drivers (Philipp Stanner)

   - Export pcim_iomap_region() and deprecate pcim_iomap_regions()
     (Philipp Stanner)

   - Request the PCI BAR used by xboxvideo (Philipp Stanner)

   - Request and map drm/ast BARs with pcim_iomap_region() (Philipp
     Stanner)

  MSI:

   - Add MSI_FLAG_NO_AFFINITY flag for devices that mux MSIs onto a
     single IRQ line and cannot set the affinity of each MSI to a
     specific CPU core (Marek Vasut)

   - Use MSI_FLAG_NO_AFFINITY and remove unnecessary .irq_set_affinity()
     implementations in aardvark, altera, brcmstb, dwc, mediatek-gen3,
     mediatek, mobiveil, plda, rcar, tegra, vmd, xilinx-nwl,
     xilinx-xdma, and xilinx drivers to avoid 'IRQ: set affinity failed'
     warnings (Marek Vasut)

  Power management:

   - Add pwrctl support for ATH11K inside the WCN6855 package (Konrad
     Dybcio)

  PCI device hotplug:

   - Remove unnecessary hpc_ops struct from shpchp (ngn)

   - Check for PCI_POSSIBLE_ERROR(), not 0xffffffff, in cpqphp
     (weiyufeng)

  Virtualization:

   - Mark Creative Labs EMU20k2 INTx masking as broken (Alex Williamson)

   - Add an ACS quirk for Qualcomm SA8775P, which doesn't advertise ACS
     but does provide ACS-like features (Subramanian Ananthanarayanan)

  IOMMU:

   - Add function 0 DMA alias quirk for Glenfly Arise audio function,
     which uses the function 0 Requester ID (WangYuli)

  NPEM:

   - Add Native PCIe Enclosure Management (NPEM) support for sysfs
     control of NVMe RAID storage indicators (ok/fail/locate/
     rebuild/etc) (Mariusz Tkaczyk)

   - Add support for the ACPI _DSM PCIe SSD status LED management, which
     is functionally similar to NPEM but mediated by platform firmware
     (Mariusz Tkaczyk)

  Device trees:

   - Drop minItems and maxItems from ranges in PCI generic host binding
     since host bridges may have several MMIO and I/O port apertures
     (Frank Li)

   - Add kirin, rcar-gen2, uniphier DT binding top-level constraints for
     clocks (Krzysztof Kozlowski)

  Altera PCIe controller driver:

   - Convert altera DT bindings from text to YAML (Matthew Gerlach)

   - Replace TLP_REQ_ID() with macro PCI_DEVID(), which does the same
     thing and is what other drivers use (Jinjie Ruan)

  Broadcom STB PCIe controller driver:

   - Add DT binding maxItems for reset controllers (Jim Quinlan)

   - Use the 'bridge' reset method if described in the DT (Jim Quinlan)

   - Use the 'swinit' reset method if described in the DT (Jim Quinlan)

   - Add 'has_phy' so the existence of a 'rescal' reset controller
     doesn't imply software control of it (Jim Quinlan)

   - Add support for many inbound DMA windows (Jim Quinlan)

   - Rename SoC 'type' to 'soc_base' express the fact that SoCs come in
     families of multiple similar devices (Jim Quinlan)

   - Add Broadcom 7712 DT description and driver support (Jim Quinlan)

   - Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings for
     maintainability (Bjorn Helgaas)

  Freescale i.MX6 PCIe controller driver:

   - Add imx6q-pcie 'dbi2' and 'atu' reg-names for i.MX8M Endpoints
     (Richard Zhu)

   - Fix a code restructuring error that caused i.MX8MM and i.MX8MP
     Endpoints to fail to establish link (Richard Zhu)

   - Fix i.MX8MP Endpoint occasional failure to trigger MSI by enforcing
     outbound alignment requirement (Richard Zhu)

   - Call phy_power_off() in the .probe() error path (Frank Li)

   - Rename internal names from imx6_* to imx_* since i.MX7/8/9 are also
     supported (Frank Li)

   - Manage Refclk by using SoC-specific callbacks instead of switch
     statements (Frank Li)

   - Manage core reset by using SoC-specific callbacks instead of switch
     statements (Frank Li)

   - Expand comments for erratum ERR010728 workaround (Frank Li)

   - Use generic PHY APIs to configure mode, speed, and submode, which
     is harmless for devices that implement their own internal PHY
     management and don't set the generic imx_pcie->phy (Frank Li)

   - Add i.MX8Q (i.MX8QM, i.MX8QXP, and i.MX8DXL) DT binding and driver
     Root Complex support (Richard Zhu)

  Freescale Layerscape PCIe controller driver:

   - Replace layerscape-pcie DT binding compatible fsl,lx2160a-pcie with
     fsl,lx2160ar2-pcie (Frank Li)

   - Add layerscape-pcie DT binding deprecated 'num-viewport' property
     to address a DT checker warning (Frank Li)

   - Change layerscape-pcie DT binding 'fsl,pcie-scfg' to phandle-array
     (Frank Li)

  Loongson PCIe controller driver:

   - Increase max PCI hosts to 8 for Loongson-3C6000 and newer chipsets
     (Huacai Chen)

  Marvell Aardvark PCIe controller driver:

   - Fix issue with emulating Configuration RRS for two-byte reads of
     Vendor ID; previously it only worked for four-byte reads (Bjorn
     Helgaas)

  MediaTek PCIe Gen3 controller driver:

   - Add per-SoC struct mtk_gen3_pcie_pdata to support multiple SoC
     types (Lorenzo Bianconi)

   - Use reset_bulk APIs to manage PHY reset lines (Lorenzo Bianconi)

   - Add DT and driver support for Airoha EN7581 PCIe controller
     (Lorenzo Bianconi)

  Qualcomm PCIe controller driver:

   - Update qcom,pcie-sc7280 DT binding with eight interrupts (Rayyan
     Ansari)

   - Add back DT 'vddpe-3v3-supply', which was incorrectly removed
     earlier (Johan Hovold)

   - Drop endpoint redundant masking of global IRQ events (Manivannan
     Sadhasivam)

   - Clarify unknown global IRQ message and only log it once to avoid a
     flood (Manivannan Sadhasivam)

   - Add 'linux,pci-domain' property to endpoint DT binding (Manivannan
     Sadhasivam)

   - Assign PCI domain number for endpoint controllers (Manivannan
     Sadhasivam)

   - Add 'qcom_pcie_ep' and the PCI domain number to IRQ names for
     endpoint controller (Manivannan Sadhasivam)

   - Add global SPI interrupt for PCIe link events to DT binding
     (Manivannan Sadhasivam)

   - Add global RC interrupt handler to handle 'Link up' events and
     automatically enumerate hot-added devices (Manivannan Sadhasivam)

   - Avoid mirroring of DBI and iATU register space so it doesn't
     overlap BAR MMIO space (Prudhvi Yarlagadda)

   - Enable controller resources like PHY only after PERST# is
     deasserted to partially avoid the problem that the endpoint SoC
     crashes when accessing things when Refclk is absent (Manivannan
     Sadhasivam)

   - Add 16.0 GT/s equalization and RX lane margining settings (Shashank
     Babu Chinta Venkata)

   - Pass domain number to pci_bus_release_domain_nr() explicitly to
     avoid a NULL pointer dereference (Manivannan Sadhasivam)

  Renesas R-Car PCIe controller driver:

   - Make the read-only const array 'check_addr' static (Colin Ian King)

   - Add R-Car V4M (R8A779H0) PCIe host and endpoint to DT binding
     (Yoshihiro Shimoda)

  TI DRA7xx PCIe controller driver:

   - Request IRQF_ONESHOT for 'dra7xx-pcie-main' IRQ since the primary
     handler is NULL (Siddharth Vadapalli)

   - Handle IRQ request errors during root port and endpoint probe
     (Siddharth Vadapalli)

  TI J721E PCIe driver:

   - Add DT 'ti,syscon-acspcie-proxy-ctrl' and driver support to enable
     the ACSPCIE module to drive Refclk for the Endpoint (Siddharth
     Vadapalli)

   - Extract the cadence link setup from cdns_pcie_host_setup() so link
     setup can be done separately during resume (Thomas Richard)

   - Add T_PERST_CLK_US definition for the mandatory delay between
     Refclk becoming stable and PERST# being deasserted (Thomas Richard)

   - Add j721e suspend and resume support (Théo Lebrun)

  TI Keystone PCIe controller driver:

   - Fix NULL pointer checking when applying MRRS limitation quirk for
     AM65x SR 1.0 Errata #i2037 (Dan Carpenter)

  Xilinx NWL PCIe controller driver:

   - Fix off-by-one error in INTx IRQ handler that caused INTx
     interrupts to be lost or delivered as the wrong interrupt (Sean
     Anderson)

   - Rate-limit misc interrupt messages (Sean Anderson)

   - Turn off the clock on probe failure and device removal (Sean
     Anderson)

   - Add DT binding and driver support for enabling/disabling PHYs (Sean
     Anderson)

   - Add PCIe phy bindings for the ZCU102 (Sean Anderson)

  Xilinx XDMA PCIe controller driver:

   - Add support for Xilinx QDMA Soft IP PCIe Root Port Bridge to DT
     binding and xilinx-dma-pl driver (Thippeswamy Havalige)

  Miscellaneous:

   - Fix buffer overflow in kirin_pcie_parse_port() (Alexandra Diupina)

   - Fix minor kerneldoc issues and typos (Bjorn Helgaas)

   - Use PCI_DEVID() macro in aer_inject() instead of open-coding it
     (Jinjie Ruan)

   - Check pcie_find_root_port() return in x86 fixups to avoid NULL
     pointer dereferences (Samasth Norway Ananda)

   - Make pci_bus_type constant (Kunwu Chan)

   - Remove unused declarations of __pci_pme_wakeup() and
     pci_vpd_release() (Yue Haibing)

   - Remove any leftover .*.cmd files with make clean (zhang jiao)

   - Remove unused BILLION macro (zhang jiao)"

* tag 'pci-v6.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (132 commits)
  PCI: Fix typos
  dt-bindings: PCI: qcom: Allow 'vddpe-3v3-supply' again
  tools: PCI: Remove unused BILLION macro
  tools: PCI: Remove .*.cmd files with make clean
  PCI: Pass domain number to pci_bus_release_domain_nr() explicitly
  PCI: dra7xx: Fix error handling when IRQ request fails in probe
  PCI: dra7xx: Fix threaded IRQ request for "dra7xx-pcie-main" IRQ
  PCI: qcom: Add RX lane margining settings for 16.0 GT/s
  PCI: qcom: Add equalization settings for 16.0 GT/s
  PCI: dwc: Always cache the maximum link speed value in dw_pcie::max_link_speed
  PCI: dwc: Rename 'dw_pcie::link_gen' to 'dw_pcie::max_link_speed'
  PCI: qcom-ep: Enable controller resources like PHY only after refclk is available
  PCI: Mark Creative Labs EMU20k2 INTx masking as broken
  dt-bindings: PCI: imx6q-pcie: Add reg-name "dbi2" and "atu" for i.MX8M PCIe Endpoint
  dt-bindings: PCI: altera: msi: Convert to YAML
  PCI: imx6: Add i.MX8Q PCIe Root Complex (RC) support
  PCI: Rename CRS Completion Status to RRS
  PCI: aardvark: Correct Configuration RRS checking
  PCI: Wait for device readiness with Configuration RRS
  PCI: brcmstb: Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings
  ...
2024-09-23 12:47:06 -07:00
Bjorn Helgaas
45e981b86d Merge branch 'pci/controller/qcom'
- Drop endpoint redundant masking of global IRQ events (Manivannan
  Sadhasivam)

- Clarify unknown global IRQ message and only log it once to avoid a flood
  (Manivannan Sadhasivam)

- Add Manivannan Sadhasivam as maintainer of qcom endpoint driver
  (Manivannan Sadhasivam)

- Add 'linux,pci-domain' property to endpoint DT binding (Manivannan
  Sadhasivam)

- Assign PCI domain number for endpoint controllers (Manivannan Sadhasivam)

- Add 'qcom_pcie_ep' and the PCI domain number to IRQ names for endpoint
  controller (Manivannan Sadhasivam)

- Add global SPI interrupt for PCIe link events to DT binding (Manivannan
  Sadhasivam)

- Add global RC interrupt handler to handle 'Link up' events and
  automatically enumerate hot-added devices (Manivannan Sadhasivam)

- Avoid mirroring of DBI and iATU register space so it doesn't overlap BAR
  MMIO space (Prudhvi Yarlagadda)

- Enable controller resources like PHY only after PERST# is deasserted to
  partially avoid the problem that the endpoint SoC crashes when accessing
  things when Refclk is absent (Manivannan Sadhasivam)

- Rename dw_pcie.link_gen to max_link_speed to avoid ambiguity (Manivannan
  Sadhasivam)

- Cache maximum link speed value in dw_pcie.max_link_speed for use by
  vendor drivers (Manivannan Sadhasivam)

- Add 16.0 GT/s equalization and RX lane margining settings (Shashank Babu
  Chinta Venkata)

- Pass domain number to pci_bus_release_domain_nr() explicitly to avoid a
  NULL pointer dereference (Manivannan Sadhasivam)

* pci/controller/qcom:
  PCI: Pass domain number to pci_bus_release_domain_nr() explicitly
  PCI: qcom: Add RX lane margining settings for 16.0 GT/s
  PCI: qcom: Add equalization settings for 16.0 GT/s
  PCI: dwc: Always cache the maximum link speed value in dw_pcie::max_link_speed
  PCI: dwc: Rename 'dw_pcie::link_gen' to 'dw_pcie::max_link_speed'
  PCI: qcom-ep: Enable controller resources like PHY only after refclk is available
  PCI: qcom: Disable mirroring of DBI and iATU register space in BAR region
  PCI: qcom: Enumerate endpoints based on Link up event in 'global_irq' interrupt
  dt-bindings: PCI: qcom,pcie-sm8450: Add 'global' interrupt
  PCI: qcom-ep: Modify 'global_irq' and 'perst_irq' IRQ device names
  PCI: endpoint: Assign PCI domain number for endpoint controllers
  dt-bindings: PCI: pci-ep: Document 'linux,pci-domain' property
  dt-bindings: PCI: pci-ep: Update Maintainers
  PCI: qcom-ep: Reword the error message for receiving unknown global IRQ event
  PCI: qcom-ep: Drop the redundant masking of global IRQ events
2024-09-19 14:25:32 -05:00
Bjorn Helgaas
f2a3ce1597 Merge branch 'pci/reset'
- Wait for each level of downstream bus, not just the first, to become
  accessible before restoring devices on that bus (Ilpo Järvinen)

* pci/reset:
  PCI: Wait for Link before restoring Downstream Buses
2024-09-19 14:25:27 -05:00
Bjorn Helgaas
dffe4cca2e Merge branch 'pci/enumeration'
- Clear LBMS bit after a manual link retrain so we don't try to retrain a
  link when there's no downstream device anymore (Maciej W. Rozycki)

- Revert to the original link speed after retraining fails instead of
  leaving it restricted to 2.5GT/s, so a future device has a chance to use
  higher speeds (Maciej W. Rozycki)

- Correct interpretation of pcie_retrain_link() return status and update it
  to return 0/errno instead of true/false (Maciej W.  Rozycki)

* pci/enumeration:
  PCI: Use an error code with PCIe failed link retraining
  PCI: Correct error reporting with PCIe failed link retraining
  PCI: Revert to the original speed after PCIe failed link retraining
  PCI: Clear the LBMS bit after a link retrain
2024-09-19 14:25:25 -05:00
Manivannan Sadhasivam
0cca961a02
PCI: Pass domain number to pci_bus_release_domain_nr() explicitly
The pci_bus_release_domain_nr() API is supposed to free the domain
number allocated by pci_bus_find_domain_nr(). Most of the callers of
pci_bus_find_domain_nr(), store the domain number in pci_bus::domain_nr.

As such, the pci_bus_release_domain_nr() implicitly frees the domain
number by dereferencing 'struct pci_bus'. However, one of the callers
of this API, the PCI endpoint subsystem, doesn't have 'struct pci_bus',
so it only passes NULL. Due to this, the API will end up dereferencing
the NULL pointer.

To fix this issue, pass the domain number to this API explicitly. Since
'struct pci_bus' is not used for anything else other than extracting the
domain number, it makes sense to pass the domain number directly.

Fixes: 0328947c5032 ("PCI: endpoint: Assign PCI domain number for endpoint controllers")
Closes: https://lore.kernel.org/linux-pci/c0c40ddb-bf64-4b22-9dd1-8dbb18aa2813@stanley.mountain
Link: https://lore.kernel.org/linux-pci/20240912053025.25314-1-manivannan.sadhasivam@linaro.org
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: commit log]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2024-09-13 22:12:29 +00:00
Bjorn Helgaas
87f10faf16 PCI: Rename CRS Completion Status to RRS
PCIe r6.0 changed the abbreviation for "Configuration Request Retry Status"
Completion Status from "CRS" to "RRS" and uses the terminology of
"Configuration RRS Software Visibility" instead of "CRS Software
Visibility".

Align the Linux usage with the r6.0 spec language.  No functional change
intended.

It's confusing to make this change, but I think "RRS" *is* a better
abbreviation because it was easy to interpret "CRS" as "Completion Retry
Status", which really didn't make any sense.

Link: https://lore.kernel.org/r/20240827234848.4429-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-09-10 19:52:30 -05:00
Bjorn Helgaas
d591f6804e PCI: Wait for device readiness with Configuration RRS
After a device reset, delays are required before the device can
successfully complete config accesses.  PCIe r6.0, sec 6.6, specifies some
delays required before software can perform config accesses.  Devices that
require more time after those delays may respond to config accesses with
Configuration Request Retry Status (RRS) completions.

Callers of pci_dev_wait() are responsible for delays until the device can
respond to config accesses.  pci_dev_wait() waits any additional time until
the device can successfully complete config accesses.

Reading config space of devices that are not present or not ready typically
returns ~0 (PCI_ERROR_RESPONSE).  Previously we polled the Command register
until we got a value other than ~0.  This is sometimes a problem because
Root Complex handling of RRS completions may include several retries and
implementation-specific behavior that is invisible to software (see sec
2.3.2), so the exponential backoff in pci_dev_wait() may not work as
intended.

Linux enables Configuration RRS Software Visibility on all Root Ports that
support it.  If it is enabled, read the Vendor ID instead of the Command
register.  RRS completions cause immediate return of the 0x0001 reserved
Vendor ID value, so the pci_dev_wait() backoff works correctly.

When a read of Vendor ID eventually completes successfully by returning a
non-0x0001 value (the Vendor ID or 0xffff for VFs), the device should be
initialized and ready to respond to config requests.

For conventional PCI devices or devices below Root Ports that don't support
Configuration RRS Software Visibility, poll the Command register as before.

This was developed independently, but is very similar to Stanislav
Spassov's previous work at
https://lore.kernel.org/linux-pci/20200223122057.6504-1-stanspas@amazon.com

Link: https://lore.kernel.org/r/20240827234848.4429-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Duc Dang <ducdang@google.com>
2024-09-10 19:52:23 -05:00
Maciej W. Rozycki
59100eb248
PCI: Use an error code with PCIe failed link retraining
Given how the call place in pcie_wait_for_link_delay() got structured now,
and that pcie_retrain_link() returns a potentially useful error code,
convert pcie_failed_link_retrain() to return an error code rather than a
boolean status, fixing handling at the call site mentioned.  Update the
other call site accordingly.

Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2408091156530.61955@angie.orcam.me.uk
Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/aa2d1c4e-9961-d54a-00c7-ddf8e858a9b0@linux.intel.com/
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v6.5+
2024-09-09 14:03:36 +00:00
Maciej W. Rozycki
8037ac08c2
PCI: Clear the LBMS bit after a link retrain
The LBMS bit, where implemented, is set by hardware either in response
to the completion of retraining caused by writing 1 to the Retrain Link
bit or whenever hardware has changed the link speed or width in attempt
to correct unreliable link operation.  It is never cleared by hardware
other than by software writing 1 to the bit position in the Link Status
register and we never do such a write.

We currently have two places, namely apply_bad_link_workaround() and
pcie_failed_link_retrain() in drivers/pci/controller/dwc/pcie-tegra194.c
and drivers/pci/quirks.c respectively where we check the state of the LBMS
bit and neither is interested in the state of the bit resulting from the
completion of retraining, both check for a link fault.

And in particular pcie_failed_link_retrain() causes issues consequently, by
trying to retrain a link where there's no downstream device anymore and the
state of 1 in the LBMS bit has been retained from when there was a device
downstream that has since been removed.

Clear the LBMS bit then at the conclusion of pcie_retrain_link(), so that
we have a single place that controls it and that our code can track link
speed or width changes resulting from unreliable link operation.

Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2408091133140.61955@angie.orcam.me.uk
Reported-by: Matthew W Carlis <mattc@purestorage.com>
Link: https://lore.kernel.org/r/20240806000659.30859-1-mattc@purestorage.com/
Link: https://lore.kernel.org/r/20240722193407.23255-1-mattc@purestorage.com/
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: <stable@vger.kernel.org> # v6.5+
2024-09-09 13:42:31 +00:00
Ilpo Järvinen
3e40aa29d4 PCI: Wait for Link before restoring Downstream Buses
__pci_reset_bus() calls pci_bridge_secondary_bus_reset() to perform the
reset and also waits for the Secondary Bus to become again accessible.
__pci_reset_bus() then calls pci_bus_restore_locked() that restores the PCI
devices connected to the bus, and if necessary, recursively restores also
the subordinate buses and their devices.

The logic in pci_bus_restore_locked() does not take into account that after
restoring a device on one level, there might be another Link Downstream
that can only start to come up after restore has been performed for its
Downstream Port device. That is, the Link may require additional wait until
it becomes accessible.

Similarly, pci_slot_restore_locked() lacks wait.

Amend pci_bus_restore_locked() and pci_slot_restore_locked() to wait for
the Secondary Bus before recursively performing the restore of that bus.

Fixes: 090a3c5322e9 ("PCI: Add pci_reset_slot() and pci_reset_bus()")
Link: https://lore.kernel.org/r/20240808121708.2523-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-08-09 15:35:42 -05:00
Philipp Stanner
00f89ae4e7 PCI: Fix devres regression in pci_intx()
pci_intx() becomes managed if pcim_enable_device() has been called in
advance. Commit 25216afc9db5 ("PCI: Add managed pcim_intx()") changed this
behavior so that pci_intx() always leads to creation of a separate device
resource for itself, whereas earlier, a shared resource was used for all
PCI devres operations.

Unfortunately, pci_intx() seems to be used in some drivers' remove() paths;
in the managed case this causes a device resource to be created on driver
detach, which causes .probe() to fail if the driver is reloaded:

  pci 0000:00:1f.2: Resources present before probing

Fix the regression by only redirecting pci_intx() to its managed twin
pcim_intx() if the pci_command changes.

Link: https://lore.kernel.org/r/20240725120729.59788-2-pstanner@redhat.com
Fixes: 25216afc9db5 ("PCI: Add managed pcim_intx()")
Reported-by: Damien Le Moal <dlemoal@kernel.org>
Closes: https://lore.kernel.org/all/b8f4ba97-84fc-4b7e-ba1a-99de2d9f0118@kernel.org/
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[bhelgaas: add error message to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Damien Le Moal <dlemoal@kernel.org>
2024-08-01 12:56:06 -05:00
Linus Torvalds
3f386cb8ee pci-v6.11-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmaahiEUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwypg/+LSzrx0CyyXruwwkjuoMIzqXoEpxV
 SSdJv47E9rnJymQvd0RAeNyc1BPbtRcP1FdEvV/G1ovb8qJSOJgU22PSSiMQsQ0h
 2WGBl1ShubQDDLBdy1AggAsRJhIH4P4tWZ4k5Ftz6WZPWA1UcrDqmjN4d02UIYZb
 A3YYcBEIm6bvrixxy+xq/Ii7S9A2idikabDLLGXOMSliFHx0ehWDNXyQEBONlrDh
 rEHih21rPtOltVEdJl7yF+SIA467HI09NuXfTviHWnJ1hinFoSlEHIhz4j+i+r//
 xOj7iDqtk/UAIToVsxtwgOnElNwY6ab/h/t1AmSSxX4FUEV2TiS1YEpUfX7pByt+
 dytgvepjQyycC/ZHUtRZFZ6+1M0z+Vgb5c3+jXyPh8pQEPqmXt8+KYVIi/wychmJ
 Opo4xniiDoKHSZ4E0bg/wMbe9yVCjTpX0i0S7BbNa/TRjud6vAhXvgx/y092jsdg
 h4lU0ywNCgea/rZFHZYomPjncx9xJ+rtOaH+/dVQhCm/wuRHnj7tJGZnl5LfCWVw
 +yNOcExQaE+lRvKqp6mQvUva3+4UArAL2tnFC00tGd0emRLIvXrxY2lF1sqp9wCZ
 AJu65El4nnpFNU7vJR7x4X31BvcdquFEvfofPxPXbPz09N8hPRhkunKzgd5ftKZS
 mcxMfStvIFXiMEM=
 =vw2i
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Define PCIE_RESET_CONFIG_DEVICE_WAIT_MS for the generic 100ms
     required after reset before config access (Kevin Xie)

   - Define PCIE_T_RRS_READY_MS for the generic 100ms required after
     reset before config access (probably should be unified with
     PCIE_RESET_CONFIG_DEVICE_WAIT_MS) (Damien Le Moal)

  Resource management:

   - Rename find_resource() to find_resource_space() to be more
     descriptive (Ilpo Järvinen)

   - Export find_resource_space() for use by PCI core, which needs to
     learn whether there is available space for a bridge window (Ilpo
     Järvinen)

   - Prevent double counting of resources so window size doesn't grow on
     each remove/rescan cycle (Ilpo Järvinen)

   - Relax bridge window sizing algorithm so a device doesn't break
     simply because it was removed and rescanned (Ilpo Järvinen)

   - Evaluate the ACPI PRESERVE_BOOT_CONFIG _DSM in
     pci_register_host_bridge() (not acpi_pci_root_create()) so we can
     unify it with similar DT functionality (Vidya Sagar)

   - Extend use of DT "linux,pci-probe-only" property so it works
     per-host bridge as well as globally (Vidya Sagar)

   - Unify support for ACPI PRESERVE_BOOT_CONFIG _DSM and the DT
     "linux,pci-probe-only" property in pci_preserve_config() (Vidya
     Sagar)

  Driver binding:

   - Add devres infrastructure for managed request and map of partial
     BAR resources (Philipp Stanner)

   - Deprecate pcim_iomap_table() because uses like
     "pcim_iomap_table()[0]" have no good way to return errors (Philipp
     Stanner)

   - Add an always-managed pcim_request_region() for use instead of
     pci_request_region() and similar, which are sometimes managed
     depending on whether pcim_enable_device() has been called
     previously (Philipp Stanner)

   - Reimplement pcim_set_mwi() so it doesn't need to keep store MWI
     state (Philipp Stanner)

   - Add pcim_intx() for use instead of pci_intx(), which is sometimes
     managed depending on whether pcim_enable_device() has been called
     previously (Philipp Stanner)

   - Add managed pcim_iomap_range() to allow mapping of a partial BAR
     (Philipp Stanner)

   - Fix a devres mapping leak in drm/vboxvideo (Philipp Stanner)

  Error handling:

   - Add missing bridge locking in device reset path and add a warning
     for other possible lock issues (Dan Williams)

   - Fix use-after-free on concurrent DPC and hot-removal (Lukas Wunner)

  Power management:

   - Disable AER and DPC during suspend to avoid spurious wakeups if
     they share an interrupt with PME (Kai-Heng Feng)

  PCIe native device hotplug:

   - Detect if a device was removed or replaced during system sleep so
     we don't assume a new device is the one that used to be there
     (Lukas Wunner)

  Virtualization:

   - Add an ACS quirk for Broadcom BCM5760X multi-function NIC; it
     prevents transactions between functions even though it doesn't
     advertise ACS, so the functions can be attached individually via
     VFIO (Ajit Khaparde)

  Peer-to-peer DMA:

   - Add a "pci=config_acs=" kernel command-line parameter to relax
     default ACS settings to enable additional peer-to-peer
     configurations. Requires expert knowledge of topology and ACS
     operation (Vidya Sagar)

  Endpoint framework:

   - Remove unused struct pci_epf_group.type_group (Christophe JAILLET)

   - Fix error handling in vpci_scan_bus() and epf_ntb_epc_cleanup()
     (Dan Carpenter)

   - Make struct pci_epc_class constant (Greg Kroah-Hartman)

   - Remove unused pci_endpoint_test_bar_{readl,writel} functions
     (Jiapeng Chong)

   - Rename "BME" to "Bus Master Enable" (Manivannan Sadhasivam)

   - Rename struct pci_epc_event_ops.core_init() callback to epc_init()
     (Manivannan Sadhasivam)

   - Move DMA init to MHI .epc_init() callback for uniformity
     (Manivannan Sadhasivam)

   - Cancel EPF test delayed work when link goes down (Manivannan
     Sadhasivam)

   - Add struct pci_epc_event_ops.epc_deinit() callback for cleanup
     needed on fundamental reset (Manivannan Sadhasivam)

   - Add 64KB alignment to endpoint test to support Rockchip rk3588
     (Niklas Cassel)

   - Optimize endpoint test by using memcpy() instead of readl() (Niklas
     Cassel)

  Device tree bindings:

   - Add generic "ats-supported" property to advertise that a PCIe Root
     Complex supports ATS (Jean-Philippe Brucker)

  Amazon Annapurna Labs PCIe controller driver:

   - Validate IORESOURCE_BUS presence to avoid NULL pointer dereference
     (Aleksandr Mishin)

  Axis ARTPEC-6 PCIe controller driver:

   - Rename .cpu_addr_fixup() parameter to reflect that it is a PCI
     address, not a CPU address (Niklas Cassel)

  Freescale i.MX6 PCIe controller driver:

   - Convert to agnostic GPIO API (Andy Shevchenko)

  Freescale Layerscape PCIe controller driver:

   - Make struct mobiveil_rp_ops constant (Christophe JAILLET)

   - Use new generic dw_pcie_ep_linkdown() to handle link-down events
     (Manivannan Sadhasivam)

  HiSilicon Kirin PCIe controller driver:

   - Convert to agnostic GPIO API (Andy Shevchenko)

   - Use _scoped() iterator for OF children to ensure refcounts are
     decremented at loop exit (Javier Carrasco)

  Intel VMD host bridge driver:

   - Create sysfs "domain" symlink before downstream devices are exposed
     to userspace by pci_bus_add_devices() (Jiwei Sun)

  Loongson PCIe controller driver:

   - Enable MSI when LS7A is used with new CPUs that have integrated
     PCIe Root Complex, e.g., Loongson-3C6000, so downstream devices can
     use MSI (Huacai Chen)

  Microchip AXI PolarFlare PCIe controller driver:

   - Move pcie-microchip-host.c to a new PLDA directory (Minda Chen)

   - Factor PLDA generic items out to a common
     plda,xpressrich3-axi-common.yaml binding (Minda Chen)

   - Factor PLDA generic data structures and code out to shared
     pcie-plda.h, pcie-plda-host.c (Minda Chen)

   - Add PLDA generic interrupt handling with a .request_event_irq()
     callback for vendor-specific events (Minda Chen)

   - Add PLDA generic host init/deinit and map bus functions for use by
     vendor-specific drivers (Minda Chen)

   - Rework to use PLDA core (Minda Chen)

  Microsoft Hyper-V host bridge driver:

   - Return zero, not garbage, when reading PCI_INTERRUPT_PIN (Wei Liu)

  NVIDIA Tegra194 PCIe controller driver:

   - Remove unused struct tegra_pcie_soc (Dr. David Alan Gilbert)

   - Set 64KB inbound ATU alignment restriction (Jon Hunter)

  Qualcomm PCIe controller driver:

   - Make the MHI reg region mandatory for X1E80100, since all PCIe
     controllers have it (Abel Vesa)

   - Prevent use of uninitialized data and possible error pointer
     dereference (Dan Carpenter)

   - Return error, not success, if dev_pm_opp_find_freq_floor() fails
     (Dan Carpenter)

   - Add Operating Performance Points (OPP) support to scale performance
     state based on aggregate link bandwidth to improve SoC power
     efficiency (Krishna chaitanya chundru)

   - Vote for the CPU-PCIe ICC (interconnect) path to ensure it stays
     active even if other drivers don't vote for it (Krishna chaitanya
     chundru)

   - Use devm_clk_bulk_get_all() to get all the clocks from DT to avoid
     writing out all the clock names (Manivannan Sadhasivam)

   - Add DT binding and driver support for the SA8775P SoC (Mrinmay
     Sarkar)

   - Add HDMA support for the SA8775P SoC (Mrinmay Sarkar)

   - Override the SA8775P NO_SNOOP default to avoid possible memory
     corruption (Mrinmay Sarkar)

   - Make sure resources are disabled during PERST# assertion, even if
     the link is already disabled (Manivannan Sadhasivam)

   - Use new generic dw_pcie_ep_linkdown() to handle link-down events
     (Manivannan Sadhasivam)

   - Add DT and endpoint driver support for the SA8775P SoC (Mrinmay
     Sarkar)

   - Add Hyper DMA (HDMA) support for the SA8775P SoC and enable it in
     the EPF MHI driver (Mrinmay Sarkar)

   - Set PCIE_PARF_NO_SNOOP_OVERIDE to override the default NO_SNOOP
     attribute on the SA8775P SoC (both Root Complex and Endpoint mode)
     to avoid possible memory corruption (Mrinmay Sarkar)

  Renesas R-Car PCIe controller driver:

   - Demote WARN() to dev_warn_ratelimited() in rcar_pcie_wakeup() to
     avoid unnecessary backtrace (Marek Vasut)

   - Add DT and driver support for R-Car V4H (R8A779G0) host and
     endpoint. This requires separate proprietary firmware (Yoshihiro
     Shimoda)

  Rockchip PCIe controller driver:

   - Assert PERST# for 100ms after power is stable (Damien Le Moal)

   - Wait PCIE_T_RRS_READY_MS (100ms) after reset before starting
     configuration (Damien Le Moal)

   - Use GPIOD_OUT_LOW flag while requesting ep_gpio to fix a firmware
     crash on Qcom-based modems with Rockpro64 board (Manivannan
     Sadhasivam)

  Rockchip DesignWare PCIe controller driver:

   - Factor common parts of rockchip-dw-pcie DT binding to be shared by
     Root Complex and Endpoint mode (Niklas Cassel)

   - Add missing INTx signals to common DT binding (Niklas Cassel)

   - Add eDMA items to DT binding for Endpoint controller (Niklas
     Cassel)

   - Fix initial dw-rockchip PERST# GPIO value to prevent unnecessary
     short assert/deassert that causes issues with some WLAN controllers
     (Niklas Cassel)

   - Refactor dw-rockchip and add support for Endpoint mode (Niklas
     Cassel)

   - Call pci_epc_init_notify() and drop dw_pcie_ep_init_notify()
     wrapper (Niklas Cassel)

   - Add error messages in .probe() error paths to improve user
     experience (Uwe Kleine-König)

  Samsung Exynos PCIe controller driver:

   - Use bulk clock APIs to simplify clock setup (Shradha Todi)

  StarFive PCIe controller driver:

   - Add DT binding and driver support for the StarFive JH7110
     PLDA-based PCIe controller (Minda Chen)

  Synopsys DesignWare PCIe controller driver:

   - Add generic support for sending PME_Turn_Off when system suspends
     (Frank Li)

   - Fix incorrect interpretation of iATU slot 0 after PERST#
     assert/deassert (Frank Li)

   - Use msleep() instead of usleep_range() while waiting for link
     (Konrad Dybcio)

   - Refactor dw_pcie_edma_find_chip() to enable adding support for
     Hyper DMA (HDMA) (Manivannan Sadhasivam)

   - Enable drivers to supply the eDMA channel count since some can't
     auto detect this (Manivannan Sadhasivam)

   - Call pci_epc_init_notify() and drop dw_pcie_ep_init_notify()
     wrapper (Manivannan Sadhasivam)

   - Pass the eDMA mapping format directly from drivers instead of
     maintaining a capability for it (Manivannan Sadhasivam)

   - Add generic dw_pcie_ep_linkdown() to notify EPF drivers about
     link-down events and restore non-sticky DWC registers lost on link
     down (Manivannan Sadhasivam)

   - Add vendor-specific "apb" reg name, interrupt names, INTx names to
     generic binding (Niklas Cassel)

   - Enforce DWC restriction that 64-bit BARs must start with an
     even-numbered BAR (Niklas Cassel)

   - Consolidate args of dw_pcie_prog_outbound_atu() into a structure
     (Yoshihiro Shimoda)

   - Add support for endpoints to send Message TLPs, e.g., for INTx
     emulation (Yoshihiro Shimoda)

  TI DRA7xx PCIe controller driver:

   - Rename .cpu_addr_fixup() parameter to reflect that it is a PCI
     address, not a CPU address (Niklas Cassel)

  TI Keystone PCIe controller driver:

   - Validate IORESOURCE_BUS presence to avoid NULL pointer dereference
     (Aleksandr Mishin)

   - Work around AM65x/DRA80xM Errata #i2037 that corrupts TLPs and
     causes processor hangs by limiting Max_Read_Request_Size (MRRS) and
     Max_Payload_Size (MPS) (Kishon Vijay Abraham I)

   - Leave BAR 0 disabled for AM654x to fix a regression caused by
     6ab15b5e7057 ("PCI: dwc: keystone: Convert .scan_bus() callback to
     use add_bus"), which caused a 45-second boot delay (Siddharth
     Vadapalli)

  Xilinx Versal CPM PCIe controller driver:

   - Fix overlapping bridge registers and 32-bit BAR addresses in DT
     binding (Thippeswamy Havalige)

  MicroSemi Switchtec management driver:

   - Make struct switchtec_class constant (Greg Kroah-Hartman)

  Miscellaneous:

   - Remove unused struct acpi_handle_node (Dr. David Alan Gilbert)

   - Add missing MODULE_DESCRIPTION() macros (Jeff Johnson)"

* tag 'pci-v6.11-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (154 commits)
  PCI: loongson: Enable MSI in LS7A Root Complex
  PCI: Extend ACS configurability
  PCI: Add missing bridge lock to pci_bus_lock()
  drm/vboxvideo: fix mapping leaks
  PCI: Add managed pcim_iomap_range()
  PCI: Remove legacy pcim_release()
  PCI: Add managed pcim_intx()
  PCI: vmd: Create domain symlink before pci_bus_add_devices()
  PCI: qcom: Prevent use of uninitialized data in qcom_pcie_suspend_noirq()
  PCI: qcom: Prevent potential error pointer dereference
  PCI: qcom: Fix missing error code in qcom_pcie_probe()
  PCI: Give pcim_set_mwi() its own devres cleanup callback
  PCI: Move struct pci_devres.pinned bit to struct pci_dev
  PCI: Remove struct pci_devres.enabled status bit
  PCI: Document hybrid devres hazards
  PCI: Add managed pcim_request_region()
  PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all()
  PCI: Add managed partial-BAR request and map infrastructure
  PCI: Add devres helpers for iomap table
  PCI: Add and use devres helper for bit masks
  ...
2024-07-19 19:03:18 -07:00
Bjorn Helgaas
df5dd33728 Merge branch 'pci/controller/qcom'
- Use devm_clk_bulk_get_all() to get all the clocks from DT to avoid
  writing out all the clock names (Manivannan Sadhasivam)

- Add DT binding and driver support for the SA8775P SoC (Mrinmay Sarkar)

- Refactor dw_pcie_edma_find_chip() to enable adding support for Hyper DMA
  (HDMA) (Manivannan Sadhasivam)

- Enable drivers to supply the eDMA channel count since some can't auto
  detect this (Manivannan Sadhasivam)

- Add HDMA support for the SA8775P SoC (Mrinmay Sarkar)

- Override the SA8775P NO_SNOOP default to avoid possible memory corruption
  (Mrinmay Sarkar)

- Make sure resources are disabled during PERST# assertion, even if the
  link is already disabled (Manivannan Sadhasivam)

- Vote for the CPU-PCIe ICC (interconnect) path to ensure it stays active
  even if other drivers don't vote for it (Krishna chaitanya chundru)

- Add Operating Performance Points (OPP) to scale performance state based
  on aggregate link bandwidth to improve SoC power efficiency (Krishna
  chaitanya chundru)

- Return failure instead of success if dev_pm_opp_find_freq_floor() fails
  (Dan Carpenter)

- Avoid an error pointer dereference if dev_pm_opp_find_freq_exact() fails
  (Dan Carpenter)

- Prevent use of uninitialized data in qcom_pcie_suspend_noirq() (Dan
  Carpenter)

* pci/controller/qcom:
  PCI: qcom: Prevent use of uninitialized data in qcom_pcie_suspend_noirq()
  PCI: qcom: Prevent potential error pointer dereference
  PCI: qcom: Fix missing error code in qcom_pcie_probe()
  PCI: qcom: Add OPP support to scale performance
  PCI: Bring the PCIe speed to MBps logic to new pcie_dev_speed_mbps()
  PCI: qcom: Add ICC bandwidth vote for CPU to PCIe path
  PCI: qcom-ep: Disable resources unconditionally during PERST# assert
  PCI: qcom-ep: Override NO_SNOOP attribute for SA8775P EP
  PCI: qcom: Override NO_SNOOP attribute for SA8775P RC
  PCI: epf-mhi: Enable HDMA for SA8775P SoC
  PCI: qcom-ep: Add HDMA support for SA8775P SoC
  PCI: dwc: Pass the eDMA mapping format flag directly from glue drivers
  PCI: dwc: Skip finding eDMA channels count for HDMA platforms
  PCI: dwc: Refactor dw_pcie_edma_find_chip() API
  PCI: qcom-ep: Add support for SA8775P SOC
  dt-bindings: PCI: qcom-ep: Add support for SA8775P SoC
  PCI: qcom: Use devm_clk_bulk_get_all() API
2024-07-19 10:10:31 -05:00
Bjorn Helgaas
62281339e3 Merge branch 'pci/reset'
- Warn about doing a Secondary Bus Reset without holding the device lock
  (Dan Williams)

- Lock bridge in addition to downstream hierarchy before doing a Secondary
  Bus Reset (Dan Williams)

* pci/reset:
  PCI: Add missing bridge lock to pci_bus_lock()
  PCI: Warn on missing cfg_access_lock during secondary bus reset
2024-07-19 10:10:23 -05:00
Bjorn Helgaas
147ea50e1e Merge branch 'pci/dpc'
- If there's a device below a bridge, prevent a use-after-free by holding a
  reference to the device while waiting for the secondary bus to be ready
  in case the device is concurrently removed, e.g., by DPC (Lukas Wunner)

* pci/dpc:
  PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
2024-07-19 10:10:22 -05:00
Bjorn Helgaas
06bbe25c21 Merge branch 'pci/devres'
- Add pcim_add_mapping_to_legacy_table() and
  pcim_remove_mapping_from_legacy_table() helper functions to simplify
  devres iomap table (Philipp Stanner)

- Reimplement devres that take a bit mask of BARs in a way that can be used
  to map partial BARs as well as entire BARs (Philipp Stanner)

- Deprecate pcim_iomap_table() and pcim_iomap_regions_request_all() in
  favor of pcim_* request plus pcim_* mapping (Philipp Stanner)

- Add pcim_request_region(), a managed interface to request a single BAR
  (Philipp Stanner)

- Use the existing pci_is_enabled() interface to replace the struct
  devres.enabled bit (Philipp Stanner)

- Move the struct pci_devres.pinned bit to struct pci_dev (Philipp Stanner)

- Reimplement pcim_set_mwi() so it uses its own devres cleanup callback
  instead of a special-purpose bit in struct pci_devres (Philipp Stanner)

- Add pcim_intx(), which is unambiguously managed, unlike pci_intx(), which
  is managed if pcim_enable_device() has been called but unmanaged
  otherwise (Philipp Stanner)

- Remove pcim_release(), which is no longer needed after previous cleanups
  of pcim_set_mwi() and pci_intx() (Philipp Stanner)

- Add pcim_iomap_range(), a managed interface to map part of a BAR (Philipp
  Stanner)

- Fix vboxvideo leak by using the new pcim_iomap_range() instead of the
  unmanaged pci_iomap_range() (Philipp Stanner)

* pci/devres:
  drm/vboxvideo: fix mapping leaks
  PCI: Add managed pcim_iomap_range()
  PCI: Remove legacy pcim_release()
  PCI: Add managed pcim_intx()
  PCI: Give pcim_set_mwi() its own devres cleanup callback
  PCI: Move struct pci_devres.pinned bit to struct pci_dev
  PCI: Remove struct pci_devres.enabled status bit
  PCI: Document hybrid devres hazards
  PCI: Add managed pcim_request_region()
  PCI: Deprecate pcim_iomap_table(), pcim_iomap_regions_request_all()
  PCI: Add managed partial-BAR request and map infrastructure
  PCI: Add devres helpers for iomap table
  PCI: Add and use devres helper for bit masks
2024-07-19 10:10:21 -05:00
Vidya Sagar
47c8846a49 PCI: Extend ACS configurability
PCIe ACS settings control the level of isolation and the possible P2P paths
between devices. With greater isolation the kernel will create smaller
iommu_groups and with less isolation there is more HW that can achieve P2P
transfers. From a virtualization perspective all devices in the same
iommu_group must be assigned to the same VM as they lack security
isolation.

There is no way for the kernel to automatically know the correct ACS
settings for any given system and workload. Existing command line options
(e.g., disable_acs_redir) allow only for large scale change, disabling all
isolation, but this is not sufficient for more complex cases.

Add a kernel command-line option 'config_acs' to directly control all the
ACS bits for specific devices, which allows the operator to setup the right
level of isolation to achieve the desired P2P configuration.  The
definition is future proof; when new ACS bits are added to the spec the
open syntax can be extended.

ACS needs to be setup early in the kernel boot as the ACS settings affect
how iommu_groups are formed. iommu_group formation is a one time event
during initial device discovery, so changing ACS bits after kernel boot can
result in an inaccurate view of the iommu_groups compared to the current
isolation configuration.

ACS applies to PCIe Downstream Ports and multi-function devices.  The
default ACS settings are strict and deny any direct traffic between two
functions. This results in the smallest iommu_group the HW can support.
Frequently these values result in slow or non-working P2PDMA.

ACS offers a range of security choices controlling how traffic is
allowed to go directly between two devices. Some popular choices:

  - Full prevention

  - Translated requests can be direct, with various options

  - Asymmetric direct traffic, A can reach B but not the reverse

  - All traffic can be direct

Along with some other less common ones for special topologies.

The intention is that this option would be used with expert knowledge of
the HW capability and workload to achieve the desired configuration.

Link: https://lore.kernel.org/r/20240625153150.159310-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[bhelgaas: add example, tidy printk formats]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-12 16:51:46 -05:00
Dan Williams
a4e772898f PCI: Add missing bridge lock to pci_bus_lock()
One of the true positives that the cfg_access_lock lockdep effort
identified is this sequence:

  WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70
  RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70
  Call Trace:
   <TASK>
   ? __warn+0x8c/0x190
   ? pci_bridge_secondary_bus_reset+0x5d/0x70
   ? report_bug+0x1f8/0x200
   ? handle_bug+0x3c/0x70
   ? exc_invalid_op+0x18/0x70
   ? asm_exc_invalid_op+0x1a/0x20
   ? pci_bridge_secondary_bus_reset+0x5d/0x70
   pci_reset_bus+0x1d8/0x270
   vmd_probe+0x778/0xa10
   pci_device_probe+0x95/0x120

Where pci_reset_bus() users are triggering unlocked secondary bus resets.
Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses
pci_bus_lock() before issuing the reset which locks everything *but* the
bridge itself.

For the same motivation as adding:

  bridge = pci_upstream_bridge(dev);
  if (bridge)
    pci_dev_lock(bridge);

to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add
pci_dev_lock() for @bus->self to pci_bus_lock().

Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dwillia2-xfh.jf.intel.com
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com.notmuch
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
[bhelgaas: squash in recursive locking deadlock fix from Keith Busch:
https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
2024-07-12 16:22:49 -05:00
Philipp Stanner
25216afc9d PCI: Add managed pcim_intx()
pci_intx() is a "hybrid" function, i.e., it is managed if
pcim_enable_device() has been called, but unmanaged otherwise.

Add pcim_intx(), which is always managed, and implement pci_intx() using
it.

Remove the now-unused struct pci_devres.orig_intx and .restore_intx and
find_pci_dr().

Link: https://lore.kernel.org/r/20240613115032.29098-11-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
[kwilczynski: squashed in
https://lore.kernel.org/r/426645d40776198e0fcc942f4a6cac4433c7a9aa.camel@redhat.com
to fix problem reported and tested by Ashish Kalra <Ashish.Kalra@amd.com>:
https://lore.kernel.org/r/20240708214656.4721-1-Ashish.Kalra@amd.com
https://lore.kernel.org/r/8c4634e9-4f02-4c54-9c89-d75e2f4bf026@amd.com/]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-11 16:20:01 -05:00
Philipp Stanner
77f79ac8de
PCI: Remove struct pci_devres.enabled status bit
The struct pci_devres has a separate boolean to track whether a device is
enabled. That, however, can easily be tracked in an agnostic manner through
the function pci_is_enabled().

Using it allows for simplifying the PCI devres implementation.

Replace the separate 'enabled' status bit from struct pci_devres with
calls to pci_is_enabled() at the appropriate places.

Link: https://lore.kernel.org/r/20240613115032.29098-8-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10 04:20:06 +00:00
Philipp Stanner
81fcf28e74
PCI: Document hybrid devres hazards
These functions:

  pci_request_region()
  pci_request_regions()
  pci_request_regions_exclusive()
  pci_request_selected_regions()
  pci_request_selected_regions_exclusive()
  pci_intx()

are "hybrid" functions that are managed if pcim_enable_device() has been
called, but unmanaged otherwise.

This is confusing and has already caused a bug (in 8558de401b5f
("drm/vboxvideo: use managed pci functions")) because users believe all PCI
functions, such as pci_iomap_range(), can become managed that way, which is
not the case.

Add comments to the relevant functions' docstrings that warn users about
this behavior.

Link: https://lore.kernel.org/r/20240613115032.29098-7-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10 04:20:01 +00:00
Philipp Stanner
d47bde7080
PCI: Add managed pcim_request_region()
These existing functions:

  pci_request_region()
  pci_request_selected_regions()
  pci_request_selected_regions_exclusive()

are "hybrid" functions built on __pci_request_region() and are managed if
pcim_enable_device() has been called, but unmanaged otherwise.

Add these new functions:

  pcim_request_region()
  pcim_request_region_exclusive()

These are *always* managed and use the new pcim_addr_devres tracking
infrastructure instead of find_pci_dr() and struct pci_devres.region_mask.

Implement the hybrid functions using the new "pure" functions and remove
struct pci_devres.region_mask, which is no longer needed.

Link: https://lore.kernel.org/r/20240613115032.29098-6-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10 04:19:56 +00:00
Philipp Stanner
bbaff68bf4
PCI: Add managed partial-BAR request and map infrastructure
The pcim_iomap_devres table tracks entire-BAR mappings, so we can't use it
to build a managed version of pci_iomap_range(), which maps partial BARs.

Add struct pcim_addr_devres, which can track request and mapping of both
entire BARs and partial BARs.

Add the following internal devres functions based on struct
pcim_addr_devres:

  pcim_iomap_region()               # request & map entire BAR
  pcim_iounmap_region()             # unmap & release entire BAR
  pcim_request_region()             # request entire BAR
  pcim_release_region()             # release entire BAR
  pcim_request_all_regions()        # request all entire BARs
  pcim_release_all_regions()        # release all entire BARs

Rework the following public interfaces using the new infrastructure
listed above:

  pcim_iomap()                      # map partial BAR
  pcim_iounmap()                    # unmap partial BAR
  pcim_iomap_regions()              # request & map specified BARs
  pcim_iomap_regions_request_all()  # request all BARs, map specified BARs
  pcim_iounmap_regions()            # unmap & release specified BARs

Link: https://lore.kernel.org/r/20240613115032.29098-4-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-07-10 04:19:35 +00:00
Krishna chaitanya chundru
100ae5d77f PCI: Bring the PCIe speed to MBps logic to new pcie_dev_speed_mbps()
Bring the switch case in pcie_link_speed_mbps() to new function to
the header file so that it can be used in other places like
in controller driver.

Link: https://lore.kernel.org/linux-pci/20240619-opp_support-v15-3-aa769a2173a3@quicinc.com
Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-07-09 16:56:36 -05:00
Lukas Wunner
11a1f4bc47
PCI/DPC: Fix use-after-free on concurrent DPC and hot-removal
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:

The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred.  To do so, it polls the
config space of the first child device on the secondary bus.  If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.

That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device.  Before v6.3, the function was only
called on resume from system sleep or on runtime resume.  Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently.  (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase.  And runtime resume is
always awaited before a PCI device is removed.)

However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event.  Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently.  The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.

Add the missing reference acquisition.

Abridged stack trace:

  BUG: unable to handle page fault for address: 00000000091400c0
  CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
  RIP: pci_bus_read_config_dword+0x17/0x50
  pci_dev_wait()
  pci_bridge_wait_for_secondary_bus()
  dpc_reset_link()
  pcie_do_recovery()
  dpc_handler()

Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.de
Reported-by: Keith Busch <kbusch@kernel.org>
Tested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org # v5.10+
2024-07-01 08:28:29 +00:00
Dan Williams
920f646892 PCI: Warn on missing cfg_access_lock during secondary bus reset
The recent adventure with adding lockdep tracking for cfg_access_lock,
while it yielded many false positives [1], did catch a true positive in the
pci_reset_bus() path [2].

So, while lockdep is difficult to deploy, open coding a check that
cfg_access_lock is held during the reset is feasible.

While this does not offer a full backtrace, it should be sufficient to
implicate the caller of pci_bridge_secondary_bus_reset() as a path that
needs investigation.

Link: https://lore.kernel.org/r/171711746953.1628941.4692125082286867825.stgit@dwillia2-xfh.jf.intel.com
Link: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html [1]
Link: http://lore.kernel.org/r/cfb50601-5d2a-4676-a958-1bd3f1b06654@intel.com [2]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
2024-06-04 12:10:29 -05:00
Dan Williams
c9d52fb313 PCI: Revert the cfg_access_lock lockdep mechanism
While the experiment did reveal that there are additional places that are
missing the lock during secondary bus reset, one of the places that needs
to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep
annotation.

Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is
currently dependent on the fact that the device_lock() is marked
lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that
annotation, pci_bus_lock() would need to use something like a new
pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the
topology, and a hope that the depth of a PCI tree never exceeds the max
value for a lockdep subclass.

The alternative to ripping out the lockdep coverage would be to deploy a
dynamic lock key for every PCI device. Unfortunately, there is evidence
that increasing the number of keys that lockdep needs to track to be
per-PCI-device is prohibitively expensive for something like the
cfg_access_lock.

The main motivation for adding the annotation in the first place was to
catch unlocked secondary bus resets, not necessarily catch lock ordering
problems between cfg_access_lock and other locks. Solve that narrower
problem with follow-on patches, and just due to targeted revert for now.

Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com
Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
2024-06-04 12:10:05 -05:00
Bjorn Helgaas
7ecf13fd35 Merge branch 'pci/misc'
- Constify pcibus_class (Heiner Kallweit)

- Annotate pci_cache_line_size variables as __ro_after_init (Heiner
  Kallweit)

- Clean up formatting of PCI accessor macros (Ilpo Järvinen)

- Remove some OLPC dead code (Kunwu Chan)

- Make pcie_bandwidth_capable() static (Ilpo Järvinen)

* pci/misc:
  PCI: Make pcie_bandwidth_capable() static
  x86/pci: Remove OLPC dead code
  PCI: Clean up accessor macro formatting
  PCI/ERR: Cleanup misleading indentation inside if conditions
  PCI: Annotate pci_cache_line_size variables as __ro_after_init
  PCI: Constify pcibus_class
2024-05-16 18:14:14 -05:00
Bjorn Helgaas
12ff1ef539 Merge branch 'pci/pm'
- Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports because we can't get
  them back out of D3cold (Mario Limonciello)

* pci/pm:
  PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
2024-05-16 18:14:11 -05:00
Bjorn Helgaas
ce4a9f1b1c Merge branch 'pci/enumeration'
- Clear bridge Secondary Status errors after enumeration since enumeration
  causes many errors (Vidya Sagar)

- Wait for Link Training==0 before starting Link retrain to avoid a race;
  this was done previously but broken by a faulty merge (Ilpo Järvinen)

- Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific about what
  "LEGACY" means (Damien Le Moal)

- Update return types of pci_find_capability() stubs to match the extern
  declarations for the actual implementations (Bjorn Helgaas)

- Drop unnecessary pci_enable_device_io() from pata_cs5520 (Heiner
  Kallweit)

- Drop unused pci_enable_device_io() (Heiner Kallweit)

- On 2016 and newer BIOSes, skip early E820 check for ECAM regions
  described in ACPI MCFG; there's no spec requirement for E820
  reservations, and some machines don't provide them (Bjorn Helgaas)

- If devices were disconnected while suspended, don't wait for them when
  resuming (Ilpo Järvinen)

* pci/enumeration:
  PCI: Do not wait for disconnected devices when resuming
  x86/pci: Skip early E820 check for ECAM region
  PCI: Remove unused pci_enable_device_io()
  ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
  PCI: Update pci_find_capability() stub return types
  PCI: Remove PCI_IRQ_LEGACY
  scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
  scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw88: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: ath10k: Refer to INTX instead of LEGACY
  net: wangxun: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  r8169: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: alx: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: atlantic: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: amd-xgbe: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  VMCI: Use PCI_IRQ_ALL_TYPES to remove PCI_IRQ_LEGACY use
  RDMA/vmw_pvrdma: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  IB/qib: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  drm/amdgpu: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  mfd: intel-lpss: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  ntb: idt: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  platform/x86: intel_ips: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  tty: 8250_pci: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  usb: hcd-pci: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  ASoC: Intel: avs: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  Documentation: PCI: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI/portdrv: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI/MSI: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI: Clarify intent of LT wait
  PCI: Wait for Link Training==0 before starting Link retrain
  PCI: Clear Secondary Status errors after enumeration
2024-05-16 18:14:10 -05:00
Ilpo Järvinen
6613443ffc PCI: Do not wait for disconnected devices when resuming
On runtime resume, pci_dev_wait() is called:

  pci_pm_runtime_resume()
    pci_pm_bridge_power_up_actions()
      pci_bridge_wait_for_secondary_bus()
        pci_dev_wait()

While a device is runtime suspended along with its PCI hierarchy, the
device could get disconnected. In such case, the link will not come up no
matter how long pci_dev_wait() waits for it.

Besides the above mentioned case, there could be other ways to get the
device disconnected while pci_dev_wait() is waiting for the link to come
up.

Make pci_dev_wait() exit if the device is already disconnected to avoid
unnecessary delay.

The use cases of pci_dev_wait() boil down to two:

  1. Waiting for the device after reset
  2. pci_bridge_wait_for_secondary_bus()

The callers in both cases seem to benefit from propagating the
disconnection as error even if device disconnection would be more
analoguous to the case where there is no device in the first place which
return 0 from pci_dev_wait(). In the case 2, it results in unnecessary
marking of the devices disconnected again but that is just harmless extra
work.

Also make sure compiler does not become too clever with dev->error_state
and use READ_ONCE() to force a fetch for the up-to-date value.

Link: https://lore.kernel.org/r/20240208132322.4811-1-ilpo.jarvinen@linux.intel.com
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-16 14:35:09 -05:00
Heiner Kallweit
844177a807 PCI: Remove unused pci_enable_device_io()
After the last user was removed, remove this PCI core function.  It's very
unlikely that we'll see a new device requiring io space access, even though
memory space access is supported.

Link: https://lore.kernel.org/r/213ebf62-53a3-42b7-8518-ecd5cd6d6b08@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
2024-05-16 14:35:08 -05:00
Ilpo Järvinen
fe4a83ec07 PCI: Make pcie_bandwidth_capable() static
pcie_bandwidth_capable() is only used within pci.c, make it static.

Link: https://lore.kernel.org/r/20240507121758.13849-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-08 19:03:55 -05:00
Dave Jiang
53c49b6e6d PCI/CXL: Add 'cxl_bus' reset method for devices below CXL Ports
By default Secondary Bus Reset (SBR) is masked for CXL Ports (see CXL r3.1,
sec 8.1.5.2).

Add cxl_reset_bus_function() (method "cxl_bus") to set the "Unmask SBR" bit
in the upstream CXL Port before performing the bus reset and restore the
original value afterwards.

This method allows the user to perform a bus reset on a CXL device without
needing to set the "Unmask SBR" bit via a user tool.

Link: https://lore.kernel.org/r/20240502165851.1948523-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: simplify commit log, invert condition to avoid negation]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08 13:25:43 -05:00
Dave Jiang
b1956e2d07 PCI/CXL: Fail bus reset if upstream CXL Port has SBR masked
Per CXL spec r3.1, sec 8.1.5.2, the Secondary Bus Reset (SBR) bit in the
Bridge Control register of a CXL port has no effect unless the "Unmask SBR"
bit is set.

Return -ENOTTY if we attempt a bus reset on a device below a CXL Port where
"Unmask SBR" is 0.  Otherwise, the bus reset would appear to have succeeded
even though setting the bridge SBR bit had no effect.

Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
Link: https://lore.kernel.org/r/20240502165851.1948523-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: simplify commit log and comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08 13:25:36 -05:00