Commit Graph

1308214 Commits

Author SHA1 Message Date
Linus Torvalds
7116747a68 soundwire updates for 6.12
- bus cleanup for warnings and probe deferral errors suppression
  - cadence recheck for status with a delayed work
  - intel interrupt rework on reset exit
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmbv9W8ACgkQfBQHDyUj
 g0cZqxAAjsXmwjky3zGs8vuC8hFfTJ4nI9NGW0pPKcYlTLDCGfN49/caL0X+BYvE
 12x+tHyKa3H29as5CDkS/NkmtBGI27V3wcdOTQEv9OXvv1G/Kc8sfB5wJZS2deec
 41EvSu/If1qqQaNXoooI3s63S23SE22vxE6RtN7UNRKlgXs/5qjIVPhdYSUcXiCl
 gl7wG/NgyfMwH17LCY2W1l2/BSP7XKRBny+cT8dzNR1muaZn8Noi3RiIzI+eEWVv
 ldHHkc5yTrCYVsi/xW9oDmUJJ4IKZU1TBGAbtOe/EtaJuMyjOCVClfTG96WQrpFl
 0gDS7qlGjD0tf8RVn4QLVV68ZVXI6EYPUrv2+Z9f8swf21RLKEjx5aflxktzEwPY
 SDPvRhpXa5FjmHt7yhVMLXV1u4CyNWepeEcbmZgsll8zrAXwIcltH8NDicURdzqK
 UHzQFG3LPHrlnC5zBsFpRXFoC8VDa//1kr6vEsXvnY1ehMRUa6+Hvy+1+7q5Syyj
 ffmhzlw1dJcjnQC1l5xYejPzXi0kw/eAsr0eQ+aumVjDm/2BV/UJbnQ4Pj+l4lAU
 b9+K6R+ZxC+XZYE4CZtlT/Jtx2Q8Ncp41XA3ErZSUpEJgFzpWCd+C1u/gYzghoUx
 X0JaX4XzrUtlCvBw8px/NirG8jy36NqvKnoNLkFS/PUMQ74dcmg=
 =Xwb2
 -----END PGP SIGNATURE-----

Merge tag 'soundwire-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:

 - bus cleanup for warnings and probe deferral errors suppression

 - cadence recheck for status with a delayed work

 - intel interrupt rework on reset exit

* tag 'soundwire-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: intel_bus_common: enable interrupts before exiting reset
  soundwire: cadence: re-check Peripheral status with delayed_work
  soundwire: bus: clean up probe warnings
  soundwire: bus: drop unused driver name field
  soundwire: bus: suppress probe deferral errors
2024-09-23 14:00:46 -07:00
Linus Torvalds
f34c512521 linux-watchdog 6.12-rc1 tag
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.14 (GNU/Linux)
 
 iEYEABECAAYFAmbv6AcACgkQ+iyteGJfRsqKdgCgr3VyhlkM+ryZtxnYl02midaC
 Yp4An2baWN7fzdJWgj11coglnM/mpGvG
 =J3w7
 -----END PGP SIGNATURE-----

Merge tag 'linux-watchdog-6.12-rc1' of git://www.linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - Add Watchdog Timer driver for RZ/V2H(P)

 - Add Cirrus EP93x

 - Some small fixes and improvements

* tag 'linux-watchdog-6.12-rc1' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: Convert comma to semicolon
  watchdog: rzv2h_wdt: Add missing MODULE_LICENSE tag to fix modpost error
  dt-bindings: watchdog: Add Cirrus EP93x
  dt-bindings: watchdog: stm32-iwdg: Document interrupt and wakeup properties
  drivers: watchdog: marvell_gti: Convert comma to semicolon
  watchdog: iTCO_wdt: Convert comma to semicolon
  watchdog: Add Watchdog Timer driver for RZ/V2H(P)
  dt-bindings: watchdog: renesas,wdt: Document RZ/V2H(P) SoC
  watchdog: imx_sc_wdt: detect if already running
  watchdog: imx2_wdt: Remove __maybe_unused notations
  watchdog: imx_sc_wdt: Don't disable WDT in suspend
  watchdog: imx7ulp_wdt: move post_rcs_wait into struct imx_wdt_hw_feature
2024-09-23 13:19:37 -07:00
Linus Torvalds
962ad08780 This is the bulk of pin control changes for the v6.12 kernel cycle:
Core changes:
 
 - Add support for "input-schmitt-microvolt" property, as used in the
   Sophgo SoC.
 
 New drivers:
 
 - Mobileye EyeQ5 pin controller, I think this is an automotive SoC.
 
 - Rockchip rk3576 pin control support.
 
 - Sophgo CV1800 series pin controllers: CV1800B, CV1812H and SG2000.
 
 Improvements:
 
 - Gradual improvements to Renesas, Samsung, Qualcomm, Nuvoton
   and a few other drivers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmbvQpoACgkQQRCzN7AZ
 XXOwIxAApTqZMFAQwEsqMSrcDeJeoaH3vQLa1MKmOLY+IIdNPemlaCpoYUX48MRB
 DRxz9c6lU2sD80xtYfuXCpye8i4cVrkLlRWpap2efX1rqaMWQNO1vQvO4VgL6plI
 ElU1ivjKpuqIsjJH3UELILuDV8Ft26p1AoCJhE5cTVu+nsP4aNaS0DQLzf2K7nrd
 c5chLYPxM0K+ViHtDDAlrJ70y5fUCkEaSUC8PBqVu04gRjubIqNKwlbaNRR/pRKk
 dtG4RbDdJwu//pcStNIuKpwlEHcY6E8egY80fjP+i6XUNDfxN2LFcC7IWkvTRiZg
 xz7nQFsH6BnuUKpaxFCCTs58sFPKCoasbWMja4RtXoq3e0W6Gne7/tNO6uLc0dFx
 AIGIW/scLzmOgi32YYZsMu9aL6d6Gr2Bj1aJrzHcgi4gZWJnfwCDwuhb70X9eZaY
 T1HMjCGU16XO9ipAtIHrtsOt6DRvq5tLygEQ39u6nxQCrapJ30JCFMAxesGdErAL
 ER8RmUGhvv+sMJ0fE8pcCP7vOB3LaPiqgqDsEQK7ph81pIWi/iXWQBNzRqGMs2aQ
 30WZ27nO26NBS54yUd019UGJGTDafYuWVn+coXEYd8YBlqxP9qRlfbVUvLbvi730
 OopyOKOhAuYQEjuQsK2Gz9/5K5axw2f3d6ClAwqnGeWTtihgS6s=
 =7uZC
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Pull pin control updates from Linus Walleij:
 "Core changes:

   - Add support for "input-schmitt-microvolt" property, as used in the
     Sophgo SoC

  New drivers:

   - Mobileye EyeQ5 pin controller, I think this is an automotive SoC

   - Rockchip rk3576 pin control support

   - Sophgo CV1800 series pin controllers: CV1800B, CV1812H and SG2000

  Improvements:

   - Gradual improvements to Renesas, Samsung, Qualcomm, Nuvoton and a
     few other drivers"

* tag 'pinctrl-v6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (67 commits)
  pinctrl: intel: Constify struct intel_pinctrl parameter
  pinctrl: Remove redundant null pointer checks in pinctrl_remove_device_debugfs()
  pinctrl: baytrail: Drop duplicate return statement
  pinctrl: intel: Inline intel_gpio_community_irq_handler()
  dt-bindings: pinctrl: qcom: add missing type to GPIO hogs
  pinctrl: madera: Simplify with dev_err_probe()
  pinctrl: k210: Use devm_clk_get_enabled() helpers
  pinctrl: Join split messages and remove double whitespace
  pinctrl: renesas: rzg2l: Move pinconf_to_config_argument() call outside of switch cases
  pinctrl: renesas: rzg2l: Introduce single macro for digital noise filter configuration
  pinctrl: renesas: rzg2l: Replace of_node_to_fwnode() with more suitable API
  pinctrl: mvebu: Fix devinit_dove_pinctrl_probe function
  pinctrl: sunxi: Use devm_clk_get_enabled() helpers
  pinctrl: sophgo: cv18xx: fix missed __iomem type identifier
  pinctrl: stmfx: Use string_choices API instead of ternary operator
  pinctrl: nomadik: Use kmemdup_array instead of kmemdup for multiple allocation
  pinctrl: intel: Introduce for_each_intel_gpio_group() helper et al.
  pinctrl: intel: Constify intel_get_community() returned object
  pinctrl: intel: Implement high impedance support
  pinctrl: intel: Add __intel_gpio_get_direction() helper
  ...
2024-09-23 13:15:23 -07:00
Linus Torvalds
5f153b6330 Bug fixes for intel ntb driver debugfs, use after free in switchtec
driver, ntb transport rx ring buffers.  Also, cleanups in printks,
 kernel-docs, and idt driver comment.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEoE9b9c3U2JxX98mqbmZLrHqL0iMFAmbtj00ACgkQbmZLrHqL
 0iNH6A/+JzxP8uH8FK4yUM6EFtT03gFZvuhYZ80JnjHWk19mu/zxzdm3uwZ5DZ5h
 utOz7lHn2mljkq2y+rLTDyZqcpkRQgS1jsGSOdv/NnR6kZ6+siGl01sjYXUbv2+9
 eUyyLcq1CksLCqjCEc5WYARo84JIxugDxkEpAS0JZRZSU7DvrvGz0bmuEyLnaFY/
 HVi4bwmCS+/m13tJeh2f+OpNZkw8yt/pE8miYsmyJM/YUhFVr7hflOGo0ObtuG15
 csvyIx7sPVuRr7ecA6kfcAR6cRf5lMuaW7twuGywU2bnJ5UvQtn2lolB4qR6yg04
 L3QYnx/PMUpUe7RhuEmEylib4TistsIRH0+N+YLBnCaeqVF/UM372fo8g6U4iytV
 AnqAo+YfKNb92i1I+XdlqfT/u/HD+o6bXpbj1H+dxgKHrwqiqLT05TFn9kYCQdz+
 sNsE2S8JivqL3MmLlnk57v93CVW3/b5kewsjRwF/NsW2Eb75WCuEbmGi3aDRU20F
 DRBUxYDR2o15f4aK1jDYA6R7hDO7b6233VNWssNudypKPTCP6L8GRaMiwyiLaz8m
 7g2I0f2lmcYPR0FGoLaM+DA3z9a7fM8I5ynkWkbrmOsyZM1aaXMVEj/uvmSkeSHa
 bpZYQYNrL4g4mwofGH+dGeGtQNkCyLZL1lGhVWDjsoiWl+qWClk=
 =lFRT
 -----END PGP SIGNATURE-----

Merge tag 'ntb-6.12' of https://github.com/jonmason/ntb

Pull PCIe non-transparent bridge updates from Jon Mason:
 "Bug fixes for intel ntb driver debugfs, use after free in switchtec
  driver, ntb transport rx ring buffers. Also, cleanups in printks,
  kernel-docs, and idt driver comment"

* tag 'ntb-6.12' of https://github.com/jonmason/ntb:
  ntb: Force physically contiguous allocation of rx ring buffers
  ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition
  ntb: idt: Fix the cacography in ntb_hw_idt.c
  NTB: epf: don't misuse kernel-doc marker
  NTB: ntb_transport: fix all kernel-doc warnings
  ntb: Constify struct bus_type
  ntb_perf: Fix printk format
  ntb: intel: Fix the NULL vs IS_ERR() bug for debugfs_create_dir()
2024-09-23 13:10:49 -07:00
Linus Torvalds
d7dfb07d4d firewire updates for v6.12
The batch of changes includes the followwing:
 
 - Replacing tasklet with usual workqueue for isochronous context
 - Replacing IDR with XArray
 - Utilizing guard macro where possible
 - Printing deprecation warning when enabling debug parameter of
   firewire-ohci module
 
 Additionally, it includes a single patch for sound subsystem which the
 subsystem maintainer acked:
 
 - Switching to nonatomic PCM operation
 
 In FireWire subsystem, tasklet has been used as the bottom half of 1394
 OHCi hardIRQ so long. In the recent kernel updates, BH workqueue has
 been available, and some developers have proposed replacing tasklet with
 BH workqueue. While it is fortunate that developers are still considering
 the legacy subsystem, a simple replacement is not necessarily suitable.
 
 As a first step towards dropping tasklet, I've investigated the
 feasibility for 1394 OHCI isochronous context, and concluded that usual
 workqueue is available. In the context, the batch of packets is processed
 in the specific queue, thus the timing jitter caused by task scheduling is
 not so critical. Additionally, DMA transmission can be scheduled
 per-packet basis, therefore the context can be sleep between the operation
 of transmissions. Furthermore, in-kernel protocol implementation involves
 some CPU-bound tasks, which can sometimes consumes CPU time so long. These
 characteristics suggest that usual workqueue is suitable, through BH
 workqueues are not.
 
 The replacement with usual workqueue allows unit drivers to process the
 content of packets in non-atomic context. It brings some reliefs to some
 drivers in sound subsystem that spin-lock is not mandatory anymore during
 isochronous packet processing.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQE66IEYNDXNBPeGKSsLtaWM8LwEwUCZu41yQAKCRCsLtaWM8Lw
 E4Y1AP43vZatH202NNMnbkLSW9axmHe6VHWEwDSsJT80vTbBNAD/WYV62EoQzlk1
 1lzdts11SSqYPhj6tJDuRgqULlNAows=
 =7VMx
 -----END PGP SIGNATURE-----

Merge tag 'firewire-updates-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394

Pull firewire updates from Takashi Sakamoto:
 "In the FireWire subsystem, tasklets have been used as the bottom half
  of 1394 OHCi hardIRQ. In recent kernel updates, BH workqueues have
  become available, and some developers have proposed replacing the
  tasklet with a BH workqueue.

  As a first step towards dropping tasklet use, the 1394 OHCI
  isochronous context can use regular workqueues. In this context, the
  batch of packets is processed in the specific queue, thus the timing
  jitter caused by task scheduling is not so critical.

  Additionally, DMA transmission can be scheduled per-packet basis,
  therefore the context can be sleep between the operation of
  transmissions. Furthermore, in-kernel protocol implementation involves
  some CPU-bound tasks, which can sometimes consumes CPU time so long.
  These characteristics suggest that normal workqueues are suitable,
  through BH workqueues are not.

  The replacement with a workqueue allows unit drivers to process the
  content of packets in non-atomic context. It brings some reliefs to
  some drivers in sound subsystem that spin-lock is not mandatory
  anymore during isochronous packet processing.

  Summary:

   - Replace tasklet with workqueue for isochronous context

   - Replace IDR with XArray

   - Utilize guard macro where possible

   - Print deprecation warning when enabling debug parameter of
     firewire-ohci module

   - Switch to nonatomic PCM operation"

* tag 'firewire-updates-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: (55 commits)
  firewire: core: rename cause flag of tracepoints event
  firewire: core: update documentation of kernel APIs for flushing completions
  firewire: core: add helper function to retire descriptors
  Revert "firewire: core: move workqueue handler from 1394 OHCI driver to core function"
  Revert "firewire: core: use mutex to coordinate concurrent calls to flush completions"
  firewire: core: use mutex to coordinate concurrent calls to flush completions
  firewire: core: move workqueue handler from 1394 OHCI driver to core function
  firewire: core: fulfill documentation of fw_iso_context_flush_completions()
  firewire: core: expose kernel API to schedule work item to process isochronous context
  firewire: core: use WARN_ON_ONCE() to avoid superfluous dumps
  ALSA: firewire: use nonatomic PCM operation
  firewire: core: non-atomic memory allocation for isochronous event to user client
  firewire: ohci: operate IT/IR events in sleepable work process instead of tasklet softIRQ
  firewire: core: add local API to queue work item to workqueue specific to isochronous contexts
  firewire: core: allocate workqueue to handle isochronous contexts in card
  firewire: ohci: obsolete direct usage of printk_ratelimit()
  firewire: ohci: deprecate debug parameter
  firewire: core: update fw_device outside of device_find_child()
  firewire: ohci: fix error path to detect initiated reset in TI TSB41BA3D phy
  firewire: core/ohci: minor refactoring for computation of configuration ROM size
  ...
2024-09-23 12:55:27 -07:00
Guenter Roeck
f89722faa3 ipe: Add missing terminator to list of unit tests
Add missing terminator to list of unit tests to avoid random crashes seen
when running the test.

Fixes: 10ca05a760 ("ipe: kunit test for parser")
Cc: Deven Bowers <deven.desai@linux.microsoft.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Fan Wu <wufan@linux.microsoft.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Fan Wu <wufan@linux.microsoft.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-09-23 15:53:37 -04:00
Linus Torvalds
3a37872316 pci-v6.12-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmbseugUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxdwxAAvdvDyTuiPo2R8pQtvKg4YL2IUnK5
 UR28mBxZDK5DFhLtD/QzmVVG/eaLY6bJHthHgJgTApzekkqU0h9dcRI0eegXrvcz
 I3HRsZK2yatUky9l8O148OLzF897r7vXL3QtGe6qjKU+9D83IEeooLKgBca+GoBC
 bRLvG/fYRzdjOe8UHFqCoeMIg3IOY7CNifvFOihAGpJpxfZQktj6hSKu6q7BL1Rx
 NRgYlxh0eLcb7vAJqz6RZpQ8PRCwhAjlDuu0BOkES8/6EwisD1xUh3qdDxfVgNA6
 FpcAb/53yr46cs4tM9ZTwluka86AskuXj3jwSKf7nE3zqr4nM9OD3sGOSYzK8UdE
 EDBKj+9iEpYRC6rJMk5gNH2AZkR1OEpNUisR6+kEn81A9yNNoTmkHdHUOWo8TuxD
 btc0sTM+eWApvTiZwgL4VjMZulQllV51K8tcfvODRhlMkbOPNWGWdmpWqEbUS2HU
 i7+zzQC3DC5iPlAKgRSeYB0aad6la6brqPW16sGhGovNhgwbzakDLCUJJGn/LNuO
 wd0UNpJTnHlfChbvNh2bBxiMOo0cab1tJ5Jp97STQYhLg2nW93s/dAfdpSAsYO4S
 5YzjSADWeyeuDsHE1RdUdDvYAPMb1VZBUd2OSHis5zw7kmh25c9KYXEkDJ25q/ju
 sVXK4oMNW/Gnd5M=
 =L3s9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Wait for device readiness after reset by polling Vendor ID and
     looking for Configuration RRS instead of polling the Command
     register and looking for non-error completions, to avoid hardware
     retries done for RRS on non-Vendor ID reads (Bjorn Helgaas)

   - Rename CRS Completion Status to RRS ('Request Retry Status') to
     match PCIe r6.0 spec usage (Bjorn Helgaas)

   - Clear LBMS bit after a manual link retrain so we don't try to
     retrain a link when there's no downstream device anymore (Maciej W.
     Rozycki)

   - Revert to the original link speed after retraining fails instead of
     leaving it restricted to 2.5GT/s, so a future device has a chance
     to use higher speeds (Maciej W. Rozycki)

   - Wait for each level of downstream bus, not just the first, to
     become accessible before restoring devices on that bus (Ilpo
     Järvinen)

   - Add ARCH_PCI_DEV_GROUPS so s390 can add its own attribute_groups
     without having to stomp on the core's pdev->dev.groups (Lukas
     Wunner)

  Driver binding:

   - Export pcim_request_region(), a managed counterpart of
     pci_request_region(), for use by drivers (Philipp Stanner)

   - Export pcim_iomap_region() and deprecate pcim_iomap_regions()
     (Philipp Stanner)

   - Request the PCI BAR used by xboxvideo (Philipp Stanner)

   - Request and map drm/ast BARs with pcim_iomap_region() (Philipp
     Stanner)

  MSI:

   - Add MSI_FLAG_NO_AFFINITY flag for devices that mux MSIs onto a
     single IRQ line and cannot set the affinity of each MSI to a
     specific CPU core (Marek Vasut)

   - Use MSI_FLAG_NO_AFFINITY and remove unnecessary .irq_set_affinity()
     implementations in aardvark, altera, brcmstb, dwc, mediatek-gen3,
     mediatek, mobiveil, plda, rcar, tegra, vmd, xilinx-nwl,
     xilinx-xdma, and xilinx drivers to avoid 'IRQ: set affinity failed'
     warnings (Marek Vasut)

  Power management:

   - Add pwrctl support for ATH11K inside the WCN6855 package (Konrad
     Dybcio)

  PCI device hotplug:

   - Remove unnecessary hpc_ops struct from shpchp (ngn)

   - Check for PCI_POSSIBLE_ERROR(), not 0xffffffff, in cpqphp
     (weiyufeng)

  Virtualization:

   - Mark Creative Labs EMU20k2 INTx masking as broken (Alex Williamson)

   - Add an ACS quirk for Qualcomm SA8775P, which doesn't advertise ACS
     but does provide ACS-like features (Subramanian Ananthanarayanan)

  IOMMU:

   - Add function 0 DMA alias quirk for Glenfly Arise audio function,
     which uses the function 0 Requester ID (WangYuli)

  NPEM:

   - Add Native PCIe Enclosure Management (NPEM) support for sysfs
     control of NVMe RAID storage indicators (ok/fail/locate/
     rebuild/etc) (Mariusz Tkaczyk)

   - Add support for the ACPI _DSM PCIe SSD status LED management, which
     is functionally similar to NPEM but mediated by platform firmware
     (Mariusz Tkaczyk)

  Device trees:

   - Drop minItems and maxItems from ranges in PCI generic host binding
     since host bridges may have several MMIO and I/O port apertures
     (Frank Li)

   - Add kirin, rcar-gen2, uniphier DT binding top-level constraints for
     clocks (Krzysztof Kozlowski)

  Altera PCIe controller driver:

   - Convert altera DT bindings from text to YAML (Matthew Gerlach)

   - Replace TLP_REQ_ID() with macro PCI_DEVID(), which does the same
     thing and is what other drivers use (Jinjie Ruan)

  Broadcom STB PCIe controller driver:

   - Add DT binding maxItems for reset controllers (Jim Quinlan)

   - Use the 'bridge' reset method if described in the DT (Jim Quinlan)

   - Use the 'swinit' reset method if described in the DT (Jim Quinlan)

   - Add 'has_phy' so the existence of a 'rescal' reset controller
     doesn't imply software control of it (Jim Quinlan)

   - Add support for many inbound DMA windows (Jim Quinlan)

   - Rename SoC 'type' to 'soc_base' express the fact that SoCs come in
     families of multiple similar devices (Jim Quinlan)

   - Add Broadcom 7712 DT description and driver support (Jim Quinlan)

   - Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings for
     maintainability (Bjorn Helgaas)

  Freescale i.MX6 PCIe controller driver:

   - Add imx6q-pcie 'dbi2' and 'atu' reg-names for i.MX8M Endpoints
     (Richard Zhu)

   - Fix a code restructuring error that caused i.MX8MM and i.MX8MP
     Endpoints to fail to establish link (Richard Zhu)

   - Fix i.MX8MP Endpoint occasional failure to trigger MSI by enforcing
     outbound alignment requirement (Richard Zhu)

   - Call phy_power_off() in the .probe() error path (Frank Li)

   - Rename internal names from imx6_* to imx_* since i.MX7/8/9 are also
     supported (Frank Li)

   - Manage Refclk by using SoC-specific callbacks instead of switch
     statements (Frank Li)

   - Manage core reset by using SoC-specific callbacks instead of switch
     statements (Frank Li)

   - Expand comments for erratum ERR010728 workaround (Frank Li)

   - Use generic PHY APIs to configure mode, speed, and submode, which
     is harmless for devices that implement their own internal PHY
     management and don't set the generic imx_pcie->phy (Frank Li)

   - Add i.MX8Q (i.MX8QM, i.MX8QXP, and i.MX8DXL) DT binding and driver
     Root Complex support (Richard Zhu)

  Freescale Layerscape PCIe controller driver:

   - Replace layerscape-pcie DT binding compatible fsl,lx2160a-pcie with
     fsl,lx2160ar2-pcie (Frank Li)

   - Add layerscape-pcie DT binding deprecated 'num-viewport' property
     to address a DT checker warning (Frank Li)

   - Change layerscape-pcie DT binding 'fsl,pcie-scfg' to phandle-array
     (Frank Li)

  Loongson PCIe controller driver:

   - Increase max PCI hosts to 8 for Loongson-3C6000 and newer chipsets
     (Huacai Chen)

  Marvell Aardvark PCIe controller driver:

   - Fix issue with emulating Configuration RRS for two-byte reads of
     Vendor ID; previously it only worked for four-byte reads (Bjorn
     Helgaas)

  MediaTek PCIe Gen3 controller driver:

   - Add per-SoC struct mtk_gen3_pcie_pdata to support multiple SoC
     types (Lorenzo Bianconi)

   - Use reset_bulk APIs to manage PHY reset lines (Lorenzo Bianconi)

   - Add DT and driver support for Airoha EN7581 PCIe controller
     (Lorenzo Bianconi)

  Qualcomm PCIe controller driver:

   - Update qcom,pcie-sc7280 DT binding with eight interrupts (Rayyan
     Ansari)

   - Add back DT 'vddpe-3v3-supply', which was incorrectly removed
     earlier (Johan Hovold)

   - Drop endpoint redundant masking of global IRQ events (Manivannan
     Sadhasivam)

   - Clarify unknown global IRQ message and only log it once to avoid a
     flood (Manivannan Sadhasivam)

   - Add 'linux,pci-domain' property to endpoint DT binding (Manivannan
     Sadhasivam)

   - Assign PCI domain number for endpoint controllers (Manivannan
     Sadhasivam)

   - Add 'qcom_pcie_ep' and the PCI domain number to IRQ names for
     endpoint controller (Manivannan Sadhasivam)

   - Add global SPI interrupt for PCIe link events to DT binding
     (Manivannan Sadhasivam)

   - Add global RC interrupt handler to handle 'Link up' events and
     automatically enumerate hot-added devices (Manivannan Sadhasivam)

   - Avoid mirroring of DBI and iATU register space so it doesn't
     overlap BAR MMIO space (Prudhvi Yarlagadda)

   - Enable controller resources like PHY only after PERST# is
     deasserted to partially avoid the problem that the endpoint SoC
     crashes when accessing things when Refclk is absent (Manivannan
     Sadhasivam)

   - Add 16.0 GT/s equalization and RX lane margining settings (Shashank
     Babu Chinta Venkata)

   - Pass domain number to pci_bus_release_domain_nr() explicitly to
     avoid a NULL pointer dereference (Manivannan Sadhasivam)

  Renesas R-Car PCIe controller driver:

   - Make the read-only const array 'check_addr' static (Colin Ian King)

   - Add R-Car V4M (R8A779H0) PCIe host and endpoint to DT binding
     (Yoshihiro Shimoda)

  TI DRA7xx PCIe controller driver:

   - Request IRQF_ONESHOT for 'dra7xx-pcie-main' IRQ since the primary
     handler is NULL (Siddharth Vadapalli)

   - Handle IRQ request errors during root port and endpoint probe
     (Siddharth Vadapalli)

  TI J721E PCIe driver:

   - Add DT 'ti,syscon-acspcie-proxy-ctrl' and driver support to enable
     the ACSPCIE module to drive Refclk for the Endpoint (Siddharth
     Vadapalli)

   - Extract the cadence link setup from cdns_pcie_host_setup() so link
     setup can be done separately during resume (Thomas Richard)

   - Add T_PERST_CLK_US definition for the mandatory delay between
     Refclk becoming stable and PERST# being deasserted (Thomas Richard)

   - Add j721e suspend and resume support (Théo Lebrun)

  TI Keystone PCIe controller driver:

   - Fix NULL pointer checking when applying MRRS limitation quirk for
     AM65x SR 1.0 Errata #i2037 (Dan Carpenter)

  Xilinx NWL PCIe controller driver:

   - Fix off-by-one error in INTx IRQ handler that caused INTx
     interrupts to be lost or delivered as the wrong interrupt (Sean
     Anderson)

   - Rate-limit misc interrupt messages (Sean Anderson)

   - Turn off the clock on probe failure and device removal (Sean
     Anderson)

   - Add DT binding and driver support for enabling/disabling PHYs (Sean
     Anderson)

   - Add PCIe phy bindings for the ZCU102 (Sean Anderson)

  Xilinx XDMA PCIe controller driver:

   - Add support for Xilinx QDMA Soft IP PCIe Root Port Bridge to DT
     binding and xilinx-dma-pl driver (Thippeswamy Havalige)

  Miscellaneous:

   - Fix buffer overflow in kirin_pcie_parse_port() (Alexandra Diupina)

   - Fix minor kerneldoc issues and typos (Bjorn Helgaas)

   - Use PCI_DEVID() macro in aer_inject() instead of open-coding it
     (Jinjie Ruan)

   - Check pcie_find_root_port() return in x86 fixups to avoid NULL
     pointer dereferences (Samasth Norway Ananda)

   - Make pci_bus_type constant (Kunwu Chan)

   - Remove unused declarations of __pci_pme_wakeup() and
     pci_vpd_release() (Yue Haibing)

   - Remove any leftover .*.cmd files with make clean (zhang jiao)

   - Remove unused BILLION macro (zhang jiao)"

* tag 'pci-v6.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (132 commits)
  PCI: Fix typos
  dt-bindings: PCI: qcom: Allow 'vddpe-3v3-supply' again
  tools: PCI: Remove unused BILLION macro
  tools: PCI: Remove .*.cmd files with make clean
  PCI: Pass domain number to pci_bus_release_domain_nr() explicitly
  PCI: dra7xx: Fix error handling when IRQ request fails in probe
  PCI: dra7xx: Fix threaded IRQ request for "dra7xx-pcie-main" IRQ
  PCI: qcom: Add RX lane margining settings for 16.0 GT/s
  PCI: qcom: Add equalization settings for 16.0 GT/s
  PCI: dwc: Always cache the maximum link speed value in dw_pcie::max_link_speed
  PCI: dwc: Rename 'dw_pcie::link_gen' to 'dw_pcie::max_link_speed'
  PCI: qcom-ep: Enable controller resources like PHY only after refclk is available
  PCI: Mark Creative Labs EMU20k2 INTx masking as broken
  dt-bindings: PCI: imx6q-pcie: Add reg-name "dbi2" and "atu" for i.MX8M PCIe Endpoint
  dt-bindings: PCI: altera: msi: Convert to YAML
  PCI: imx6: Add i.MX8Q PCIe Root Complex (RC) support
  PCI: Rename CRS Completion Status to RRS
  PCI: aardvark: Correct Configuration RRS checking
  PCI: Wait for device readiness with Configuration RRS
  PCI: brcmstb: Sort enums, pcie_offsets[], pcie_cfg_data, .compatible strings
  ...
2024-09-23 12:47:06 -07:00
Mike Snitzer
736cd2c1ae nfs: add "NFS Client and Server Interlock" section to localio.rst
This section answers a new FAQ entry:

9. How does LOCALIO make certain that object lifetimes are managed
   properly given NFSD and NFS operate in different contexts?

   See the detailed "NFS Client and Server Interlock" section below.

The first half of the section details NeilBrown's elegant design
for LOCALIO's nfs_uuid_t based interlock and is heavily based on
Neil's "net namespace refcounting" description here:
  https://marc.info/?l=linux-nfs&m=172498546024767&w=2

The second half of the section details the per-cpu-refcount introduced
to ensure NFSD's nfsd_serv isn't destroyed while in use by a LOCALIO
client.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:31 -04:00
Trond Myklebust
f7128262b1 nfs: add FAQ section to Documentation/filesystems/nfs/localio.rst
Add a FAQ section to give answers to questions that have been raised
during review of the localio feature.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:31 -04:00
Mike Snitzer
92945bd81c nfs: add Documentation/filesystems/nfs/localio.rst
This document gives an overview of the LOCALIO auxiliary RPC protocol
added to the Linux NFS client and server to allow them to reliably
handshake to determine if they are on the same host.

Once an NFS client and server handshake as "local", the client will
bypass the network RPC protocol for read, write and commit operations.
Due to this XDR and RPC bypass, these operations will operate faster.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:31 -04:00
Mike Snitzer
56bcd0f07f nfs: implement client support for NFS_LOCALIO_PROGRAM
The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL"
RPC method that allows the Linux NFS client to verify the local Linux
NFS server can see the nonce (single-use UUID) the client generated and
made available in nfs_common for subsequent lookup and verification
by the NFS server.  If matched, the NFS server populates members in the
nfs_uuid_t struct.  The NFS client then transfers these nfs_uuid_t
struct member pointers to the nfs_client struct and cleans up the
nfs_uuid_t struct.  See: fs/nfs/localio.c:nfs_local_probe()

This protocol isn't part of an IETF standard, nor does it need to be
considering it is Linux-to-Linux auxiliary RPC protocol that amounts
to an implementation detail.

Localio is only supported when UNIX-style authentication (AUTH_UNIX, aka
AUTH_SYS) is used (enforced by fs/nfs/localio.c:nfs_local_probe()).

The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of
the fixed UUID_SIZE (16 bytes).  The fixed size opaque encode and decode
XDR methods are used instead of the less efficient variable sized
methods.

Having a nonce (single-use uuid) is better than using the same uuid
for the life of the server, and sending it proactively by client
rather than reactively by the server is also safer.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Trond Myklebust
b9f5dd57f4 nfs/localio: use dedicated workqueues for filesystem read and write
For localio access, don't call filesystem read() and write() routines
directly.  This solves two problems:

1) localio writes need to use a normal (non-memreclaim) unbound
   workqueue.  This avoids imposing new requirements on how underlying
   filesystems process frontend IO, which would cause a large amount
   of work to update all filesystems.  Without this change, when XFS
   starts getting low on space, XFS flushes work on a non-memreclaim
   work queue, which causes a priority inversion problem:

00573 workqueue: WQ_MEM_RECLAIM writeback:wb_workfn is flushing !WQ_MEM_RECLAIM xfs-sync/vdc:xfs_flush_inodes_worker
00573 WARNING: CPU: 6 PID: 8525 at kernel/workqueue.c:3706 check_flush_dependency+0x2a4/0x328
00573 Modules linked in:
00573 CPU: 6 PID: 8525 Comm: kworker/u71:5 Not tainted 6.10.0-rc3-ktest-00032-g2b0a133403ab #18502
00573 Hardware name: linux,dummy-virt (DT)
00573 Workqueue: writeback wb_workfn (flush-0:33)
00573 pstate: 400010c5 (nZcv daIF -PAN -UAO -TCO -DIT +SSBS BTYPE=--)
00573 pc : check_flush_dependency+0x2a4/0x328
00573 lr : check_flush_dependency+0x2a4/0x328
00573 sp : ffff0000c5f06bb0
00573 x29: ffff0000c5f06bb0 x28: ffff0000c998a908 x27: 1fffe00019331521
00573 x26: ffff0000d0620900 x25: ffff0000c5f06ca0 x24: ffff8000828848c0
00573 x23: 1fffe00018be0d8e x22: ffff0000c1210000 x21: ffff0000c75fde00
00573 x20: ffff800080bfd258 x19: ffff0000cad63400 x18: ffff0000cd3a4810
00573 x17: 0000000000000000 x16: 0000000000000000 x15: ffff800080508d98
00573 x14: 0000000000000000 x13: 204d49414c434552 x12: 1fffe0001b6eeab2
00573 x11: ffff60001b6eeab2 x10: dfff800000000000 x9 : ffff60001b6eeab3
00573 x8 : 0000000000000001 x7 : 00009fffe491154e x6 : ffff0000db775593
00573 x5 : ffff0000db775590 x4 : ffff0000db775590 x3 : 0000000000000000
00573 x2 : 0000000000000027 x1 : ffff600018be0d62 x0 : dfff800000000000
00573 Call trace:
00573  check_flush_dependency+0x2a4/0x328
00573  __flush_work+0x184/0x5c8
00573  flush_work+0x18/0x28
00573  xfs_flush_inodes+0x68/0x88
00573  xfs_file_buffered_write+0x128/0x6f0
00573  xfs_file_write_iter+0x358/0x448
00573  nfs_local_doio+0x854/0x1568
00573  nfs_initiate_pgio+0x214/0x418
00573  nfs_generic_pg_pgios+0x304/0x480
00573  nfs_pageio_doio+0xe8/0x240
00573  nfs_pageio_complete+0x160/0x480
00573  nfs_writepages+0x300/0x4f0
00573  do_writepages+0x12c/0x4a0
00573  __writeback_single_inode+0xd4/0xa68
00573  writeback_sb_inodes+0x470/0xcb0
00573  __writeback_inodes_wb+0xb0/0x1d0
00573  wb_writeback+0x594/0x808
00573  wb_workfn+0x5e8/0x9e0
00573  process_scheduled_works+0x53c/0xd90
00573  worker_thread+0x370/0x8c8
00573  kthread+0x258/0x2e8
00573  ret_from_fork+0x10/0x20

2) Some filesystem writeback routines can end up taking up a lot of
   stack space (particularly XFS).  Instead of risking running over
   due to the extra overhead from the NFS stack, we should just call
   these routines from a workqueue job.  Since we need to do this to
   address 1) above we're able to avoid possibly blowing the stack
   "for free".

Use of dedicated workqueues improves performance over using the
system_unbound_wq.

Also, the creds used to open the file are used to override_creds() in
both nfs_local_call_read() and nfs_local_call_write() -- otherwise the
workqueue could have elevated capabilities (which the caller may not).

Lastly, care is taken to set PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO in
nfs_do_local_write() to avoid writeback deadlocks.

The PF_LOCAL_THROTTLE flag prevents deadlocks in balance_dirty_pages()
by causing writes to only be throttled against other writes to the
same bdi (it keeps the throttling local).  Normally all writes to
bdi(s) are throttled equally (after throughput factors are allowed
for).

The PF_MEMALLOC_NOIO flag prevents the lower filesystem IO from
causing memory reclaim to re-enter filesystems or IO devices and so
prevents deadlocks from occuring where IO that cleans pages is
waiting on IO to complete.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de> # eliminated wait_for_completion
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Trond Myklebust
d488b9d01f pnfs/flexfiles: enable localio support
If the DS is local to this client use localio to write the data.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Trond Myklebust
fa88a7d6ae nfs: enable localio for non-pNFS IO
Try a local open of the file being written to, and if it succeeds,
then use localio to issue IO.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Weston Andros Adamson
70ba381e1a nfs: add LOCALIO support
Add client support for bypassing NFS for localhost reads, writes, and
commits. This is only useful when the client and the server are
running on the same host.

nfs_local_probe() is stubbed out, later commits will enable client and
server handshake via a Linux-only LOCALIO auxiliary RPC protocol.

This has dynamic binding with the nfsd module (via nfs_localio module
which is part of nfs_common). LOCALIO will only work if nfsd is
already loaded.

The "localio_enabled" nfs kernel module parameter can be used to
disable and enable the ability to use LOCALIO support.

CONFIG_NFS_LOCALIO enables NFS client support for LOCALIO.

Lastly, LOCALIO uses an nfsd_file to initiate all IO. To make proper
use of nfsd_file (and nfsd's filecache) its lifetime (duration before
nfsd_file_put is called) must extend until after commit, read and
write operations. So rather than immediately drop the nfsd_file
reference in nfs_local_open_fh(), that doesn't happen until
nfs_local_pgio_release() for read/write and not until
nfs_local_release_commit_data() for commit. The same applies to the
reference held on nfsd's nn->nfsd_serv. Both objects' lifetimes and
associated references are managed through calls to
nfs_to->nfsd_file_put_local().

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de> # nfs_open_local_fh
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
df24c483e2 nfs: pass struct nfsd_file to nfs_init_pgio and nfs_init_commit
The nfsd_file will be passed, in future commits, by callers
that enable LOCALIO support (for both regular NFS and pNFS IO).

[Derived from patch authored by Weston Andros Adamson, but switched
 from passing struct file to struct nfsd_file]

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
946af9b3a0 nfsd: implement server support for NFS_LOCALIO_PROGRAM
The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL"
RPC method that allows the Linux NFS client to verify the local Linux
NFS server can see the nonce (single-use UUID) the client generated and
made available in nfs_common.  The server expects this protocol to use
the same transport as NFS and NFSACL for its RPCs.  This protocol
isn't part of an IETF standard, nor does it need to be considering it
is Linux-to-Linux auxiliary RPC protocol that amounts to an
implementation detail.

The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of
the fixed UUID_SIZE (16 bytes).  The fixed size opaque encode and decode
XDR methods are used instead of the less efficient variable sized
methods.

The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned
by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ):
Linux Kernel Organization       400122  nfslocalio

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
[neilb: factored out and simplified single localio protocol]
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Weston Andros Adamson
fa4983862e nfsd: add LOCALIO support
Add server support for bypassing NFS for localhost reads, writes, and
commits. This is only useful when both the client and server are
running on the same host.

If nfsd_open_local_fh() fails then the NFS client will both retry and
fallback to normal network-based read, write and commit operations if
localio is no longer supported.

Care is taken to ensure the same NFS security mechanisms are used
(authentication, etc) regardless of whether localio or regular NFS
access is used.  The auth_domain established as part of the traditional
NFS client access to the NFS server is also used for localio.  Store
auth_domain for localio in nfsd_uuid_t and transfer it to the client
if it is local to the server.

Relative to containers, localio gives the client access to the network
namespace the server has.  This is required to allow the client to
access the server's per-namespace nfsd_net struct.

This commit also introduces the use of NFSD's percpu_ref to interlock
nfsd_destroy_serv and nfsd_open_local_fh, to ensure nn->nfsd_serv is
not destroyed while in use by nfsd_open_local_fh and other LOCALIO
client code.

CONFIG_NFS_LOCALIO enables NFS server support for LOCALIO.

Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
a61e147e6b nfs_common: prepare for the NFS client to use nfsd_file for LOCALIO
The next commit will introduce nfsd_open_local_fh() which returns an
nfsd_file structure.  This commit exposes LOCALIO's required NFSD
symbols to the NFS client:

- Make nfsd_open_local_fh() symbol and other required NFSD symbols
  available to NFS in a global 'nfs_to' nfsd_localio_operations
  struct (global access suggested by Trond, nfsd_localio_operations
  suggested by NeilBrown).  The next commit will also introduce
  nfsd_localio_ops_init() that init_nfsd() will call to initialize
  'nfs_to'.

- Introduce nfsd_file_file() that provides access to nfsd_file's
  backing file.  Keeps nfsd_file structure opaque to NFS client (as
  suggested by Jeff Layton).

- Introduce nfsd_file_put_local() that will put the reference to the
  nfsd_file's associated nn->nfsd_serv and then put the reference to
  the nfsd_file (as suggested by NeilBrown).

Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> # nfs_to
Suggested-by: NeilBrown <neilb@suse.de> # nfsd_localio_operations
Suggested-by: Jeff Layton <jlayton@kernel.org> # nfsd_file_file
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
2a33a85be4 nfs_common: add NFS LOCALIO auxiliary protocol enablement
fs/nfs_common/nfslocalio.c provides interfaces that enable an NFS
client to generate a nonce (single-use UUID) and associated nfs_uuid_t
struct, register it with nfs_common for subsequent lookup and
verification by the NFS server and if matched the NFS server populates
members in the nfs_uuid_t struct.

nfs_common's nfs_uuids list is the basis for localio enablement, as
such it has members that point to nfsd memory for direct use by the
client (e.g. 'net' is the server's network namespace, through it the
client can access nn->nfsd_serv).

This commit also provides the base nfs_uuid_t interfaces to allow
proper net namespace refcounting for the LOCALIO use case.

CONFIG_NFS_LOCALIO controls the nfs_common, NFS server and NFS client
enablement for LOCALIO. If both NFS_FS=m and NFSD=m then
NFS_COMMON_LOCALIO_SUPPORT=m and nfs_localio.ko is built (and provides
nfs_common's LOCALIO support).

  # lsmod | grep nfs_localio
  nfs_localio            12288  2 nfsd,nfs
  sunrpc                745472  35 nfs_localio,nfsd,auth_rpcgss,lockd,nfsv3,nfs

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Co-developed-by: NeilBrown <neilb@suse.de>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
NeilBrown
86ab08beb3 SUNRPC: replace program list with program array
A service created with svc_create_pooled() can be given a linked list of
programs and all of these will be served.

Using a linked list makes it cumbersome when there are several programs
that can be optionally selected with CONFIG settings.

After this patch is applied, API consumers must use only
svc_create_pooled() when creating an RPC service that listens for more
than one RPC program.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Weston Andros Adamson
199f212874 SUNRPC: add svcauth_map_clnt_to_svc_cred_local
Add new funtion svcauth_map_clnt_to_svc_cred_local which maps a
generic cred to a svc_cred suitable for use in nfsd.

This is needed by the localio code to map nfs client creds to nfs
server credentials.

Following from net/sunrpc/auth_unix.c:unx_marshal() it is clear that
->fsuid and ->fsgid must be used (rather than ->uid and ->gid).  In
addition, these uid and gid must be translated with from_kuid_munged()
so local client uses correct uid and gid when acting as local server.

Jeff Layton noted:
  This is where the magic happens. Since we're working in
  kuid_t/kgid_t, we don't need to worry about further idmapping.

Suggested-by: NeilBrown <neilb@suse.de> # to approximate unx_marshal()
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
2c8919848d SUNRPC: remove call_allocate() BUG_ONs
Remove BUG_ON if p_arglen=0 to allow RPC with void arg.
Remove BUG_ON if p_replen=0 to allow RPC with void return.

The former was needed for the first revision of the LOCALIO protocol
which had an RPC that took a void arg:

    /* raw RFC 9562 UUID */
    typedef u8 uuid_t<UUID_SIZE>;

    program NFS_LOCALIO_PROGRAM {
        version LOCALIO_V1 {
            void
                NULL(void) = 0;

            uuid_t
                GETUUID(void) = 1;
        } = 1;
    } = 400122;

The latter is needed for the final revision of the LOCALIO protocol
which has a UUID_IS_LOCAL RPC which returns a void:

    /* raw RFC 9562 UUID */
    typedef u8 uuid_t<UUID_SIZE>;

    program NFS_LOCALIO_PROGRAM {
        version LOCALIO_V1 {
            void
                NULL(void) = 0;

            void
                UUID_IS_LOCAL(uuid_t) = 1;
        } = 1;
    } = 400122;

There is really no value in triggering a BUG_ON in response to either
of these previously unsupported conditions.

NeilBrown would like the entire 'if (proc->p_proc != 0)' branch
removed (not just the one BUG_ON that must be removed for LOCALIO's
immediate needs of returning void).

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Mike Snitzer
47e988147f nfsd: add nfsd_serv_try_get and nfsd_serv_put
Introduce nfsd_serv_try_get and nfsd_serv_put and update the nfsd code
to prevent nfsd_destroy_serv from destroying nn->nfsd_serv until any
caller of nfsd_serv_try_get releases their reference using nfsd_serv_put.

A percpu_ref is used to implement the interlock between
nfsd_destroy_serv and any caller of nfsd_serv_try_get.

This interlock is needed to properly wait for the completion of client
initiated localio calls to nfsd (that are _not_ in the context of nfsd).

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
NeilBrown
c63f0e48fe nfsd: add nfsd_file_acquire_local()
nfsd_file_acquire_local() can be used to look up a file by filehandle
without having a struct svc_rqst.  This can be used by NFS LOCALIO to
allow the NFS client to bypass the NFS protocol to directly access a
file provided by the NFS server which is running in the same kernel.

In nfsd_file_do_acquire() care is taken to always use fh_verify() if
rqstp is not NULL (as is the case for non-LOCALIO callers).  Otherwise
the non-LOCALIO callers will not supply the correct and required
arguments to __fh_verify (e.g. gssclient isn't passed).

Introduce fh_verify_local() wrapper around __fh_verify to make it
clear that LOCALIO is intended caller.

Also, use GC for nfsd_file returned by nfsd_file_acquire_local.  GC
offers performance improvements if/when a file is reopened before
launderette cleans it from the filecache's LRU.

Suggested-by: Jeff Layton <jlayton@kernel.org> # use filecache's GC
Signed-off-by: NeilBrown <neilb@suse.de>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
NeilBrown
5e66d2d92a nfsd: factor out __fh_verify to allow NULL rqstp to be passed
__fh_verify() offers an interface like fh_verify() but doesn't require
a struct svc_rqst *, instead it also takes the specific parts as
explicit required arguments.  So it is safe to call __fh_verify() with
a NULL rqstp, but the net, cred, and client args must not be NULL.

__fh_verify() does not use SVC_NET(), nor does the functions it calls.

Rather than using rqstp->rq_client pass the client and gssclient
explicitly to __fh_verify and then to nfsd_set_fh_dentry().

Lastly, it should be noted that the previous commit prepared for 4
associated tracepoints to only be used if rqstp is not NULL (this is a
stop-gap that should be properly fixed so localio also benefits from
the utility these tracepoints provide when debugging fh_verify
issues).

Signed-off-by: NeilBrown <neilb@suse.de>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Chuck Lever
71c61a0077 NFSD: Short-circuit fh_verify tracepoints for LOCALIO
LOCALIO will be able to call fh_verify() with a NULL rqstp. In this
case, the existing trace points need to be skipped because they
want to dereference the address fields in the passed-in rqstp.

Temporarily make these trace points conditional to avoid a seg
fault in this case. Putting the "rqstp != NULL" check in the trace
points themselves makes the check more efficient.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Acked-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
Chuck Lever
7c0b07b49b NFSD: Avoid using rqstp->rq_vers in nfsd_set_fh_dentry()
Currently, fh_verify() makes some daring assumptions about which
version of file handle the caller wants, based on the things it can
find in the passed-in rqstp. The about-to-be-introduced LOCALIO use
case sometimes has no svc_rqst context, so this logic won't work in
that case.

Instead, examine the passed-in file handle. It's .max_size field
should carry information to allow nfsd_set_fh_dentry() to initialize
the file handle appropriately.

The file handle used by lockd and the one created by write_filehandle
never need any of the version-specific fields (which affect things
like write and getattr requests and pre/post attributes).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
NeilBrown
b0d87dbd8b NFSD: Refactor nfsd_setuser_and_check_port()
There are several places where __fh_verify unconditionally dereferences
rqstp to check that the connection is suitably secure.  They look at
rqstp->rq_xprt which is not meaningful in the target use case of
"localio" NFS in which the client talks directly to the local server.

Prepare these to always succeed when rqstp is NULL.

Signed-off-by: NeilBrown <neilb@suse.de>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:30 -04:00
NeilBrown
0a183f24a7 NFSD: Handle @rqstp == NULL in check_nfsd_access()
LOCALIO-initiated open operations are not running in an nfsd thread
and thus do not have an associated svc_rqst context.

Signed-off-by: NeilBrown <neilb@suse.de>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:29 -04:00
Mike Snitzer
1545e488b1 nfs: factor out {encode,decode}_opaque_fixed to nfs_xdr.h
Eliminates duplicate functions in various files to allow for
additional callers.

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:29 -04:00
Mike Snitzer
1fcb16674e nfs_common: factor out nfs4_errtbl and nfs4_stat_to_errno
Common nfs4_stat_to_errno() is used by fs/nfs/nfs4xdr.c and will be
used by fs/nfs/localio.c

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:29 -04:00
Mike Snitzer
4806ded4c1 nfs_common: factor out nfs_errtbl and nfs_stat_to_errno
Common nfs_stat_to_errno() is used by both fs/nfs/nfs2xdr.c and
fs/nfs/nfs3xdr.c

Will also be used by fs/nfsd/localio.c

Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:29 -04:00
Dan Aloni
dfb07e990a nfs: add 'noalignwrite' option for lock-less 'lost writes' prevention
There are some applications that write to predefined non-overlapping
file offsets from multiple clients and therefore don't need to rely on
file locking. However, if these applications want non-aligned offsets
and sizes they need to either use locks or risk data corruption, as the
NFS client defaults to extending writes to whole pages.

This commit adds a new mount option `noalignwrite`, which allows to turn
that off and avoid the need of locking, as long as these applications
don't overlap on offsets.

Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Li Lingfeng
6d26c5e4d8 nfs: fix the comment of nfs_get_root
The comment for nfs_get_root() needs to be updated as it would also be
used by NFS4 as follows:
@x[
    nfs_get_root+1
    nfs_get_tree_common+1819
    nfs_get_tree+2594
    vfs_get_tree+73
    fc_mount+23
    do_nfs4_mount+498
    nfs4_try_get_tree+134
    nfs_get_tree+2562
    vfs_get_tree+73
    path_mount+2776
    do_mount+226
    __se_sys_mount+343
    __x64_sys_mount+106
    do_syscall_64+69
    entry_SYSCALL_64_after_hwframe+97
, mount.nfs4]: 1

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Roi Azarzar
615e693b14 NFSv4.2: Fix detection of "Proxying of Times" server support
According to draft-ietf-nfsv4-delstid-07:
   If a server informs the client via the fattr4_open_arguments
   attribute that it supports
   OPEN_ARGS_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS and it returns a valid
   delegation stateid for an OPEN operation which sets the
   OPEN4_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS flag, then it MUST query the
   client via a CB_GETATTR for the fattr4_time_deleg_access (see
   Section 5.2) attribute and fattr4_time_deleg_modify attribute (see
   Section 5.2).

Thus, we should look that the server supports proxying of times via
OPEN4_SHARE_ACCESS_WANT_DELEG_TIMESTAMPS.

We want to be extra pedantic and continue to check that FATTR4_TIME_DELEG_ACCESS
and FATTR4_TIME_DELEG_MODIFY are set. The server needs to expose both for the
client to correctly detect "Proxying of Times" support.

Signed-off-by: Roi Azarzar <roi.azarzar@vastdata.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Fixes: dcb3c20f74 ("NFSv4: Add a capability for delegated attributes")
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Trond Myklebust
af94dca79b NFSv4: Fail mounts if the lease setup times out
If the server is down when the client is trying to mount, so that the
calls to exchange_id or create_session fail, then we should allow the
mount system call to fail rather than hang and block other mount/umount
calls.

Reported-by: Oleksandr Tymoshenko <ovt@google.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Zhaoyang Huang
03e02b9417 fs: nfs: fix missing refcnt by replacing folio_set_private by folio_attach_private
This patch is inspired by a code review of fs codes which aims at
folio's extra refcnt that could introduce unwanted behavious when
judging refcnt, such as[1].That is, the folio passed to
mapping_evict_folio carries the refcnts from find_lock_entries,
page_cache, corresponding to PTEs and folio's private if has. However,
current code doesn't take the refcnt for folio's private which could
have mapping_evict_folio miss the one to only PTE and lead to
call filemap_release_folio wrongly.

[1]
long mapping_evict_folio(struct address_space *mapping, struct folio *folio)
{
...
//current code will misjudge here if there is one pte on the folio which
is be deemed as the one as folio's private
        if (folio_ref_count(folio) >
                        folio_nr_pages(folio) + folio_has_private(folio) + 1)
                return 0;
        if (!filemap_release_folio(folio, 0))
                return 0;

        return remove_mapping(mapping, folio);
}

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Gaosheng Cui
40c80881eb nfs: Remove obsoleted declaration for nfs_read_prepare
The nfs_read_prepare() have been removed since
commit a4cdda5911 ("NFS: Create a common pgio_rpc_prepare function"),
and now it is useless, so remove it.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Hongbo Li
64a3ab9967 net/sunrpc: make use of the helper macro LIST_HEAD()
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Siddh Raman Pant
2e001972e8 SUNRPC: clnt.c: Remove misleading comment
destroy_wait doesn't store all RPC clients. There was a list named
"all_clients" above it, which got moved to struct sunrpc_net in 2012,
but the comment was never removed.

Fixes: 70abc49b4f ("SUNRPC: make SUNPRC clients list per network namespace context")
Signed-off-by: Siddh Raman Pant <siddh.raman.pant@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Stephen Brennan
0b108e8379 SUNRPC: convert RPC_TASK_* constants to enum
The RPC_TASK_* constants are defined as macros, which means that most
kernel builds will not contain their definitions in the debuginfo.
However, it's quite useful for debuggers to be able to view the task
state constant and interpret it correctly. Conversion to an enum will
ensure the constants are present in debuginfo and can be interpreted by
debuggers without needing to hard-code them and track their changes.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Kunwu Chan
9090a7f786 SUNRPC: Fix -Wformat-truncation warning
Increase size of the servername array to avoid truncated output warning.

net/sunrpc/clnt.c:582:75: error:‘%s’ directive output may be truncated
writing up to 107 bytes into a region of size 48
[-Werror=format-truncation=]
  582 |                   snprintf(servername, sizeof(servername), "%s",
      |                                                             ^~

net/sunrpc/clnt.c:582:33: note:‘snprintf’ output
between 1 and 108 bytes into a destination of size 48
  582 |                     snprintf(servername, sizeof(servername), "%s",
      |                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  583 |                                          sun->sun_path);

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Suggested-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:13 -04:00
Thorsten Blum
e343678ee9 nfs: Remove unnecessary NULL check before kfree()
Since kfree() already checks if its argument is NULL, an additional
check before calling kfree() is unnecessary and can be removed.

Remove it and thus also the following Coccinelle/coccicheck warning
reported by ifnullfree.cocci:

  WARNING: NULL check before some freeing functions is not needed

Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:12 -04:00
Thorsten Blum
bb8e4ce500 nfs: Annotate struct nfs_cache_array with __counted_by()
Add the __counted_by compiler attribute to the flexible array member
array to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.

Increment size before adding a new struct to the array.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:12 -04:00
NeilBrown
d98f722725 nfs: simplify and guarantee owner uniqueness.
I have evidence of an Linux NFS client getting NFS4ERR_BAD_SEQID to a
v4.0 LOCK request to a Linux server (which had fixed the problem with
RELEASE_LOCKOWNER bug fixed).
The LOCK request presented a "new" lock owner so there are two seq ids
in the request: that for the open file, and that for the new lock.
Given the context I am confident that the new lock owner was reported to
have the wrong seqid.  As lock owner identifiers are reused, the server
must still have a lock owner active which the client thinks is no longer
active.

I wasn't able to determine a root-cause but the simplest fix seems to be
to ensure lock owners are always unique much as open owners are (thanks
to a time stamp).  The easiest way to ensure uniqueness is with a 64bit
counter for each server.  That will never cycle (if updated once a
nanosecond the last 584 years.  A single NFS server would not handle
open/lock requests nearly that fast, and a Linux node is unlikely to
have an uptime approaching that).

This patch removes the 2 ida and instead uses a per-server
atomic64_t to provide uniqueness.

Note that the lock owner already encodes the id as 64 bits even though
it is a 32bit value.  So changing to a 64bit value does not change the
encoding of the lock owner.  The open owner encoding is now 4 bytes
larger.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:12 -04:00
Li Lingfeng
8f6a7c9467 nfs: fix memory leak in error path of nfs4_do_reclaim
Commit c77e22834a ("NFSv4: Fix a potential sleep while atomic in
nfs4_do_reclaim()") separate out the freeing of the state owners from
nfs4_purge_state_owners() and finish it outside the rcu lock.
However, the error path is omitted. As a result, the state owners in
"freeme" will not be released.
Fix it by adding freeing in the error path.

Fixes: c77e22834a ("NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()")
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Cc: stable@vger.kernel.org # v5.3+
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23 15:03:12 -04:00
Linus Torvalds
18ba603446 NFSD 6.12 Release Notes
Notable features of this release include:
 
 - Pre-requisites for automatically determining the RPC server thread
   count
 - Clean-up and preparation for supporting LOCALIO, which will be
   merged via the NFS client tree
 - Enhancements and fixes to NFSv4.2 COPY offload
 - A new Python-based tool for generating kernel SunRPC XDR encoding
   and decoding functions, added as an aid for prototyping features
   in protocols based on the Linux kernel's SunRPC implementation.
 
 As always I am grateful to the NFSD contributors, reviewers,
 testers, and bug reporters who participated during this cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmbxg60ACgkQM2qzM29m
 f5d+9A/+LiXAjR3x1vlbGFiMAW3Alixg5wE6AM7M1I/OH/dBCkWU1gzneWYaUXAk
 cIGp5sH2Uco2mFVswOZyQ3tX8T/2PeY+Kx5qrlK5h0bTUoz95AIyLe3LA/4o4CIL
 qMGlLQyVq9UolggPoRdigsDhKVwLcu3hWaG7ykkTquyrOPLBKgzRNSwVKLpFc/0/
 mQToOf6HLjgFkEUR3pmXAMsVq88/BpjHIXeNhx2Z1ekWslSKjrAu2gC0rc6/s9Wi
 JsTtzSdnqefc2jsNVZ8FT+V7mDF1sxrN4SnHruSLhJsd5tL/3HDkiZEvdG2Sh0nH
 zQlDpMpNbZyCvaWs6jgaZeMRiNSSl7q31zXUgX2bkWpL/EnagujZHtLZroUgLQfA
 BO8HhRqdt1wJohiv2aMlFvnlp+GhSH5FdcXv1cT/CmyTNGqbXENqoCUA1OT9kE55
 RvXVCLD4YbmCb5EpjLavhu/NuFOc9l9GitKlhiJlcX86QAu/C1Bu1DOyqgq5G0VW
 Xl/q7xIvNZz0mh7x8kKVV4bQHsm9pnoNz57CZFPahoHg/+BR4u6p8LepowpaHjHj
 Ef62BzYwQtuw0jCyufDea+uCt5CGwUM3Y5iBiQogtnvFK6ie8WwD0QTI2SYpcWZ/
 T6RwDOX5jlMWWmuibSK2STgwkblG3vAmMot0RtEbZILvB/ld9qw=
 =Ybsc
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "Notable features of this release include:

   - Pre-requisites for automatically determining the RPC server thread
     count

   - Clean-up and preparation for supporting LOCALIO, which will be
     merged via the NFS client tree

   - Enhancements and fixes to NFSv4.2 COPY offload

   - A new Python-based tool for generating kernel SunRPC XDR encoding
     and decoding functions, added as an aid for prototyping features in
     protocols based on the Linux kernel's SunRPC implementation

  As always I am grateful to the NFSD contributors, reviewers, testers,
  and bug reporters who participated during this cycle"

* tag 'nfsd-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (57 commits)
  xdrgen: Prevent reordering of encoder and decoder functions
  xdrgen: typedefs should use the built-in string and opaque functions
  xdrgen: Fix return code checking in built-in XDR decoders
  tools: Add xdrgen
  nfsd: fix delegation_blocked() to block correctly for at least 30 seconds
  nfsd: fix initial getattr on write delegation
  nfsd: untangle code in nfsd4_deleg_getattr_conflict()
  nfsd: enforce upper limit for namelen in __cld_pipe_inprogress_downcall()
  nfsd: return -EINVAL when namelen is 0
  NFSD: Wrap async copy operations with trace points
  NFSD: Clean up extra whitespace in trace_nfsd_copy_done
  NFSD: Record the callback stateid in copy tracepoints
  NFSD: Display copy stateids with conventional print formatting
  NFSD: Limit the number of concurrent async COPY operations
  NFSD: Async COPY result needs to return a write verifier
  nfsd: avoid races with wake_up_var()
  nfsd: use clear_and_wake_up_bit()
  sunrpc: xprtrdma: Use ERR_CAST() to return
  NFSD: Annotate struct pnfs_block_deviceaddr with __counted_by()
  nfsd: call cache_put if xdr_reserve_space returns NULL
  ...
2024-09-23 12:01:45 -07:00
Anna Schumaker
8c04a6d6e0 NFSD 6.12 Release Notes
Notable features of this release include:
 
 - Pre-requisites for automatically determining the RPC server thread
   count
 - Clean-up and preparation for supporting LOCALIO, which will be
   merged via the NFS client tree
 - Enhancements and fixes to NFSv4.2 COPY offload
 - A new Python-based tool for generating kernel SunRPC XDR encoding
   and decoding functions, added as an aid for prototyping features
   in protocols based on the Linux kernel's SunRPC implementation.
 
 As always I am grateful to the NFSD contributors, reviewers,
 testers, and bug reporters who participated during this cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmbxg60ACgkQM2qzM29m
 f5d+9A/+LiXAjR3x1vlbGFiMAW3Alixg5wE6AM7M1I/OH/dBCkWU1gzneWYaUXAk
 cIGp5sH2Uco2mFVswOZyQ3tX8T/2PeY+Kx5qrlK5h0bTUoz95AIyLe3LA/4o4CIL
 qMGlLQyVq9UolggPoRdigsDhKVwLcu3hWaG7ykkTquyrOPLBKgzRNSwVKLpFc/0/
 mQToOf6HLjgFkEUR3pmXAMsVq88/BpjHIXeNhx2Z1ekWslSKjrAu2gC0rc6/s9Wi
 JsTtzSdnqefc2jsNVZ8FT+V7mDF1sxrN4SnHruSLhJsd5tL/3HDkiZEvdG2Sh0nH
 zQlDpMpNbZyCvaWs6jgaZeMRiNSSl7q31zXUgX2bkWpL/EnagujZHtLZroUgLQfA
 BO8HhRqdt1wJohiv2aMlFvnlp+GhSH5FdcXv1cT/CmyTNGqbXENqoCUA1OT9kE55
 RvXVCLD4YbmCb5EpjLavhu/NuFOc9l9GitKlhiJlcX86QAu/C1Bu1DOyqgq5G0VW
 Xl/q7xIvNZz0mh7x8kKVV4bQHsm9pnoNz57CZFPahoHg/+BR4u6p8LepowpaHjHj
 Ef62BzYwQtuw0jCyufDea+uCt5CGwUM3Y5iBiQogtnvFK6ie8WwD0QTI2SYpcWZ/
 T6RwDOX5jlMWWmuibSK2STgwkblG3vAmMot0RtEbZILvB/ld9qw=
 =Ybsc
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.12' into linux-next-with-localio

NFSD 6.12 Release Notes

Notable features of this release include:

- Pre-requisites for automatically determining the RPC server thread
  count
- Clean-up and preparation for supporting LOCALIO, which will be
  merged via the NFS client tree
- Enhancements and fixes to NFSv4.2 COPY offload
- A new Python-based tool for generating kernel SunRPC XDR encoding
  and decoding functions, added as an aid for prototyping features
  in protocols based on the Linux kernel's SunRPC implementation.

As always I am grateful to the NFSD contributors, reviewers,
testers, and bug reporters who participated during this cycle.
2024-09-23 15:00:07 -04:00
Linus Torvalds
721068dec4 gfs2 changes
- Eliminate the writepage address space operation (by Matthew Wilcox).
 
 - A syzkaller fix (by Julian Sun) and a minor cleanup (by Andreas
   Gruenbacher).
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmbwIfMUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqj/RAApBJf8Da7U9Qn3rMtCdV3HU8p9VSF
 Fwzr1qVgthNNfFGSaNhTQcTglSL0OLMg8rLaAFN8G0RJrOyJw9Q0GclhLR+wGy5F
 KpvpVczkYh+UphYJ7LhygUE9/cP2GZCSI5EbI+cIifnioY8rztt+e8QqteFpAoFA
 3asfynVWeAnYYh6qIjZBSvi77KWzaxk9Kyv8WScJ//FTNHYYXLMKUhZEXuh0gCdv
 r9VwCnRNTm8X9YtbeaRduC7mSRcVwtV+KCCJ0Lxw292a795g5oWvdwnx4DQg5120
 XvdcGkZOV/sjQE19vhfkJot6kLPhP9PofTWBIbwqMwV/lkxEKlQAPJ5+mnjg56eU
 JekZjibxEewSDkG4riWibBP2WgIfqHkryl8o9a4dZSIjeNfzJYbvhyL2NwPYfFOY
 43VeC6AB6K49gwUtD0gRTOKV/EKEFswRh1Cstvt+hPrpnF7Y/ZUjCptaEk/ZicyK
 qWG/6YVtjXz9HvOo/eojTTlZuwlmh2l63aWwW9s8/aMCei6V5Hs/S3gtuZSg7bKL
 AtfUmwlNLfNSh4VavO6W3SxuCp12OVwPYe0WQPccMshA6fVbsH1QV8LSUS5WKaE7
 TGsevEWEanAm85zlbar2BvIk1PaxRNvNJ7BgZ1LYUwdHvm9/h+JPMZUYqBUDYbZC
 Fz5wF5zjbmeR91k=
 =cuQP
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v6.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 update from Andreas Gruenbacher:

 - Convert the writepage address space operation to writepages (Matthew
   Wilcox)

 - A syzkaller fix (by Julian Sun) and a minor cleanup (Andreas
   Gruenbacher)

* tag 'gfs2-v6.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Remove gfs2_aspace_writepage()
  gfs2: Remove gfs2_jdata_writepage()
  gfs2: Remove __gfs2_writepage()
  gfs2: Add gfs2_aspace_writepages()
  gfs2: fix double destroy_workqueue error
  gfs2: Minor gfs2_glock_cb cleanup
2024-09-23 11:55:17 -07:00