Commit Graph

134 Commits

Author SHA1 Message Date
Peter Zijlstra
cdd30ebb1b module: Convert symbol namespace to string literal
Clean up the existing export namespace code along the same lines of
commit 33def8498f ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason, it is not desired for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &&
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &&
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done

Requested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-02 11:34:44 -08:00
Dan Williams
3a2b97b321 cxl/test: Improve init-order fidelity relative to real-world systems
The investigation of an initialization failure [1] highlighted that
cxl_test does not reflect the init-order of real world systems. The
expected order is root/bus first then async probing of the memory
devices.

Fix up cxl_test to reflect that order. While it did not reproduce the
initial bug report (since that is dependent on built-in vs modular
builds), it did reveal a separate latent bug in the subsystem's decoder
shutdown flow. Fix for that sent separately.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/172964784521.81806.15791069994065969243.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25 16:07:04 -05:00
Dan Williams
101c268bd2 cxl/port: Fix use-after-free, permit out-of-order decoder shutdown
In support of investigating an initialization failure report [1],
cxl_test was updated to register mock memory-devices after the mock
root-port/bus device had been registered. That led to cxl_test crashing
with a use-after-free bug with the following signature:

    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem0:decoder7.0 @ 0 next: cxl_switch_uport.0 nr_eps: 1 nr_targets: 1
    cxl_port_attach_region: cxl region3: cxl_host_bridge.0:port3 decoder3.0 add: mem4:decoder14.0 @ 1 next: cxl_switch_uport.0 nr_eps: 2 nr_targets: 1
    cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[0] = cxl_switch_dport.0 for mem0:decoder7.0 @ 0
1)  cxl_port_setup_targets: cxl region3: cxl_switch_uport.0:port6 target[1] = cxl_switch_dport.4 for mem4:decoder14.0 @ 1
    [..]
    cxld_unregister: cxl decoder14.0:
    cxl_region_decode_reset: cxl_region region3:
    mock_decoder_reset: cxl_port port3: decoder3.0 reset
2)  mock_decoder_reset: cxl_port port3: decoder3.0: out of order reset, expected decoder3.1
    cxl_endpoint_decoder_release: cxl decoder14.0:
    [..]
    cxld_unregister: cxl decoder7.0:
3)  cxl_region_decode_reset: cxl_region region3:
    Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6bc3: 0000 [#1] PREEMPT SMP PTI
    [..]
    RIP: 0010:to_cxl_port+0x8/0x60 [cxl_core]
    [..]
    Call Trace:
     <TASK>
     cxl_region_decode_reset+0x69/0x190 [cxl_core]
     cxl_region_detach+0xe8/0x210 [cxl_core]
     cxl_decoder_kill_region+0x27/0x40 [cxl_core]
     cxld_unregister+0x5d/0x60 [cxl_core]

At 1) a region has been established with 2 endpoint decoders (7.0 and
14.0). Those endpoints share a common switch-decoder in the topology
(3.0). At teardown, 2), decoder14.0 is the first to be removed and hits
the "out of order reset case" in the switch decoder. The effect though
is that region3 cleanup is aborted leaving it in-tact and
referencing decoder14.0. At 3) the second attempt to teardown region3
trips over the stale decoder14.0 object which has long since been
deleted.

The fix here is to recognize that the CXL specification places no
mandate on in-order shutdown of switch-decoders, the driver enforces
in-order allocation, and hardware enforces in-order commit. So, rather
than fail and leave objects dangling, always remove them.

In support of making cxl_region_decode_reset() always succeed,
cxl_region_invalidate_memregion() failures are turned into warnings.
Crashing the kernel is ok there since system integrity is at risk if
caches cannot be managed around physical address mutation events like
CXL region destruction.

A new device_for_each_child_reverse_from() is added to cleanup
port->commit_end after all dependent decoders have been disabled. In
other words if decoders are allocated 0->1->2 and disabled 1->2->0 then
port->commit_end only decrements from 2 after 2 has been disabled, and
it decrements all the way to zero since 1 was disabled previously.

Link: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net [1]
Cc: stable@vger.kernel.org
Fixes: 176baefb2e ("cxl/hdm: Commit decoder state to hardware")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Zijun Hu <quic_zijuhu@quicinc.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/172964782781.81806.17902885593105284330.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
2024-10-25 16:07:03 -05:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Dave Jiang
8d8081cecf cxl: Move mailbox related bits to the same context
Create a new 'struct cxl_mailbox' and move all mailbox related bits to
it. This allows isolation of all CXL mailbox data in order to export
some of the calls to external kernel callers and avoid exporting of CXL
driver specific bits such has device states. The allocation of
'struct cxl_mailbox' is also split out with cxl_mailbox_init() so the
mailbox can be created independently.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20240905223711.1990186-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-12 08:38:01 -07:00
Yanfei Xu
5c6e3d5a5d cxl/pci: Remove duplicated implementation of waiting for memory_info_valid
commit ce17ad0d54 ("cxl: Wait Memory_Info_Valid before access memory
related info") added another implementation, which is
cxl_dvsec_mem_range_valid(), of waiting for memory_info_valid without
realizing it duplicated wait_for_valid(). Remove wait_for_valid() and
retain cxl_dvsec_mem_range_valid() as the former is hardcoded to check
only the Memory_Info_Valid bit of DVSEC range 1, while the latter allows
for selection between DVSEC range 1 or 2 via parameter.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Yanfei Xu <yanfei.xu@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20240828084231.1378789-3-yanfei.xu@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-09 11:33:44 -07:00
Li Ming
577a67662f cxl/pci: Rename cxl_setup_parent_dport() and cxl_dport_map_regs()
The name of cxl_setup_parent_dport() function is not clear, the function
is used to initialize AER and RAS capabilities on a dport, therefore,
rename the function to cxl_dport_init_ras_reporting(), it is easier for
user to understand what the function does. Besides, adjust the order of
the function parameters, the subject of cxl_dport_init_ras_reporting()
is a cxl dport, so a struct cxl_dport as the first parameter of the
function should be better.

cxl_dport_map_regs() is used to map CXL RAS capability on a cxl dport,
using cxl_dport_map_ras() as the function name.

Signed-off-by: Li Ming <ming4.li@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240830061308.2327065-1-ming4.li@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03 15:29:33 -07:00
Kunwu Chan
1c1d348b08 tools/testing/cxl: Use dev_is_platform()
Use dev_is_platform() instead of checking bus type directly.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240827095123.168696-1-kunwu.chan@linux.dev
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-09-03 14:40:41 -07:00
Li Ming
2c402bd2e8 cxl/test: Skip cxl_setup_parent_dport() for emulated dports
The cxl_test unit test environment on qemu always hits below call trace
with KASAN enabled:

 BUG: KASAN: slab-out-of-bounds in cxl_setup_parent_dport+0x480/0x530 [cxl_core]
 Read of size 1 at addr ff110000676014f8 by task (udev-worker)/676[   24.424403] CPU: 2 PID: 676 Comm: (udev-worker) Tainted: G           O     N 6.10.0-qemucxl #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20240214-2.el9 02/14/2024
 Call Trace:
  <TASK>
  dump_stack_lvl+0xea/0x150
  print_report+0xce/0x610
  ? kasan_complete_mode_report_info+0x40/0x200
  kasan_report+0xcc/0x110
  __asan_report_load1_noabort+0x18/0x20
  cxl_setup_parent_dport+0x480/0x530 [cxl_core]
  cxl_mem_probe+0x49b/0xaa0 [cxl_mem]

cxl_test module models a CXL topology for testing, it creates some
emulated dports with platform devices in the CXL topology, so the
dport_dev of an emulated dport points to a platform device rather than a
pci device or a pci host bridge in the case. Currently,
cxl_setup_parent_dport() is used to set up RAS and AER capability on the
dport connected to the CXL memory device, but cxl_test does not support
RAS or AER functionality yet, so the fix is implementing a
__wrap_cxl_setup_parent_dport() to filter out all emulated dports,
guarantees only real dports can be handled by cxl_setup_parent_dport().

Fixes: f05fd10d13 ("cxl/pci: Add RCH downstream port AER register discovery")
Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Closes: https://lore.kernel.org/linux-cxl/ZrHTBp2O+HtUe6kt@xpf.sh.intel.com/T/#t
Signed-off-by: Li Ming <ming4.li@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20240809082750.3015641-3-ming4.li@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-08-09 15:14:12 -07:00
Linus Torvalds
e62f81bbd2 CXL for v6.11 merge window
New Changes:
 - Refactor to a common struct for DRAM and general media CXL events
 - Add abstract distance calculation support for CXL
 - Add CXL maturity map documentation to detail current state of CXL enabling
 - Add warning on mixed CXL VH and RCH/RCD hierachy to inform unsupported config
 - Replace ENXIO with EBUSY for inject poison limit reached via debugfs
 - Replace ENXIO with EBUSY for inject poison cxl-test support
 - XOR math fixup for DPA to SPA translation. Current math works for MODULO arithmetic
   where HPA==SPA, however not for XOR decode.
 - Move pci config read in cxl_dvsec_rr_decode() to avoid unnecessary acess
 
 Fixes:
 - Add a fix to address race condition in CXL memory hotplug notifier
 - Add missing MODULE_DESCRIPTION() for CXL modules
 - Fix incorrect vendor debug UUID define
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmahJiMACgkQYGjFFmlT
 OEq8URAArcnzmH9yLvgE2pFOtaKg34vIGDWZGC4R1LpTnFEea04FuJslmxEKNgWo
 DJgPt9VZ66ump/oIvzcbvgLl/yMCTbnSxt5U6J8G5EmpO50PvxOTeWnEgAYVa0NH
 Diuzk/aF4GA94T3w+iAOzYx2N36kF+ezsY3/kqSORT7MC+DipSSUaPUiJcjr6FC6
 /ZIwkhhRi51ONJ8IgaXD+oEU9kxx7WUEyZoQZrJ9bv8/fGbeEfqy04pz2xDKHmLD
 rlQjm3l9um67VMsCvZ62Ce14HXqM213jZ3l0FmYjO4GbdXd2+0ZmIRNAb5vvTG9n
 5cY8vNsL6fND9FKkxlcRSdzI/O/vV+gcU+jzJxiul0p5fWHh/gaYjVH7fFq3dYc+
 vYE5lr97BfyA61bdmylIc2xwDH4yNKVQLZZPVTz5XTxfzBjYCjLPb5vGQKfg/nrB
 N66wjCIWLfCH6DqusUXem1c6BSrrjob8MwXpg00eBE0AA4ihieiy5fxuApnv9mI2
 f809AXRV1k24s5upStZ9iGZSEILBBqiw/KwDyWfRvxjNz36Z1Q2eiXBwbHrVQHBa
 PFtRPPFsZ9+ouIG/8otFaLwDQdITRdA0+drG8lmJ+gs8239Z3eIMMS0+CYdLDbva
 S8vo4POOQSS+cVUjLkC9zIxwPaXq96TLIkCtiLI9xUx5eIzv4K0=
 =HaEG
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:
 "Core:

   - A CXL maturity map has been added to the documentation to detail
     the current state of CXL enabling.

     It provides the status of the current state of various CXL features
     to inform current and future contributors of where things are and
     which areas need contribution.

   - A notifier handler has been added in order for a newly created CXL
     memory region to trigger the abstract distance metrics calculation.

     This should bring parity for CXL memory to the same level vs
     hotplugged DRAM for NUMA abstract distance calculation. The
     abstract distance reflects relative performance used for memory
     tiering handling.

   - An addition for XOR math has been added to address the CXL DPA to
     SPA translation.

     CXL address translation did not support address interleave math
     with XOR prior to this change.

  Fixes:

   - Fix to address race condition in the CXL memory hotplug notifier

   - Add missing MODULE_DESCRIPTION() for CXL modules

   - Fix incorrect vendor debug UUID define

  Misc:

   - A warning has been added to inform users of an unsupported
     configuration when mixing CXL VH and RCH/RCD hierarchies

   - The ENXIO error code has been replaced with EBUSY for inject poison
     limit reached via debugfs and cxl-test support

   - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
     unnecessary PCI config reads

   - A refactor to a common struct for DRAM and general media CXL
     events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
2024-07-28 09:33:28 -07:00
Alison Schofield
3a8617c7df cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
The CXL driver was recently updated to return EBUSY rather than
ENXIO when the device reports that an injection request exceeds
the device's limit. That change to EBUSY allows debug users to
differentiate between limit reached and inject failures for any
other reason.

Change cxl-test to also return EBUSY and tidy up the dev_dbg()
messaging to emit the correct limit.

Reminder: the cxl-test per device injection limit is a configurable
attribute: /sys/bus/platform/drivers/cxl_mock_mem/poison_inject_max

Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/ba1b80e1658b644d85d0d5e2287112d00a48b9cf.1720316188.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-10 17:12:42 -07:00
Fabio M. De Francesco
675e979db4 cxl/events: Use a common struct for DRAM and General Media events
cxl_event_common was an unfortunate naming choice and caused confusion with
the existing Common Event Record. Furthermore, its fields didn't map all
the common information between DRAM and General Media Events.

Remove cxl_event_common and introduce cxl_event_media_hdr to record common
information between DRAM and General Media events.

cxl_event_media_hdr, which is embedded in both cxl_event_gen_media and
cxl_event_dram, leverages the commonalities between the two events to
simplify their respective handling.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240607144423.48681-1-fabio.m.de.francesco@linux.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-02 12:52:25 -07:00
Yao Xingtao
84328c5ace cxl/region: check interleave capability
Since interleave capability is not verified, if the interleave
capability of a target does not match the region need, committing decoder
should have failed at the device end.

In order to checkout this error as quickly as possible, driver needs
to check the interleave capability of target during attaching it to
region.

Per CXL specification r3.1(8.2.4.20.1 CXL HDM Decoder Capability Register),
bits 11 and 12 indicate the capability to establish interleaving in 3, 6,
12 and 16 ways. If these bits are not set, the target cannot be attached to
a region utilizing such interleave ways.

Additionally, bits 8 and 9 represent the capability of the bits used for
interleaving in the address, Linux tracks this in the cxl_port
interleave_mask.

Per CXL specification r3.1(8.2.4.20.13 Decoder Protection):
  eIW means encoded Interleave Ways.
  eIG means encoded Interleave Granularity.

  in HPA:
  if eIW is 0 or 8 (interleave ways: 1, 3), all the bits of HPA are used,
  the interleave bits are none, the following check is ignored.

  if eIW is less than 8 (interleave ways: 2, 4, 8, 16), the interleave bits
  start at bit position eIG + 8 and end at eIG + eIW + 8 - 1.

  if eIW is greater than 8 (interleave ways: 6, 12), the interleave bits
  start at bit position eIG + 8 and end at eIG + eIW - 1.

  if the interleave mask is insufficient to cover the required interleave
  bits, the target cannot be attached to the region.

Fixes: 384e624bb2 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240614084755.59503-2-yaoxt.fnst@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-06-25 14:45:27 -07:00
Dave Jiang
d555105271 cxl/test: Add missing vmalloc.h for tools/testing/cxl/test/mem.c
tools/testing/cxl/test/mem.c uses vmalloc() and vfree() but does not
include linux/vmalloc.h. Kernel v6.10 made changes that causes the
currently included headers not depend on vmalloc.h and therefore
mem.c can no longer compile. Add linux/vmalloc.h to fix compile
issue.

  CC [M]  tools/testing/cxl/test/mem.o
tools/testing/cxl/test/mem.c: In function ‘label_area_release’:
tools/testing/cxl/test/mem.c:1428:9: error: implicit declaration of function ‘vfree’; did you mean ‘kvfree’? [-Werror=implicit-function-declaration]
 1428 |         vfree(lsa);
      |         ^~~~~
      |         kvfree
tools/testing/cxl/test/mem.c: In function ‘cxl_mock_mem_probe’:
tools/testing/cxl/test/mem.c:1466:22: error: implicit declaration of function ‘vmalloc’; did you mean ‘kmalloc’? [-Werror=implicit-function-declaration]
 1466 |         mdata->lsa = vmalloc(LSA_SIZE);
      |                      ^~~~~~~
      |                      kmalloc

Fixes: 7d3eb23c4c ("tools/testing/cxl: Introduce a mock memory device + driver")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/20240528225551.1025977-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-05-28 16:08:04 -07:00
Linus Torvalds
2e9250022e CXL changes for v6.10 merge window
Topics:
 - Add CXL log related mailbox commands
   - Add Get Log Capabilities command
   - Add Get Supported Log Sub-List Commands command
   - Add Clear Log command
 - Add series for HPA to DPA translation for CXL events cxl_dram and cxl_general_media
 - Add support to send CPER records to CXL for more detailed parsing.
 
 Misc changes and fixes:
 - Fix for compile warning of cxl_security_ops
 - Add debug message for invalid interleave granularity
 - Enhancement to cxl-test event testing
 - Add dev_warn() on unsupported mixed mode decoder
 - Fix use of phys_to_target_node() for x86
 - Use helper function for decoder enum instead of open coding
 - Include missing headers for cxl-event
 - Fix MAINTAINERS file entry
 - Fix cxlr_pmem memory leak
 - Cleanup __cxl_parse_cfmws via scope-based resource menagement
 - Convert cxl_pmem_region_alloc() to scope-based resource management
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmZE5YcACgkQYGjFFmlT
 OEo5ww/+KaerkOcC+B11R7+HHf4yfnbbOzBmBFIrRncWtaSoObq4CJyvjmG3cWL8
 bP+SQ4MazruyPgW0MFUBBeTvJTXUrXDbv0vcCLLRPgcHAP/PROMh7u8LueCFXo/T
 vd0II5OtmBm8aub3Ty5nNNdkgo1/rRmtlCLnBUD5q/om11ZbXWaEZS1ivBC9Dzlf
 qVRe4EYYEldrQWeqRI8LNZIstNtYF/ifzN0T1DQ4ndPuc/9r+s2YLyohcVn1tCZx
 mqeNj5wBldiLPe0QL8perqDbgJzZj8JX9rXgdj+4xUQcnoCBLMxiSn5XFmbIKgzf
 c0Y9LeNR3Y3Ivy57Qd/dakeI6yhz9F0J8MR0ifEyHcmFNCXT13mikN0OoAuSeVRT
 pkGuK/V8D3eF8a3wWn6CT4mJGdGDC8v2DDH2WK75+JrNAfgi4+V6GYUu1bHZMofY
 C0BEolMBZQMmWtR+CYOYm8UAHSRfDyQg55JFaxbbQkc/PekY8TMVR4klUA/+54az
 VL3RONeBxAYUksEoN7vVDhyAGCe1uiW3NyZOlLyRVqtB+X8ZFdh5eXxzhxxBaLKo
 W1M7xo1nDlc/SAqIFOfU9mt99krVJyBi12IuFeTKG9tgO2nYhoazbN9Q4W3wbyEj
 KU84YPFpuwEA9fIyRd8n7vDfe3QbMVL3Xt84MlMcHLlFgBSeUKE=
 =rCAo
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:

 - Three CXL mailbox passthrough commands are added to support the
   populating and clearing of vendor debug logs:
     - Get Log Capabilities
     - Get Supported Log Sub-List Commands
     - Clear Log

 - Add support of Device Phyiscal Address (DPA) to Host Physical Address
   (HPA) translation for CXL events of cxl_dram and cxl_general media.

   This allows user space to figure out which CXL region the event
   occured via trace event.

 - Connect CXL to CPER reporting.

   If a device is configured for firmware first, CXL event records are
   not sent directly to the host. Those records are reported through EFI
   Common Platform Error Records (CPER). Add support to route the CPER
   records through the CXL sub-system in order to provide DPA to HPA
   translation and also event decoding and tracing. This is useful for
   users to determine which system issues may correspond to specific
   hardware events.

 - A number of misc cleanups and fixes:
     - Fix for compile warning of cxl_security_ops
     - Add debug message for invalid interleave granularity
     - Enhancement to cxl-test event testing
     - Add dev_warn() on unsupported mixed mode decoder
     - Fix use of phys_to_target_node() for x86
     - Use helper function for decoder enum instead of open coding
     - Include missing headers for cxl-event
     - Fix MAINTAINERS file entry
     - Fix cxlr_pmem memory leak
     - Cleanup __cxl_parse_cfmws via scope-based resource menagement
     - Convert cxl_pmem_region_alloc() to scope-based resource management

* tag 'cxl-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits)
  cxl/cper: Remove duplicated GUID defines
  cxl/cper: Fix non-ACPI-APEI-GHES build
  cxl/pci: Process CPER events
  acpi/ghes: Process CXL Component Events
  cxl/region: Convert cxl_pmem_region_alloc to scope-based resource management
  cxl/acpi: Cleanup __cxl_parse_cfmws()
  cxl/region: Fix cxlr_pmem leaks
  cxl/core: Add region info to cxl_general_media and cxl_dram events
  cxl/region: Move cxl_trace_hpa() work to the region driver
  cxl/region: Move cxl_dpa_to_region() work to the region driver
  cxl/trace: Correct DPA field masks for general_media & dram events
  MAINTAINERS: repair file entry in COMPUTE EXPRESS LINK
  cxl/cxl-event: include missing <linux/types.h> and <linux/uuid.h>
  cxl/hdm: Debug, use decoder name function
  cxl: Fix use of phys_to_target_node() for x86
  cxl/hdm: dev_warn() on unsupported mixed mode decoder
  cxl/test: Enhance event testing
  cxl/hdm: Add debug message for invalid interleave granularity
  cxl: Fix compile warning for cxl_security_ops extern
  cxl/mbox: Add Clear Log mailbox command
  ...
2024-05-15 14:32:27 -07:00
Ira Weiny
364ee9f326 cxl/test: Enhance event testing
An issue was found in the processing of event logs when the output
buffer length was not reset.[1]

This bug was not caught with cxl-test for 2 reasons.  First, the test
harness mbox_send command [mock_get_event()] does not set the output
size based on the amount of data returned like the hardware command
does.  Second, the simplistic event log testing always returned the same
number of elements per-get command.

Enhance the simulation of the event log mailbox to better match the bug
found with real hardware to cover potential regressions.

NOTE: These changes will cause cxl-events.sh in ndctl to fail without
the fix from Kwangjin.  However, no changes to the user space test was
required.  Therefore ndctl itself will be compatible with old or new
kernels once both patches land in the new kernel.

[1] Link: https://lore.kernel.org/all/20240401091057.1044-1-kwangjin.ko@sk.com/

Cc: Kwangjin Ko <kwangjin.ko@sk.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20240401-enhance-event-test-v1-1-6669a524ed38@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30 10:43:48 -07:00
Dave Jiang
5d211c7090 cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCH
Robert reported the following when booting a CXL host with Restricted CXL
Host (RCH) topology:
 [   39.815379] cxl_acpi ACPI0017:00: not a cxl_port device
 [   39.827123] WARNING: CPU: 46 PID: 1754 at drivers/cxl/core/port.c:592 to_cxl_port+0x56/0x70 [cxl_core]

... plus some related subsequent NULL pointer dereference:

 [   40.718708] BUG: kernel NULL pointer dereference, address: 00000000000002d8

The iterator to walk the PCIe path did not account for RCH topology.
However RCH does not support hotplug and the memory exported by the
Restricted CXL Device (RCD) should be covered by HMAT and therefore no
access_coordinate is needed. Add check to see if the endpoint device is
RCD and skip calculation.

Also add a call to cxl_endpoint_get_perf_coordinates() in cxl_test in order
to exercise the topology iterator. The dev_is_pci() check added is to help
with this test and should be harmless for normal operation.

Reported-by: Robert Richter <rrichter@amd.com>
Closes: https://lore.kernel.org/all/Ziv8GfSMSbvlBB0h@rric.localdomain/
Fixes: 592780b839 ("cxl: Fix retrieving of access_coordinates in PCIe path")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20240426224913.1027420-1-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-29 09:03:26 -07:00
Dave Jiang
001c5d1934 cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coord
The driver stores access_coordinate for host bridge in ->hb_coord and
switch CDAT access_coordinate in ->sw_coord. Since neither of these
access_coordinate clobber each other, the variable name can be consolidated
into ->coord to simplify the code.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240403154844.3403859-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-08 08:25:21 -07:00
Dave Jiang
117132edc6 cxl/test: Add support for qos_class checking
Set a fake qos_class to a unique value in order to do simple testing of
qos_class for root decoders and mem devs via user cxl_test. A mock
function is added to set the fake qos_class values for memory device
and overrides cxl_endpoint_parse_cdat() in cxl driver code.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20240206190431.1810289-5-dave.jiang@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-16 23:20:34 -08:00
Dan Williams
68deb99720 tools/testing/cxl: Disable "missing prototypes / declarations" warnings
Prevent warnings of the form:

tools/testing/cxl/test/mock.c:44:6: error: no previous prototype for
‘__wrap_is_acpi_device_node’ [-Werror=missing-prototypes]

tools/testing/cxl/test/mock.c:63:5: error: no previous prototype for
‘__wrap_acpi_table_parse_cedt’ [-Werror=missing-prototypes]

tools/testing/cxl/test/mock.c:81:13: error: no previous prototype for
‘__wrap_acpi_evaluate_integer’ [-Werror=missing-prototypes]

...by locally disabling some warnings.

It turns out that:

Commit 0fcb70851f ("Makefile.extrawarn: turn on missing-prototypes globally")

...in addition to expanding in-tree coverage, also impacts out-of-tree
module builds like those in tools/testing/cxl/.

Filter out the warning options on unit test code that does not effect
mainline builds.

Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://lore.kernel.org/r/170543983780.460832.10920261849128601697.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-22 10:41:59 -08:00
Dan Williams
3601311593 Merge branch 'for-6.8/cxl-cper' into for-6.8/cxl
Pick up the CPER to CXL driver integration work for v6.8. Some
additional cleanup of cper_estatus_print() messages is needed, but that
is to be handled incrementally.
2024-01-09 19:21:44 -08:00
Ira Weiny
f9c683386f cxl/events: Create a CXL event union
The CXL CPER and event log records share everything but a UUID/GUID in
their structures.

Define a cxl_event union without the UUID/GUID to be shared between the
CPER and event log record formats.  Adjust the code to use this union.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-6-1bb8a4ca2c7a@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-09 15:41:22 -08:00
Ira Weiny
6eade11075 cxl/events: Separate UUID from event structures
The UEFI CXL CPER structure does not include the UUID.  Now that the
UUID is passed separately to the trace event there is no need to have
the UUID in those structures.

Move UUID from the event record header to the raw structures.  Adjust
cxl-test to Create dummy structures for creating test records.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-5-1bb8a4ca2c7a@intel.com
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-09 15:39:38 -08:00
Ira Weiny
4c115c9c1f cxl/events: Create common event UUID defines
Dan points out in review that the cxl_test code could be made better
through the use of UUID's defines rather than being open coded.[1]

Create UUID defines and use them rather than open coding them.

Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lore.kernel.org/r/65738d09e30e2_45e0129451@dwillia2-xfh.jf.intel.com.notmuch [1]
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-3-1bb8a4ca2c7a@intel.com
[djbw: clang-format uuid definitions]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-09 15:39:38 -08:00
Dave Jiang
f2202f9904 tools/testing/cxl: Add hostbridge UID string for cxl_test mock hb devices
In order to support acpi_device_uid() call, add static string to
acpi_device->pnp.unique_id.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/170319622564.2212653.1534465446670631698.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-22 15:31:52 -08:00
Dave Jiang
ad6f04c026 cxl: Add callback to parse the DSMAS subtables from CDAT
Provide a callback function to the CDAT parser in order to parse the
Device Scoped Memory Affinity Structure (DSMAS). Each DSMAS structure
contains the DPA range and its associated attributes in each entry. See
the CDAT specification for details. The device handle and the DPA range
is saved and to be associated with the DSLBIS locality data when the
DSLBIS entries are parsed. The xarray is a local variable. When the
total path performance data is calculated and storred this xarray can be
discarded.

Coherent Device Attribute Table 1.03 2.1 Device Scoped memory Affinity
Structure (DSMAS)

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/170319619355.2212653.2675953129671561293.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-22 14:33:10 -08:00
Dave Jiang
e05501e8a8 cxl: Add cxl_num_decoders_committed() usage to cxl_test
Commit 458ba8189c ("cxl: Add cxl_decoders_committed() helper") missed the
conversion for cxl_test. Add usage of cxl_num_decoders_committed() to
replace the open coding.

Suggested-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/169929160525.824083.11813222229025394254.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-12-04 16:46:14 -08:00
Dan Williams
7f946e6d83 Merge branch 'for-6.7/cxl-rch-eh' into cxl/next
Restricted CXL Host (RCH) Error Handling undoes the topology munging of
CXL 1.1 to enabled some AER recovery, and lands some base infrastructure
for handling Root-Complex-Event-Collectors (RCECs) with CXL. Include
this long running series finally for v6.7.
2023-10-31 10:59:00 -07:00
Robert Richter
f611d98a00 cxl/pci: Remove Component Register base address from struct cxl_dev_state
The Component Register base address @component_reg_phys is no longer
used after the rework of the Component Register setup which now uses
struct member @reg_map instead. Remove the base address.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231018171713.1883517-9-rrichter@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 20:13:37 -07:00
Vishal Verma
8f61d48c83 tools/testing/cxl: Slow down the mock firmware transfer
The cxl-cli unit test for firmware update does operations like starting
an asynchronous firmware update, making sure it is in progress, and
attempting to cancel it. In some cases, such as with no or minimal
dynamic debugging turned on, the firmware update completes too quickly,
not allowing the test to have a chance to verify it was in progress.
This caused a failure of the signature:

  expected fw_update_in_progress:true
  test/cxl-update-firmware.sh: failed at line 88

Fix this by adding a delay (~1.5 - 2 ms) to each firmware transfer
request handled by the mocked interface.

Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20231026-vv-fw_upd_test_fix-v2-1-5282fd193883@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 13:04:52 -07:00
Jim Harris
98a04c7ace cxl/region: Fix x1 root-decoder granularity calculations
Root decoder granularity must match value from CFWMS, which may not
be the region's granularity for non-interleaved root decoders.

So when calculating granularities for host bridge decoders, use the
region's granularity instead of the root decoder's granularity to ensure
the correct granularities are set for the host bridge decoders and any
downstream switch decoders.

Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.

Region created with 2048 granularity using following command line:

cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
		  -g 2048 -s 2048M

Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.

Before this patch:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
"interleave_granularity":256,

After:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
"interleave_granularity":2048,

Fixes: 27b3f8d138 ("cxl/region: Program target lists")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jim Harris <jim.harris@samsung.com>
Link: https://lore.kernel.org/r/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in
[djbw: fixup the prebuilt cxl_test region]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-27 13:04:52 -07:00
Dan Williams
cf009d4ec3 tools/testing/cxl: Add 'sanitize notifier' support
Allow for cxl_test regression of the sanitize notifier. Reuse the core
setup infrastructure, and trigger notifications upon any sanitize
submission with a programmable notification delay.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-09 11:35:45 -07:00
Dan Williams
501b3d9fb0 tools/testing/cxl: Make cxl_memdev_state available to other command emulation
Move @mds out of the event specific 'struct mock_event_store' and into
the base 'struct cxl_mockmem_data' directly. This is in preparation for
enabling cxl_test to exercise the notifier flow for 'sanitize' operation
completion.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-09 11:35:45 -07:00
Dan Williams
f29a824b0b cxl/pci: Clarify devm host for memdev relative setup
It is all too easy to get confused about @dev usage in the CXL driver
stack. Before adding a new cxl_pci_probe() setup operation that has a
devm lifetime dependent on @cxlds->dev binding, but also references
@cxlmd->dev, and prints messages, rework the devm_cxl_add_memdev() and
cxl_memdev_setup_fw_upload() function signatures to make this
distinction explicit. I.e. pass in the devm context as an @host argument
rather than infer it from other objects.

This is in preparation for adding a devm_cxl_sanitize_setup_notifier().

Note the whitespace fixup near the change of the devm_cxl_add_memdev()
signature. That uncaught typo originated in the patch that added
cxl_memdev_security_init().

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-10-06 00:12:44 -07:00
Xiao Yang
70d49bbf96 tools/testing/cxl: Remove unused SZ_512G macro
SZ_512G macro has become useless since commit b2f3b74e10
("tools/testing/cxl: Move cxl_test resources to the top of memory")
so remove it directly.

Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Link: https://lore.kernel.org/r/20230719163103.3392-1-yangx.jy@fujitsu.com
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2023-07-20 23:35:22 -06:00
Dan Williams
0c0df63177 Merge branch 'for-6.5/cxl-rch-eh' into for-6.5/cxl
Pick up the first half of the RCH error handling series. The back half
needs some fixups for test regressions. Small conflicts with the PMU
work around register enumeration and setup helpers.
2023-06-25 18:56:13 -07:00
Dan Williams
d2f9fe6953 Merge branch 'for-6.5/cxl-perf' into for-6.5/cxl
Pick up initial support for the CXL 3.0 performance monitoring
definition. Small conflicts with the firmware update work as they both
placed their init code in the same location.
2023-06-25 17:53:18 -07:00
Dan Williams
aeaefabc59 Merge branch 'for-6.5/cxl-type-2' into for-6.5/cxl
Pick up the driver cleanups identified in preparation for CXL "type-2"
(accelerator) device support. The major change here from a conflict
generation perspective is the split of 'struct cxl_memdev_state' from
the core 'struct cxl_dev_state'. Since an accelerator may not care about
all the optional features that are standard on a CXL "type-3" (host-only
memory expander) device.

A silent conflict also occurs with the move of the endpoint port to be a
formal property of a 'struct cxl_memdev' rather than drvdata.
2023-06-25 17:16:51 -07:00
Dan Williams
867eab655d Merge branch 'for-6.5/cxl-fwupd' into for-6.5/cxl
Add the first typical (non-sanitization) consumer of the new background
command infrastructure, firmware update. Given both firmware-update and
sanitization were developed in parallel from the common
background-command baseline, resolve some minor context conflicts.
2023-06-25 16:12:26 -07:00
Vishal Verma
f6448cb5f2 tools/testing/cxl: add firmware update emulation to CXL memdevs
Add emulation for the 'Get FW Info', 'Transfer FW', and 'Activate FW'
CXL mailbox commands to the cxl_test emulated memdevs to enable
end-to-end unit testing of a firmware update flow. For now, only
advertise an 'offline activation' capability as that is all the CXL
memdev driver currently implements.

Add some canned values for the serial number fields, and create a
platform device sysfs knob to calculate the sha256sum of the firmware
image that was received, so a unit test can compare it with the original
file that was uploaded.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-4-c6265bd7343b@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 15:58:40 -07:00
Vishal Verma
6e4ca04af7 tools/testing/cxl: Use named effects for the Command Effect Log
As more emulated mailbox commands are added to cxl_test, it is a pain
point to look up command effect numbers for each effect. Replace the
bare numbers in the mock driver with an enum that lists all possible
effects.

Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: Russ Weight <russell.h.weight@intel.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-3-c6265bd7343b@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 15:58:40 -07:00
Vishal Verma
b46c5fa57c tools/testing/cxl: Fix command effects for inject/clear poison
The CXL spec (3.0, section 8.2.9.8.4) Lists Inject Poison and Clear
Poison as having the effects of "Immediate Data Change". Fix this in the
mock driver so that the command effect log is populated correctly.

Fixes: 371c16101e ("tools/testing/cxl: Mock the Inject Poison mailbox command")
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/20230602-vv-fw_update-v4-2-c6265bd7343b@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 15:58:40 -07:00
Davidlohr Bueso
f337043b56 cxl/test: Add Secure Erase opcode support
Add support to emulate the CXL the "Secure Erase" operation.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20230612181038.14421-8-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 15:35:16 -07:00
Davidlohr Bueso
c5c39217ff cxl/test: Add Sanitize opcode support
Add support to emulate the "Sanitize" operation, without
incurring in the background.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20230612181038.14421-6-dave@stgolabs.net
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 15:34:30 -07:00
Dan Williams
8f0220af58 Revert "cxl/port: Enable the HDM decoder capability for switch ports"
commit eb0764b822 ("cxl/port: Enable the HDM decoder capability for switch ports")

...was added on the observation of CXL memory not being accessible after
setting up a region on a "cold-plugged" device. A "cold-plugged" CXL
device is one that was not present at boot, so platform-firmware/BIOS
has no chance to set it up.

While it is true that the debug found the enable bit clear in the
host-bridge's instance of the global control register (CXL 3.0
8.2.4.19.2 CXL HDM Decoder Global Control Register), that bit is
described as:

"This bit is only applicable to CXL.mem devices and shall
return 0 on CXL Host Bridges and Upstream Switch Ports."

So it is meant to be zero, and further testing confirmed that this "fix"
had no effect on the failure. Revert it, and be more vigilant about
proposed fixes in the future. Since the original copied stable@, flag
this revert for stable@ as well.

Cc: <stable@vger.kernel.org>
Fixes: eb0764b822 ("cxl/port: Enable the HDM decoder capability for switch ports")
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168685882012.3475336.16733084892658264991.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 14:32:18 -07:00
Dan Williams
5aa39a9165 cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM}
In preparation for support for HDM-D and HDM-DB configuration
(device-memory, and device-memory with back-invalidate). Rename the current
type designators to use HOSTONLYMEM and DEVMEM as a suffix.

HDM-DB can be supported by devices that are not accelerators, so DEVMEM is
a more generic term for that case.

Fixup one location where this type value was open coded.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 14:31:08 -07:00
Dan Williams
59f8d15107 cxl/mbox: Move mailbox related driver state to its own data structure
'struct cxl_dev_state' makes too many assumptions about the capabilities
of a CXL device. In particular it assumes a CXL device has a mailbox and
all of the infrastructure and state that comes along with that.

In preparation for supporting accelerator / Type-2 devices that may not
have a mailbox and in general maintain a minimal core context structure,
make mailbox functionality a super-set of  'struct cxl_dev_state' with
'struct cxl_memdev_state'.

With this reorganization it allows for CXL devices that support HDM
decoder mapping, but not other general-expander / Type-3 capabilities,
to only enable that subset without the rest of the mailbox
infrastructure coming along for the ride.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679260240.3436160.15520641540463704524.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 14:31:08 -07:00
Dan Williams
4c77cfcfe1 tools/testing/cxl: Remove unused @cxlds argument
In preparation for plumbing a 'struct cxl_memdev_state' as a superset of
a 'struct cxl_dev_state' cleanup the usage of @cxlds in the unit test
infrastructure.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168679258640.3436160.7641308222525246728.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 14:31:08 -07:00
Dan Williams
7481653dee cxl: Rename 'uport' to 'uport_dev'
For symmetry with the recent rename of ->dport_dev for a 'struct
cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct
cxl_port'. These devices represent the downstream-port-device and
upstream-port-device respectively in the CXL/PCIe topology.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 11:37:54 -07:00
Dan Williams
0619337856 cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability
Prepare cxl_probe_rcrb() for retrieving more than just the component
register block. The RCH AER handling code wants to get back to the AER
capability that happens to be MMIO mapped rather then configuration
cycles.

Move RCRB specific downstream port data, like the RCRB base and the
AER capability offset, into its own data structure ('struct
cxl_rcrb_info') for cxl_probe_rcrb() to fill. Extend 'struct
cxl_dport' to include a 'struct cxl_rcrb_info' attribute.

This centralizes all RCRB scanning in one routine.

Co-developed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20230622205523.85375-4-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-06-25 11:35:26 -07:00