linux/Documentation
Caleb Sander Mateos 61bf0009a7 dim: pass dim_sample to net_dim() by reference
net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater than 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample
on the stack, so passing it by value requires pushing a duplicated copy
to the stack. Either writing to the stack and immediately reading it, or
perhaps dereferencing addresses relative to the stack pointer in a chain
of push instructions, seems to perform quite poorly.

In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy
dim_sample on the stack for the call to net_dim():
// Call ktime_get()
  0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
       |4ead7:   lea    0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
       |4eade:   mov    %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
       |4eae3:   mov    %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
  0.15 |4eae8:   mov    %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
 94.16 |4eaed:   push   0x10(%rsp)
  2.79 |4eaf1:   push   0x10(%rsp)
  0.07 |4eaf5:   push   %rax
// Call net_dim()
  0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>

To allow the caller to reuse the struct dim_sample already on the stack,
pass the struct dim_sample by reference to net_dim().
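
As a rough illustration of the two calling styles, here is a minimal userspace
sketch. It is not the kernel code: struct sample, consume_by_value() and
consume_by_ref() are hypothetical stand-ins, and the field layout is an
assumption inferred from the 24-byte size and the offsets in the perf output
above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct dim_sample. The layout is an assumption:
 * an 8-byte timestamp followed by counters, 24 bytes in total.
 */
struct sample {
	int64_t  time;      /* timestamp */
	uint32_t pkt_ctr;   /* offset 0x8 */
	uint32_t byte_ctr;  /* offset 0xc */
	uint16_t event_ctr; /* offset 0x10, followed by 2 bytes of padding */
	uint32_t comp_ctr;  /* offset 0x14 */
};

/* Larger than 16 bytes, so the x86-64 SysV ABI passes it in memory. */
static_assert(sizeof(struct sample) == 24, "expected a 24-byte struct");

/* By value: the caller must copy all 24 bytes into the argument area. */
uint64_t consume_by_value(struct sample s)
{
	return (uint64_t)s.pkt_ctr + s.byte_ctr + s.event_ctr;
}

/* By reference: the caller passes one pointer to the struct it already has. */
uint64_t consume_by_ref(const struct sample *s)
{
	return (uint64_t)s->pkt_ctr + s->byte_ctr + s->event_ctr;
}
```

Both functions compute the same result. The difference is only in what the
caller has to do before the call: a 24-byte copy in the first case, a single
pointer in a register in the second.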

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241031002326.3426181-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:36:54 -08:00
..
ABI Char/Misc and other driver changes for 6.12-rc1 2024-09-26 10:13:08 -07:00
accel drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
accounting
admin-guide cpufreq: docs: Reflect latency changes in docs 2024-10-21 13:20:03 +02:00
arch arm64 fixes for 6.12-rc2: 2024-10-04 12:20:09 -07:00
block docs: block: Fix grammar and spelling mistakes in bfq-iosched.rst 2024-09-05 14:38:10 -06:00
bpf docs/bpf: Add missing BPF program types to docs 2024-09-12 10:56:41 -07:00
cdrom
core-api Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-25 09:08:22 +02:00
cpu-freq
crypto
dev-tools The core clk framework is left largely untouched this time around except for 2024-09-23 15:01:48 -07:00
devicetree Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-31 18:10:07 -07:00
doc-guide doc-guide: add help documentation checktransupdate.rst 2024-07-30 07:56:22 -06:00
driver-api platform/x86: wmi: Update WMI driver API documentation 2024-10-06 12:48:52 +02:00
fault-injection Fix typo "allocateed" to allocated 2024-08-26 15:37:25 -06:00
fb
features x86: remove PG_uncached 2024-09-03 21:15:46 -07:00
filesystems vfs-6.12-rc5.fixes 2024-10-21 10:48:24 -07:00
firmware_class
firmware-guide
fpga
gpu Short summary of fixes pull: 2024-10-01 08:15:55 +10:00
hid Documentation: hid: intel-ish-hid: Add vendor custom firmware loading 2024-08-19 21:12:27 +02:00
hwmon hwmon: Remove devm_hwmon_device_unregister() API function 2024-09-13 07:27:36 -07:00
i2c i2c: testunit: add SMBusAlert trigger 2024-08-26 15:15:48 +02:00
iio doc: iio: ad4695: update for calibration support 2024-09-03 18:49:43 +01:00
images
infiniband
input
isdn
kbuild kbuild: doc: replace "gcc" in external module description 2024-09-24 03:07:21 +09:00
kernel-hacking
leds - Limited LED current based on thermal conditions in the QCOM flash LED driver. 2024-09-23 14:20:11 -07:00
litmus-tests
livepatch Documentation: livepatch: Correct release locks antonym 2024-09-04 13:42:27 +02:00
locking
maintainer
mhi
misc-devices
mm Docs/damon/maintainer-profile: update deprecated awslabs GitHub URLs 2024-10-17 00:28:09 -07:00
netlabel
netlink dpll: add clock quality level attribute and op 2024-11-03 08:39:07 -08:00
networking dim: pass dim_sample to net_dim() by reference 2024-11-03 12:36:54 -08:00
nvdimm
nvme Remove duplicate "and" in 'Linux NVMe docs. 2024-09-10 15:44:20 -06:00
PCI Documentation: PCI: fix typo in pci.rst 2024-09-10 15:30:42 -06:00
pcmcia
peci
power Documentation: PM: Discourage use of deprecated macros 2024-09-04 14:37:57 +02:00
process soc: fixes for 6.12 2024-10-17 09:43:36 -07:00
RCU Merge branches 'context_tracking.15.08.24a', 'csd.lock.15.08.24a', 'nocb.09.09.24a', 'rcutorture.14.08.24a', 'rcustall.09.09.24a', 'srcu.12.08.24a', 'rcu.tasks.14.08.24a', 'rcu_scaling_tests.15.08.24a', 'fixes.12.08.24a' and 'misc.11.08.24a' into next.09.09.24a 2024-09-09 00:09:47 +05:30
rust Rust changes for v6.12 2024-09-25 10:25:40 -07:00
scheduler sched_ext: Documentation: Update instructions for running example schedulers 2024-10-08 08:49:18 -10:00
scsi
security documentation: add IPE documentation 2024-08-20 14:03:47 -04:00
sound Docs/sound: Add documentation for userspace-driven ALSA timers 2024-08-18 09:55:54 +02:00
sphinx docs: kerneldoc-preamble.sty: Suppress extra spaces in CJK literal blocks 2024-09-05 14:16:41 -06:00
sphinx-static
spi spi: Enable controllers to extend the SPI protocol with MOSI idle configuration 2024-07-29 01:19:51 +01:00
staging xz: remove XZ_EXTERN and extern from functions 2024-09-01 20:43:27 -07:00
target
tee
timers treewide: Fix wrong singular form of jiffies in comments 2024-09-08 20:47:40 +02:00
tools
trace tracing/Documentation: Start a document on how to debug with tracing 2024-08-26 13:54:08 -04:00
translations move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
usb usb: gadget: f_uac1: Change volume name and remove alt names 2024-08-13 18:11:35 +02:00
userspace-api mseal: update mseal.rst 2024-10-28 21:40:41 -07:00
virt KVM: x86: Clean up documentation for KVM_X86_QUIRK_SLOT_ZAP_ALL 2024-10-20 07:31:05 -04:00
w1
watchdog [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
wmi platform/x86: dell-ddv: Fix typo in documentation 2024-10-06 12:47:40 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff Kbuild updates for v6.12 2024-09-24 13:02:06 -07:00
index.rst
Kconfig
Makefile
memory-barriers.txt docs/memory-barriers.txt: Remove left-over references to "CACHE COHERENCY" 2024-09-13 23:56:44 -07:00
SubmittingPatches
subsystem-apis.rst