mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-01-07 22:03:14 +00:00
ee8b94c851
42226 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Linus Torvalds
|
b1983d427a |
Networking fixes for 6.5-rc2, including fixes from netfilter,
wireless and ebpf Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmSv1YISHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkpQgP/1msj0MlIWJnMgzPiMonDSe746JGTg/j YengEjqcy3ozC4COBEeyBO6ilt6I+Wrb5H5jimn9h2djB+D7htWNaejQaqJrBxph F4lUC6OJqd2ncI3tXAG2BSX1duzDr6B7yL7d5InFIczw8vNh+chsyX0sjlzU12bt ppjcSb+Ffc796DB0ItJkBqluxcpjyXE15ZWTTV4GEHK6RoRdxNIGjd7NgvD8podB Q/464bHs1jJYkAavuobiOXV2fuxWLTs77E0Vloizoo+42UiRFMLJk+RX98PhSIMa eejkxfm+H6+6Qi2omYepvf7vDN3GtLjxbr5C3mTdWPuL4QbNY8agVJ7sS4XnL5/v B7EAjyGQK9SmD36zTu7QL/Ul6fSnRq8jz20B0mDa0imAWzi58A+jqbQAMoVOMSS+ Uv4yKJpIUyx7mUI77+EX3U9r1wytw5eniatTDU+GAsQb2CJ43CqDmn/7RcmGacBo P1q+il9JW4kzUQrisUSxmQDfpBvQi5wiygiEdUNI5FEhq6/iKe/lrJnmJZpaLkd5 P3oEKjapamAmcyrEr/7VD1Mb4jrRfpB7zVn/5OyvywbcLQxA+531iPpy4r4W6cWH 1MRLBVVHKyb3jfm8J3T4lpDEzd03+MiPS8JiKMUYYNUYkY8tYp92muwC7z2sGI4M 6eR2MeKD4vds =cELX -----END PGP SIGNATURE----- Merge tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, wireless and ebpf. Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames" * tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) selftests: tc-testing: add test for qfq with stab overhead net/sched: sch_qfq: account for stab overhead in qfq_enqueue selftests: tc-testing: add tests for qfq mtu sanity check net/sched: sch_qfq: reintroduce lmax bound check for MTU wifi: cfg80211: fix receiving mesh packets without RFC1042 header wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set() net: txgbe: fix eeprom calculation error net/sched: make psched_mtu() RTNL-less safe net: ena: fix shift-out-of-bounds in exponential backoff netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() net/sched: flower: Ensure both minimum and maximum ports are specified MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER docs: netdev: update the URL of the status page wifi: iwlwifi: remove 'use_tfh' config to fix crash xdp: use trusted arguments in XDP hints kfuncs bpf: cpumap: Fix memory leak in cpu_map_update_elem wifi: airo: avoid uninitialized warning in airo_get_rate() octeontx2-pf: Add additional check for MCAM rules net: dsa: Removed unneeded of_node_put in felix_parse_ports_node net: fec: use netdev_err_once() instead of netdev_err() ... |
||
Linus Torvalds
|
ebc27aacee |
Tracing fixes and clean ups:
- Fix some missing-prototype warnings - Fix user events struct args (did not include size of struct) When creating a user event, the "struct" keyword is to denote that the size of the field will be passed in. But the parsing failed to handle this case. - Add selftest to struct sizes for user events - Fix sample code for direct trampolines. The sample code for direct trampolines attached to handle_mm_fault(). But the prototype changed and the direct trampoline sample code was not updated. Direct trampolines needs to have the arguments correct otherwise it can fail or crash the system. - Remove unused ftrace_regs_caller_ret() prototype. - Quiet false positive of FORTIFY_SOURCE Due to backward compatibility, the structure used to save stack traces in the kernel had a fixed size of 8. This structure is exported to user space via the tracing format file. A change was made to allow more than 8 functions to be recorded, and user space now uses the size field to know how many functions are actually in the stack. But the structure still has size of 8 (even though it points into the ring buffer that has the required amount allocated to hold a full stack. This was fine until the fortifier noticed that the memcpy(&entry->caller, stack, size) was greater than the 8 functions and would complain at runtime about it. Hide this by using a pointer to the stack location on the ring buffer instead of using the address of the entry structure caller field. - Fix a deadloop in reading trace_pipe that was caused by a mismatch between ring_buffer_empty() returning false which then asked to read the data, but the read code uses rb_num_of_entries() that returned zero, and causing a infinite "retry". - Fix a warning caused by not using all pages allocated to store ftrace functions, where this can happen if the linker inserts a bunch of "NULL" entries, causing the accounting of how many pages needed to be off. - Fix histogram synthetic event crashing when the start event is removed and the end event is still using a variable from it. - Fix memory leak in freeing iter->temp in tracing_release_pipe() -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZLBF6hQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qkswAP4mhdoFFfNosM7+Sh/R4t31IxKZApm9 M2Hf9jgvJ7b65AD/VV1XfO6skw2+5Yn9S4UyNE2MQaYxPwWpONcNFUzZ3Q8= =Nb+7 -----END PGP SIGNATURE----- Merge tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix some missing-prototype warnings - Fix user events struct args (did not include size of struct) When creating a user event, the "struct" keyword is to denote that the size of the field will be passed in. But the parsing failed to handle this case. - Add selftest to struct sizes for user events - Fix sample code for direct trampolines. The sample code for direct trampolines attached to handle_mm_fault(). But the prototype changed and the direct trampoline sample code was not updated. Direct trampolines needs to have the arguments correct otherwise it can fail or crash the system. - Remove unused ftrace_regs_caller_ret() prototype. - Quiet false positive of FORTIFY_SOURCE Due to backward compatibility, the structure used to save stack traces in the kernel had a fixed size of 8. This structure is exported to user space via the tracing format file. A change was made to allow more than 8 functions to be recorded, and user space now uses the size field to know how many functions are actually in the stack. But the structure still has size of 8 (even though it points into the ring buffer that has the required amount allocated to hold a full stack. This was fine until the fortifier noticed that the memcpy(&entry->caller, stack, size) was greater than the 8 functions and would complain at runtime about it. Hide this by using a pointer to the stack location on the ring buffer instead of using the address of the entry structure caller field. - Fix a deadloop in reading trace_pipe that was caused by a mismatch between ring_buffer_empty() returning false which then asked to read the data, but the read code uses rb_num_of_entries() that returned zero, and causing a infinite "retry". - Fix a warning caused by not using all pages allocated to store ftrace functions, where this can happen if the linker inserts a bunch of "NULL" entries, causing the accounting of how many pages needed to be off. - Fix histogram synthetic event crashing when the start event is removed and the end event is still using a variable from it - Fix memory leak in freeing iter->temp in tracing_release_pipe() * tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak of iter->temp when reading trace_pipe tracing/histograms: Add histograms to hist_vars if they have referenced variables tracing: Stop FORTIFY_SOURCE complaining about stack trace caller ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() ring-buffer: Fix deadloop issue on reading trace_pipe tracing: arm64: Avoid missing-prototype warnings selftests/user_events: Test struct size match cases tracing/user_events: Fix struct arg size match check x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret() arm64: ftrace: Add direct call trampoline samples support samples: ftrace: Save required argument registers in sample trampolines |
||
Zheng Yejian
|
d5a8218963 |
tracing: Fix memory leak of iter->temp when reading trace_pipe
kmemleak reports:
unreferenced object 0xffff88814d14e200 (size 256):
comm "cat", pid 336, jiffies 4294871818 (age 779.490s)
hex dump (first 32 bytes):
04 00 01 03 00 00 00 00 08 00 00 00 00 00 00 00 ................
0c d8 c8 9b ff ff ff ff 04 5a ca 9b ff ff ff ff .........Z......
backtrace:
[<ffffffff9bdff18f>] __kmalloc+0x4f/0x140
[<ffffffff9bc9238b>] trace_find_next_entry+0xbb/0x1d0
[<ffffffff9bc9caef>] trace_print_lat_context+0xaf/0x4e0
[<ffffffff9bc94490>] print_trace_line+0x3e0/0x950
[<ffffffff9bc95499>] tracing_read_pipe+0x2d9/0x5a0
[<ffffffff9bf03a43>] vfs_read+0x143/0x520
[<ffffffff9bf04c2d>] ksys_read+0xbd/0x160
[<ffffffff9d0f0edf>] do_syscall_64+0x3f/0x90
[<ffffffff9d2000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
when reading file 'trace_pipe', 'iter->temp' is allocated or relocated
in trace_find_next_entry() but not freed before 'trace_pipe' is closed.
To fix it, free 'iter->temp' in tracing_release_pipe().
Link: https://lore.kernel.org/linux-trace-kernel/20230713141435.1133021-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Mohamed Khalfella
|
6018b585e8 |
tracing/histograms: Add histograms to hist_vars if they have referenced variables
Hist triggers can have referenced variables without having direct
variables fields. This can be the case if referenced variables are added
for trigger actions. In this case the newly added references will not
have field variables. Not taking such referenced variables into
consideration can result in a bug where it would be possible to remove
hist trigger with variables being refenced. This will result in a bug
that is easily reproducable like so
$ cd /sys/kernel/tracing
$ echo 'synthetic_sys_enter char[] comm; long id' >> synthetic_events
$ echo 'hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
$ echo 'hist:keys=common_pid.execname,id.syscall:onmatch(raw_syscalls.sys_enter).synthetic_sys_enter($comm, id)' >> events/raw_syscalls/sys_enter/trigger
$ echo '!hist:keys=common_pid.execname,id.syscall:vals=hitcount:comm=common_pid.execname' >> events/raw_syscalls/sys_enter/trigger
[ 100.263533] ==================================================================
[ 100.264634] BUG: KASAN: slab-use-after-free in resolve_var_refs+0xc7/0x180
[ 100.265520] Read of size 8 at addr ffff88810375d0f0 by task bash/439
[ 100.266320]
[ 100.266533] CPU: 2 PID: 439 Comm: bash Not tainted 6.5.0-rc1 #4
[ 100.267277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
[ 100.268561] Call Trace:
[ 100.268902] <TASK>
[ 100.269189] dump_stack_lvl+0x4c/0x70
[ 100.269680] print_report+0xc5/0x600
[ 100.270165] ? resolve_var_refs+0xc7/0x180
[ 100.270697] ? kasan_complete_mode_report_info+0x80/0x1f0
[ 100.271389] ? resolve_var_refs+0xc7/0x180
[ 100.271913] kasan_report+0xbd/0x100
[ 100.272380] ? resolve_var_refs+0xc7/0x180
[ 100.272920] __asan_load8+0x71/0xa0
[ 100.273377] resolve_var_refs+0xc7/0x180
[ 100.273888] event_hist_trigger+0x749/0x860
[ 100.274505] ? kasan_save_stack+0x2a/0x50
[ 100.275024] ? kasan_set_track+0x29/0x40
[ 100.275536] ? __pfx_event_hist_trigger+0x10/0x10
[ 100.276138] ? ksys_write+0xd1/0x170
[ 100.276607] ? do_syscall_64+0x3c/0x90
[ 100.277099] ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 100.277771] ? destroy_hist_data+0x446/0x470
[ 100.278324] ? event_hist_trigger_parse+0xa6c/0x3860
[ 100.278962] ? __pfx_event_hist_trigger_parse+0x10/0x10
[ 100.279627] ? __kasan_check_write+0x18/0x20
[ 100.280177] ? mutex_unlock+0x85/0xd0
[ 100.280660] ? __pfx_mutex_unlock+0x10/0x10
[ 100.281200] ? kfree+0x7b/0x120
[ 100.281619] ? ____kasan_slab_free+0x15d/0x1d0
[ 100.282197] ? event_trigger_write+0xac/0x100
[ 100.282764] ? __kasan_slab_free+0x16/0x20
[ 100.283293] ? __kmem_cache_free+0x153/0x2f0
[ 100.283844] ? sched_mm_cid_remote_clear+0xb1/0x250
[ 100.284550] ? __pfx_sched_mm_cid_remote_clear+0x10/0x10
[ 100.285221] ? event_trigger_write+0xbc/0x100
[ 100.285781] ? __kasan_check_read+0x15/0x20
[ 100.286321] ? __bitmap_weight+0x66/0xa0
[ 100.286833] ? _find_next_bit+0x46/0xe0
[ 100.287334] ? task_mm_cid_work+0x37f/0x450
[ 100.287872] event_triggers_call+0x84/0x150
[ 100.288408] trace_event_buffer_commit+0x339/0x430
[ 100.289073] ? ring_buffer_event_data+0x3f/0x60
[ 100.292189] trace_event_raw_event_sys_enter+0x8b/0xe0
[ 100.295434] syscall_trace_enter.constprop.0+0x18f/0x1b0
[ 100.298653] syscall_enter_from_user_mode+0x32/0x40
[ 100.301808] do_syscall_64+0x1a/0x90
[ 100.304748] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[ 100.307775] RIP: 0033:0x7f686c75c1cb
[ 100.310617] Code: 73 01 c3 48 8b 0d 65 3c 10 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 21 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 35 3c 10 00 f7 d8 64 89 01 48
[ 100.317847] RSP: 002b:00007ffc60137a38 EFLAGS: 00000246 ORIG_RAX: 0000000000000021
[ 100.321200] RAX: ffffffffffffffda RBX: 000055f566469ea0 RCX: 00007f686c75c1cb
[ 100.324631] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 000000000000000a
[ 100.328104] RBP: 00007ffc60137ac0 R08: 00007f686c818460 R09: 000000000000000a
[ 100.331509] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009
[ 100.334992] R13: 0000000000000007 R14: 000000000000000a R15: 0000000000000007
[ 100.338381] </TASK>
We hit the bug because when second hist trigger has was created
has_hist_vars() returned false because hist trigger did not have
variables. As a result of that save_hist_vars() was not called to add
the trigger to trace_array->hist_vars. Later on when we attempted to
remove the first histogram find_any_var_ref() failed to detect it is
being used because it did not find the second trigger in hist_vars list.
With this change we wait until trigger actions are created so we can take
into consideration if hist trigger has variable references. Also, now we
check the return value of save_hist_vars() and fail trigger creation if
save_hist_vars() fails.
Link: https://lore.kernel.org/linux-trace-kernel/20230712223021.636335-1-mkhalfella@purestorage.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Steven Rostedt (Google)
|
bec3c25c24 |
tracing: Stop FORTIFY_SOURCE complaining about stack trace caller
The stack_trace event is an event created by the tracing subsystem to store stack traces. It originally just contained a hard coded array of 8 words to hold the stack, and a "size" to know how many entries are there. This is exported to user space as: name: kernel_stack ID: 4 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int size; offset:8; size:4; signed:1; field:unsigned long caller[8]; offset:16; size:64; signed:0; print fmt: "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n\t=> %ps\n" "\t=> %ps\n\t=> %ps\n",i (void *)REC->caller[0], (void *)REC->caller[1], (void *)REC->caller[2], (void *)REC->caller[3], (void *)REC->caller[4], (void *)REC->caller[5], (void *)REC->caller[6], (void *)REC->caller[7] Where the user space tracers could parse the stack. The library was updated for this specific event to only look at the size, and not the array. But some older users still look at the array (note, the older code still checks to make sure the array fits inside the event that it read. That is, if only 4 words were saved, the parser would not read the fifth word because it will see that it was outside of the event size). This event was changed a while ago to be more dynamic, and would save a full stack even if it was greater than 8 words. It does this by simply allocating more ring buffer to hold the extra words. Then it copies in the stack via: memcpy(&entry->caller, fstack->calls, size); As the entry is struct stack_entry, that is created by a macro to both create the structure and export this to user space, it still had the caller field of entry defined as: unsigned long caller[8]. When the stack is greater than 8, the FORTIFY_SOURCE code notices that the amount being copied is greater than the source array and complains about it. It has no idea that the source is pointing to the ring buffer with the required allocation. To hide this from the FORTIFY_SOURCE logic, pointer arithmetic is used: ptr = ring_buffer_event_data(event); entry = ptr; ptr += offsetof(typeof(*entry), caller); memcpy(ptr, fstack->calls, size); Link: https://lore.kernel.org/all/20230612160748.4082850-1-svens@linux.ibm.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230712105235.5fc441aa@gandalf.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reported-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Zheng Yejian
|
26efd79c46 |
ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
As comments in ftrace_process_locs(), there may be NULL pointers in mcount_loc section: > Some architecture linkers will pad between > the different mcount_loc sections of different > object files to satisfy alignments. > Skip any NULL pointers. After commit |
||
Linus Torvalds
|
9a3236ce48 |
Probes fixes and clean ups for v6.5-rc1:
- Fix fprobe's rethook release timing issue(1). Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Fix fprobe's rethook access timing issue(2). Stop rethook before ftrace_ops is unregistered so that the rethook is NOT keep using after exiting the unregister_fprobe(). - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. -----BEGIN PGP SIGNATURE----- iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmStawEbHG1hc2FtaS5o aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bMBkIAJYun4zeXsFeUUNVZMP8 UlcyBt/uiB1Ch/t1T1wc55plWIDAvUfN+FEltwhb6MJsQgWEjKJxNcH+oquQeqSH OkUvV6a8BR73FWbCt1Tm2MQKEG1RHC1R4JCj5GCzP93rQSCnmvr0c1yb6+JKQRx4 aPgWUjDm9vhYlOXS6heHo0hf0MbXDl1kqHjvMU2MXUMk7NtQ2JEo1Whikf7Drl6D ufNqV54GmtuJhgIWAqSk+qWresRvXy0/5i4ONK1Kmq+pdhssjZ4KFsWFQTIkQzdU nP8t2EfOp3aglx7ANvEJ+COfFNi0aMnHVBo7BbzsOy7Cq7IZTfD6zOTYSfylGGdA Uw4= =RBbc -----END PGP SIGNATURE----- Merge tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Fix fprobe's rethook release issues: - Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Stop rethook before ftrace_ops is unregistered so that the rethook is NOT used after exiting unregister_fprobe() - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. * tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() kernel: kprobes: Remove unnecessary ‘0’ values kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock kernel/trace: Fix cleanup logic of enable_trace_eprobe fprobe: Release rethook after the ftrace_ops is unregistered |
||
Zheng Yejian
|
7e42907f3a |
ring-buffer: Fix deadloop issue on reading trace_pipe
Soft lockup occurs when reading file 'trace_pipe':
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [cat:4488]
[...]
RIP: 0010:ring_buffer_empty_cpu+0xed/0x170
RSP: 0018:ffff88810dd6fc48 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000246 RCX: ffffffff93d1aaeb
RDX: ffff88810a280040 RSI: 0000000000000008 RDI: ffff88811164b218
RBP: ffff88811164b218 R08: 0000000000000000 R09: ffff88815156600f
R10: ffffed102a2acc01 R11: 0000000000000001 R12: 0000000051651901
R13: 0000000000000000 R14: ffff888115e49500 R15: 0000000000000000
[...]
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8d853c2000 CR3: 000000010dcd8000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
__find_next_entry+0x1a8/0x4b0
? peek_next_entry+0x250/0x250
? down_write+0xa5/0x120
? down_write_killable+0x130/0x130
trace_find_next_entry_inc+0x3b/0x1d0
tracing_read_pipe+0x423/0xae0
? tracing_splice_read_pipe+0xcb0/0xcb0
vfs_read+0x16b/0x490
ksys_read+0x105/0x210
? __ia32_sys_pwrite64+0x200/0x200
? switch_fpu_return+0x108/0x220
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Through the vmcore, I found it's because in tracing_read_pipe(),
ring_buffer_empty_cpu() found some buffer is not empty but then it
cannot read anything due to "rb_num_of_entries() == 0" always true,
Then it infinitely loop the procedure due to user buffer not been
filled, see following code path:
tracing_read_pipe() {
... ...
waitagain:
tracing_wait_pipe() // 1. find non-empty buffer here
trace_find_next_entry_inc() // 2. loop here try to find an entry
__find_next_entry()
ring_buffer_empty_cpu(); // 3. find non-empty buffer
peek_next_entry() // 4. but peek always return NULL
ring_buffer_peek()
rb_buffer_peek()
rb_get_reader_page()
// 5. because rb_num_of_entries() == 0 always true here
// then return NULL
// 6. user buffer not been filled so goto 'waitgain'
// and eventually leads to an deadloop in kernel!!!
}
By some analyzing, I found that when resetting ringbuffer, the 'entries'
of its pages are not all cleared (see rb_reset_cpu()). Then when reducing
the ringbuffer, and if some reduced pages exist dirty 'entries' data, they
will be added into 'cpu_buffer->overrun' (see rb_remove_pages()), which
cause wrong 'overrun' count and eventually cause the deadloop issue.
To fix it, we need to clear every pages in rb_reset_cpu().
Link: https://lore.kernel.org/linux-trace-kernel/20230708225144.3785600-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Arnd Bergmann
|
7d8b31b73c |
tracing: arm64: Avoid missing-prototype warnings
These are all tracing W=1 warnings in arm64 allmodconfig about missing prototypes: kernel/trace/trace_kprobe_selftest.c:7:5: error: no previous prototype for 'kprobe_trace_selftest_target' [-Werror=missing-pro totypes] kernel/trace/ftrace.c:329:5: error: no previous prototype for '__register_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:372:5: error: no previous prototype for '__unregister_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:4130:15: error: no previous prototype for 'arch_ftrace_match_adjust' [-Werror=missing-prototypes] kernel/trace/fgraph.c:243:15: error: no previous prototype for 'ftrace_return_to_handler' [-Werror=missing-prototypes] kernel/trace/fgraph.c:358:6: error: no previous prototype for 'ftrace_graph_sleep_time_control' [-Werror=missing-prototypes] arch/arm64/kernel/ftrace.c:460:6: error: no previous prototype for 'prepare_ftrace_return' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2172:5: error: no previous prototype for 'syscall_trace_enter' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2195:6: error: no previous prototype for 'syscall_trace_exit' [-Werror=missing-prototypes] Move the declarations to an appropriate header where they can be seen by the caller and callee, and make sure the headers are included where needed. Link: https://lore.kernel.org/linux-trace-kernel/20230517125215.930689-1-arnd@kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Florent Revest <revest@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Catalin Marinas <catalin.marinas@arm.com> [ Fixed ftrace_return_to_handler() to handle CONFIG_HAVE_FUNCTION_GRAPH_RETVAL case ] Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Pu Lehui
|
4369016497 |
bpf: cpumap: Fix memory leak in cpu_map_update_elem
Syzkaller reported a memory leak as follows:
BUG: memory leak
unreferenced object 0xff110001198ef748 (size 192):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 32 bytes):
00 00 00 00 4a 19 00 00 80 ad e3 e4 fe ff c0 00 ....J...........
00 b2 d3 0c 01 00 11 ff 28 f5 8e 19 01 00 11 ff ........(.......
backtrace:
[<ffffffffadd28087>] __cpu_map_entry_alloc+0xf7/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak
unreferenced object 0xff110001198ef528 (size 192):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffffadd281f0>] __cpu_map_entry_alloc+0x260/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
BUG: memory leak
unreferenced object 0xff1100010fd93d68 (size 8):
comm "syz-executor.3", pid 17672, jiffies 4298118891 (age 9.906s)
hex dump (first 8 bytes):
00 00 00 00 00 00 00 00 ........
backtrace:
[<ffffffffade5db3e>] kvmalloc_node+0x11e/0x170
[<ffffffffadd28280>] __cpu_map_entry_alloc+0x2f0/0xb00
[<ffffffffadd28d8e>] cpu_map_update_elem+0x2fe/0x3d0
[<ffffffffadc6d0fd>] bpf_map_update_value.isra.0+0x2bd/0x520
[<ffffffffadc7349b>] map_update_elem+0x4cb/0x720
[<ffffffffadc7d983>] __se_sys_bpf+0x8c3/0xb90
[<ffffffffb029cc80>] do_syscall_64+0x30/0x40
[<ffffffffb0400099>] entry_SYSCALL_64_after_hwframe+0x61/0xc6
In the cpu_map_update_elem flow, when kthread_stop is called before
calling the threadfn of rcpu->kthread, since the KTHREAD_SHOULD_STOP bit
of kthread has been set by kthread_stop, the threadfn of rcpu->kthread
will never be executed, and rcpu->refcnt will never be 0, which will
lead to the allocated rcpu, rcpu->queue and rcpu->queue->queue cannot be
released.
Calling kthread_stop before executing kthread's threadfn will return
-EINTR. We can complete the release of memory resources in this state.
Fixes:
|
||
Beau Belgrave
|
d0a3022f30 |
tracing/user_events: Fix struct arg size match check
When users register an event the name of the event and it's argument are
checked to ensure they match if the event already exists. Normally all
arguments are in the form of "type name", except for when the type
starts with "struct ". In those cases, the size of the struct is passed
in addition to the name, IE: "struct my_struct a 20" for an argument
that is of type "struct my_struct" with a field name of "a" and has the
size of 20 bytes.
The current code does not honor the above case properly when comparing
a match. This causes the event register to fail even when the same
string was used for events that contain a struct argument within them.
The example above "struct my_struct a 20" generates a match string of
"struct my_struct a" omitting the size field.
Add the struct size of the existing field when generating a comparison
string for a struct field to ensure proper match checking.
Link: https://lkml.kernel.org/r/20230629235049.581-2-beaub@linux.microsoft.com
Cc: stable@vger.kernel.org
Fixes:
|
||
Masami Hiramatsu (Google)
|
195b9cb5b2 |
fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free()
Ensure running fprobe_exit_handler() has finished before calling rethook_free() in the unregister_fprobe() so that caller can free the fprobe right after unregister_fprobe(). unregister_fprobe() ensured that all running fprobe_entry/exit_handler() have finished by calling unregister_ftrace_function() which synchronizes RCU. But commit |
||
Li zeming
|
ed9492dfef |
kernel: kprobes: Remove unnecessary ‘0’ values
it is assigned first, so it does not need to initialize the assignment. Link: https://lore.kernel.org/all/20230711185353.3218-1-zeming@nfschina.com/ Signed-off-by: Li zeming <zeming@nfschina.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> |
||
Li zeming
|
e1164787f2 |
kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr
The 'correct_ret_addr' pointer is always set in the later code, no need to initialize it at definition time. Link: https://lore.kernel.org/all/20230704194359.3124-1-zeming@nfschina.com/ Signed-off-by: Li zeming <zeming@nfschina.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> |
||
Ze Gao
|
5f0c584daf |
fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock
Unlock ftrace recursion lock when fprobe_kprobe_handler() is failed
because of some running kprobe.
Link: https://lore.kernel.org/all/20230703092336.268371-1-zegao@tencent.com/
Fixes:
|
||
Tzvetomir Stoyanov (VMware)
|
cf0a624dc7 |
kernel/trace: Fix cleanup logic of enable_trace_eprobe
The enable_trace_eprobe() function enables all event probes, attached
to given trace probe. If an error occurs in enabling one of the event
probes, all others should be roll backed. There is a bug in that roll
back logic - instead of all event probes, only the failed one is
disabled.
Link: https://lore.kernel.org/all/20230703042853.1427493-1-tz.stoyanov@gmail.com/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes:
|
||
Linus Torvalds
|
f71f64210d |
dma-mapping fixes for Linux 6.5
- swiotlb area sizing fixes (Petr Tesarik) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmSq57sLHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOh0xAAkwklIxxzXvNlgjvy2hdgPWImPS8tGPDSIsqA9TDD WDZq89nz/ndnchdPObDvyJXmfBgqa0qCHqopBVPqMKv5a1pKZhrRXYlbajGQQwji MIqefTLZ/VGw7bDpEivOt+yadwQ3KxuxWWs7/JPKLLReSJ22H8P+jkrK7P7kBL5X YaMtZG9d86fvFHnQHKdAOlF1iCvnoZDHPcvaZbI6m5mMSZ+HIYqK5pP1MTUEAbIU MX4ZSI7/mL0q67+kZuM/NCSeq1pH0Cd0D2DGm+8k/y87G81GS6E5Wgg+xO7vYiXf 5ChuwlAO9K9KhH7NIRkKhkad/Ii89ENXSyv5gmPRoKYK5FXajnHSlJTUrZojV6XC Pbsd9ATVzV0rY61EPyh6G1a+Ttp/pwMp+W0I2fi032GVAePQ/vhB9x9O+2O/3QiC v80nUSatkciZncWqkguhp3NONsRmLKep3CCQnEAA/gLs27B0avdQeslnqbOOUQKd Si+djIoel8ONjQ+mW8eFCsVYMH1xFSo0aGsgGe0y2cyBE3DN1AW9eRnOXWT4C1PR UyDlx8ACs87ojec+YRQFYk2/PbsU7CQiH1pteXvBHcbhiVUAvrtXtg6ANQ+7066P IIduAZmlHcHb1BhyrSQbAtRllVLIp/l9IAkCSY9SvL0tjh/B5CaRBD5m2Taow5I/ KUI= =4Lfc -----END PGP SIGNATURE----- Merge tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fixes from Christoph Hellwig: - swiotlb area sizing fixes (Petr Tesarik) * tag 'dma-mapping-6.5-2023-07-09' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: reduce the number of areas to match actual memory pool size swiotlb: always set the number of areas before allocating the pool |
||
Linus Torvalds
|
a9943ad3dd |
- Optimize IRQ domain's name assignment
-----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSqa70ACgkQEsHwGGHe VUrTlhAAtfvA+mCeWVwn42827YlJeMvFV6RedERSk8x5CO/IJgHx2YFB0UXlznuM xQSIC7pkY2He62ggb+s6m+Pb1M3bxxJDaRSRnQGHOqXuSNuzokqjwvVMaSBE9s5W fgvdbZ4CDEynW7awiDq1PaMcJFzbc6RMbBDspNDke2ikZgGvqqWl3Gx2DHhGyNfQ UFJHETejxhfheU7Y9c06P2ToTyzc0RcaPFQfFmYIse+c6iTcjnqr9NRHCIklSygN F3VjSEXC1MkAZIAuV9pjkl3z01xfvDe1WqxH3jID/vM97AY08DqmBQC31VZOBmmV shndEUpc2yi7ued0WU1XLf1sMyEHm5rgF/19gRhFR9sxu4Zm9+JxOofo0URwqtsV Z3d0CxpDrF0ATJs2zTstUmOG0AD6SBfC+BQZi3M6rohXgR0z04dBnyTgCnxAs60u d5byFza5nZ/sX7KZXgVlKfH5ej0AcZuyFM/qMKqiTqzz5C4dv0Hq7y718aaecxmz W9A3olDGhC48pr9250TthEMhBzRdXvKgKQBUhC2TY/0hUBHyvmHeU//ql54XYhgh XVx3qI7yieZC3d27hZN0WTCG6qSCJoq8LROmGDIbx16Uj/cm1AHzJTJ+5CbCOafZ 41gPBoNCNYNXlgYNBXpVcPRtfcK98fGl3jpPOk0llaSa6IW+Lcc= =ATsc -----END PGP SIGNATURE----- Merge tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq update from Borislav Petkov: - Optimize IRQ domain's name assignment * tag 'irq_urgent_for_v6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqdomain: Use return value of strreplace() |
||
Suren Baghdasaryan
|
fb49c45532 |
fork: lock VMAs of the parent process when forking
When forking a child process, the parent write-protects anonymous pages
and COW-shares them with the child being forked using copy_present_pte().
We must not take any concurrent page faults on the source vma's as they
are being processed, as we expect both the vma and the pte's behind it
to be stable. For example, the anon_vma_fork() expects the parents
vma->anon_vma to not change during the vma copy.
A concurrent page fault on a page newly marked read-only by the page
copy might trigger wp_page_copy() and a anon_vma_prepare(vma) on the
source vma, defeating the anon_vma_clone() that wasn't done because the
parent vma originally didn't have an anon_vma, but we now might end up
copying a pte entry for a page that has one.
Before the per-vma lock based changes, the mmap_lock guaranteed
exclusion with concurrent page faults. But now we need to do a
vma_start_write() to make sure no concurrent faults happen on this vma
while it is being processed.
This fix can potentially regress some fork-heavy workloads. Kernel
build time did not show noticeable regression on a 56-core machine while
a stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.
Suggested-by: David Hildenbrand <david@redhat.com>
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young <jacobly.alt@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes:
|
||
Linus Torvalds
|
8066178f53 |
Tracing fixes for 6.5:
- Fix bad git merge of #endif in arm64 code A merge of the arm64 tree caused #endif to go into the wrong place - Fix crash on lseek of write access to tracefs/error_log Opening error_log as write only, and then doing an lseek() causes a kernel panic, because the lseek() handle expects a "seq_file" to exist (which is not done on write only opens). Use tracing_lseek() that tests for this instead of calling the default seq lseek handler. - Check for negative instead of -E2BIG for error on strscpy() returns Instead of testing for -E2BIG from strscpy(), to be more robust, check for less than zero, which will make sure it catches any error that strscpy() may someday return. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZKbQUhQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qpWaAP9zQ1eLQSfMt0dHH01OBSJvc2mMd4QJ VZtWZ+xTSvk+4gD/axDzDS7Qisfrrli+1oQSPwVik2SXiz0SPJqJ25m9zw4= =xMlg -----END PGP SIGNATURE----- Merge tag 'trace-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix bad git merge of #endif in arm64 code A merge of the arm64 tree caused #endif to go into the wrong place - Fix crash on lseek of write access to tracefs/error_log Opening error_log as write only, and then doing an lseek() causes a kernel panic, because the lseek() handle expects a "seq_file" to exist (which is not done on write only opens). Use tracing_lseek() that tests for this instead of calling the default seq lseek handler. - Check for negative instead of -E2BIG for error on strscpy() returns Instead of testing for -E2BIG from strscpy(), to be more robust, check for less than zero, which will make sure it catches any error that strscpy() may someday return. * tag 'trace-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/boot: Test strscpy() against less than zero for error arm64: ftrace: fix build error with CONFIG_FUNCTION_GRAPH_TRACER=n tracing: Fix null pointer dereference in tracing_err_log_open() |
||
Linus Torvalds
|
7b82e90411 |
asm-generic updates for 6.5
These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmSl138ACgkQYKtH/8kJ UieqWxAA2WjNVfyuieYckglOVE0PZPs2fzCwyzTY5iUTH3gE5cBFWJDWcg2EnouG v3X3htEQcowYWaCF9+rypQXaGiSx4WXi2Bjxnz3D/BcreqWPI4eSQ0fpGG5SURTY 2zYF72GTt4JGR++l+7/R9MZwPbwYDT9BsD5tkel8PxnyVLM6/c5xFvbjzRSKFE8x SMN1jGZ62ITLNf/8coAOEPNxBYtDT6yQyu7P2sx5cd65LAQq9yLKjFklnBBovgWT OoCIZAdGkhcNwOh1LjyHcdNdpfNJGceKyqKPqty07IhCQuF2jxiyFYFzuBbeyQfE S0itN8o/MIfUmxaQl3e8dPAVb1RlNVr1zfQ6y4tUtWNdkNL2WwSnSQSRHrBfHxCQ QCF++PMeFcLhGwMYtqdNJ7XGLQ0PsjD74pRf0vo+vjmqDk2BJsJBP57VU+8MJn5r SoxqnJ0WxLvm1TfrNKusV7zMNWquc2duJDW40zsOssP4itjYELSI6qa56qmzlqmX zKmRx6mxAlx9RRK8FHXFYHbz3p93vv8z9vTOZV3AjIjjED960CLknUAwCC8FoJyz 9b5wyMXsLQHQjGt8luAvPc6OiU0EiU9a4SPK+feWcv27serFvnjJlRTS/yG2Z3zd BYsUgsXHypsdoud+aE7MeCy7fE8n3mhoyMQQRBkOMFJ7RsG6wAE= =S/he -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic updates from Arnd Bergmann: "These are cleanups for architecture specific header files: - the comments in include/linux/syscalls.h have gone out of sync and are really pointless, so these get removed - The asm/bitsperlong.h header no longer needs to be architecture specific on modern compilers, so use a generic version for newer architectures that use new enough userspace compilers - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking, forcing the use of pointers" * tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: syscalls: Remove file path comments from headers tools arch: Remove uapi bitsperlong.h of hexagon and microblaze asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch m68k/mm: Make pfn accessors static inlines arm64: memory: Make virt_to_pfn() a static inline ARM: mm: Make virt_to_pfn() a static inline asm-generic/page.h: Make pfn accessors static inlines xen/netback: Pass (void *) to virt_to_page() netfs: Pass a pointer to virt_to_page() cifs: Pass a pointer to virt_to_page() in cifsglob cifs: Pass a pointer to virt_to_page() riscv: mm: init: Pass a pointer to virt_to_page() ARC: init: Pass a pointer to virt_to_pfn() in init m68k: Pass a pointer to virt_to_pfn() virt_to_page() fs/proc/kcore.c: Pass a pointer to virt_addr_valid() |
||
Kumar Kartikeya Dwivedi
|
5415ccd50a |
bpf: Fix max stack depth check for async callbacks
The check_max_stack_depth pass happens after the verifier's symbolic
execution, and attempts to walk the call graph of the BPF program,
ensuring that the stack usage stays within bounds for all possible call
chains. There are two cases to consider: bpf_pseudo_func and
bpf_pseudo_call. In the former case, the callback pointer is loaded into
a register, and is assumed that it is passed to some helper later which
calls it (however there is no way to be sure), but the check remains
conservative and accounts the stack usage anyway. For this particular
case, asynchronous callbacks are skipped as they execute asynchronously
when their corresponding event fires.
The case of bpf_pseudo_call is simpler and we know that the call is
definitely made, hence the stack depth of the subprog is accounted for.
However, the current check still skips an asynchronous callback even if
a bpf_pseudo_call was made for it. This is erroneous, as it will miss
accounting for the stack usage of the asynchronous callback, which can
be used to breach the maximum stack depth limit.
Fix this by only skipping asynchronous callbacks when the instruction is
not a pseudo call to the subprog.
Fixes:
|
||
Linus Torvalds
|
6843306689 |
Including fixes from bluetooth, bpf and wireguard.
Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmSlu+MACgkQMUZtbf5S Irslgw//S7jf/GL8V6y8VL3te+/OPOZnLDTzFFOdy64/y97FE6XIacJUpyWRhtmz oSzcSNHETPW9U+xSGa2ZQlKhAXt6n9iRNvUegql+VBb13Iz+l7AdTeoxRv/YuwDo 5lTOIB6cBw+ATd0oxS6wr8SyUlcvktUKBfTAItjbVM55aXfIUpXIa84+F7avJgIA XP1u/3PHhwItmwo/hXhHH0+P0QA8ix1q2SvRB7DAlQLBsTuQhaKjXWQkYYTKw/Nt dtvh8iQSs/YXaHMjTa5CK28HOD8+ywIizr+uJ9VaNqIzV0W5JE9IE8P4NFpBcY7t kGjTYODOph7dkNmZ5RLj3N+B6CyC57OXDzoo/tr8940UytCLVj9EVyduarLGLx57 edqK9cUz5kWejyGoyZ4Pvlo/SKvCQ2HKMeiAJ0/nNpTJMFuygMoqGsaD6ttzkXMj fZLPjRUK3axd+15ZzhLEf8HyL5Qh+qPqqX9p7NljfMKwhxMWJ5fuICJfdGOSdMJR ndL+wPfRPFQwszZ4pbTY2Ivn29mo8ScBOSOEgQs2mOny+zFzTzmqNWz/jcFfQnjS cylxBEHrgudT2FuCImZ/v66TM5yakHXqIdpTGG+zsvJWQqjM96Z3I7WRvi0g9d75 n84il+j34mnzl90j2xEutqUiK7BQ9ZpZBsutPVTKBIHKWWiortI= =9yzk -----END PGP SIGNATURE----- Merge tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth, bpf and wireguard. Current release - regressions: - nvme-tcp: fix comma-related oops after sendpage changes Current release - new code bugs: - ptp: make max_phase_adjustment sysfs device attribute invisible when not supported Previous releases - regressions: - sctp: fix potential deadlock on &net->sctp.addr_wq_lock - mptcp: - ensure subflow is unhashed before cleaning the backlog - do not rely on implicit state check in mptcp_listen() Previous releases - always broken: - net: fix net_dev_start_xmit trace event vs skb_transport_offset() - Bluetooth: - fix use-bdaddr-property quirk - L2CAP: fix multiple UaFs - ISO: use hci_sync for setting CIG parameters - hci_event: fix Set CIG Parameters error status handling - hci_event: fix parsing of CIS Established Event - MGMT: fix marking SCAN_RSP as not connectable - wireguard: queuing: use saner cpu selection wrapping - sched: act_ipt: various bug fixes for iptables <> TC interactions - sched: act_pedit: add size check for TCA_PEDIT_PARMS_EX - dsa: fixes for receiving PTP packets with 8021q and sja1105 tagging - eth: sfc: fix null-deref in devlink port without MAE access - eth: ibmvnic: do not reset dql stats on NON_FATAL err Misc: - xsk: honor SO_BINDTODEVICE on bind" * tag 'net-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) nfp: clean mc addresses in application firmware when closing port selftests: mptcp: pm_nl_ctl: fix 32-bit support selftests: mptcp: depend on SYN_COOKIES selftests: mptcp: userspace_pm: report errors with 'remove' tests selftests: mptcp: userspace_pm: use correct server port selftests: mptcp: sockopt: return error if wrong mark selftests: mptcp: sockopt: use 'iptables-legacy' if available selftests: mptcp: connect: fail if nft supposed to work mptcp: do not rely on implicit state check in mptcp_listen() mptcp: ensure subflow is unhashed before cleaning the backlog s390/qeth: Fix vipa deletion octeontx-af: fix hardware timestamp configuration net: dsa: sja1105: always enable the send_meta options net: dsa: tag_sja1105: fix MAC DA patching from meta frames net: Replace strlcpy with strscpy pptp: Fix fib lookup calls. mlxsw: spectrum_router: Fix an IS_ERR() vs NULL check net/sched: act_pedit: Add size check for TCA_PEDIT_PARMS_EX xsk: Honor SO_BINDTODEVICE on bind ptp: Make max_phase_adjustment sysfs device attribute invisible when not supported ... |
||
Steven Rostedt (Google)
|
fddca7db4a |
tracing/boot: Test strscpy() against less than zero for error
Instead of checking for -E2BIG, it is better to just check for less than zero of strscpy() for error. Testing for -E2BIG is not very robust, and the calling code does not really care about the error code, just that there was an error. One of the updates to convert strlcpy() to strscpy() had a v2 version that changed the test from testing against -E2BIG to less than zero, but I took the v1 version that still tested for -E2BIG. Link: https://lore.kernel.org/linux-trace-kernel/20230615180420.400769-1-azeemshaikh38@gmail.com/ Link: https://lore.kernel.org/linux-trace-kernel/20230704100807.707d1605@rorschach.local.home Cc: Mark Rutland <mark.rutland@arm.com> Cc: Azeem Shaikh <azeemshaikh38@gmail.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> |
||
Mateusz Stachyra
|
02b0095e2f |
tracing: Fix null pointer dereference in tracing_err_log_open()
Fix an issue in function 'tracing_err_log_open'.
The function doesn't call 'seq_open' if the file is opened only with
write permissions, which results in 'file->private_data' being left as null.
If we then use 'lseek' on that opened file, 'seq_lseek' dereferences
'file->private_data' in 'mutex_lock(&m->lock)', resulting in a kernel panic.
Writing to this node requires root privileges, therefore this bug
has very little security impact.
Tracefs node: /sys/kernel/tracing/error_log
Example Kernel panic:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
Call trace:
mutex_lock+0x30/0x110
seq_lseek+0x34/0xb8
__arm64_sys_lseek+0x6c/0xb8
invoke_syscall+0x58/0x13c
el0_svc_common+0xc4/0x10c
do_el0_svc+0x24/0x98
el0_svc+0x24/0x88
el0t_64_sync_handler+0x84/0xe4
el0t_64_sync+0x1b4/0x1b8
Code: d503201f aa0803e0 aa1f03e1 aa0103e9 (c8e97d02)
---[ end trace 561d1b49c12cf8a5 ]---
Kernel panic - not syncing: Oops: Fatal exception
Link: https://lore.kernel.org/linux-trace-kernel/20230703155237eucms1p4dfb6a19caa14c79eb6c823d127b39024@eucms1p4
Link: https://lore.kernel.org/linux-trace-kernel/20230704102706eucms1p30d7ecdcc287f46ad67679fc8491b2e0f@eucms1p3
Cc: stable@vger.kernel.org
Fixes:
|
||
Linus Torvalds
|
f196220715 |
module: fix init_module_from_file() error handling
Vegard Nossum pointed out two different problems with the error handling
in init_module_from_file():
(a) the idempotent loading code didn't clean up properly in some error
cases, leaving the on-stack 'struct idempotent' element still in
the hash table
(b) failure to read the module file would nonsensically update the
'invalid_kread_bytes' stat counter with the error value
The first error is quite nasty, in that it can then cause subsequent
idempotent loads of that same file to access stale stack contents of the
previous failure. The case may not happen in any normal situation
(explaining all the "Tested-by's on the original change), and requires
admin privileges, but syzkaller triggers random bad behavior as a
result:
BUG: soft lockup in sys_finit_module
BUG: unable to handle kernel paging request in init_module_from_file
general protection fault in init_module_from_file
INFO: task hung in init_module_from_file
KASAN: out-of-bounds Read in init_module_from_file
KASAN: slab-out-of-bounds Read in init_module_from_file
...
The second error is fairly benign and just leads to nonsensical stats
(and has been around since the debug stats were added).
Vegard also provided a patch for the idempotent loading issue, but I'd
rather re-organize the code and make it more legible using another level
of helper functions than add the usual "goto out" error handling.
Link: https://lore.kernel.org/lkml/20230704100852.23452-1-vegard.nossum@oracle.com/
Fixes:
|
||
Linus Torvalds
|
eded37770c |
kgdb patches for 6.5
Fairly small changes this cycle. * An additional static inline function when kgdb is not enabled to reduce boilerplate in arch files. * kdb will now handle input with linefeeds more like carriage return. This will make little difference for interactive use but can make it script to use expect-like interaction with kdb. * A couple of warning fixes. Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEELzVBU1D3lWq6cKzwfOMlXTn3iKEFAmSiiJwACgkQfOMlXTn3 iKFAyQ//U+/08u01ikY0o8h//l7ELX4L7SUEJim1xE9Qk6DVF6R5c8+G6E0kWS5M ClKBf8J3YdjM2UvysNmVmh3/Sv1JP2GLT7YUoKNBYv+O8kveGy2N/vE6cYfoHK3n 1uOXtIxQ3IafjlgNF+FTTIYm2PDqDhYZTTYIKGlEDnwL3mjEP3iV+hNP2debcLvT 3AWeLcrvzni5KQ301kXqw4SmrgNAeLDKwKKQvDnho8EdAuMDmHTbmUt6Zgvvl6TN rH7nlxcUoDW+M1nMNxQ4Ko7xgEbqhpNQnkOIDv/j4j+KObQqvAVOF4dj9thBZUje IEPQD0O/rAeYDpnKTIx1bmaVkYIBIpiSokRKlHWpz3iHxU52M+cOXe6u2vMHj97t dHTc42TwcBQHSxCQ+R9i1pv6G7i2BfwUSju7opVqKISolwphp9Z4oIZnHwLtWpba P5Tv7Azv6nw1HoYl2tAP/2xNO2u7w1nfqUzfr+/XXFTcZGRt65084HDbRQSzjmiJ 6F5Qw7fwOTNUHEObybjL/A4AilIRK8sBl/vU3TXJUEf3CEryH3VK4lE7qlqKn75G nPplXMiBePbSTFE4aIJDdQFA4l3AOffg4w2Ce8v06eytr9EaNh8vS9W8Pv9dxwAz ZQrItOPO4i+1Sm9n/CvsPkUDVt3wYbb/QnPdQHsbGOoDN9H3JeI= =wAuD -----END PGP SIGNATURE----- Merge tag 'kgdb-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux Pull kgdb updates from Daniel Thompson: "Fairly small changes this cycle: - An additional static inline function when kgdb is not enabled to reduce boilerplate in arch files - kdb will now handle input with linefeeds more like carriage return. This will make little difference for interactive use but can make it script to use expect-like interaction with kdb - A couple of warning fixes" * tag 'kgdb-6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux: kdb: move kdb_send_sig() declaration to a better header file kdb: Handle LF in the command parser kdb: include kdb_private.h for function prototypes kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB |
||
SeongJae Park
|
3de4d22cc9 |
bpf, btf: Warn but return no error for NULL btf from __register_btf_kfunc_id_set()
__register_btf_kfunc_id_set() assumes .BTF to be part of the module's .ko
file if CONFIG_DEBUG_INFO_BTF is enabled. If that's not the case, the
function prints an error message and return an error. As a result, such
modules cannot be loaded.
However, the section could be stripped out during a build process. It would
be better to let the modules loaded, because their basic functionalities
have no problem [0], though the BTF functionalities will not be supported.
Make the function to lower the level of the message from error to warn, and
return no error.
[0] https://lore.kernel.org/bpf/20220219082037.ow2kbq5brktf4f2u@apollo.legion
Fixes:
|
||
Daniel Thompson
|
b6464883f4 |
kdb: move kdb_send_sig() declaration to a better header file
kdb_send_sig() is defined in the signal code and called from kdb, but the declaration is part of the kdb internal code. Move the declaration to the shared header to avoid the warning: kernel/signal.c:4789:6: error: no previous prototype for 'kdb_send_sig' [-Werror=missing-prototypes] Reported-by: Arnd Bergmann <arnd@arndb.de> Closes: https://lore.kernel.org/lkml/20230517125423.930967-1-arnd@kernel.org/ Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Link: https://lore.kernel.org/r/20230630201206.2396930-1-daniel.thompson@linaro.org |
||
Linus Torvalds
|
ad2885979e |
Kbuild updates for v6.5
- Remove the deprecated rule to build *.dtbo from *.dts - Refactor section mismatch detection in modpost - Fix bogus ARM section mismatch detections - Fix error of 'make gtags' with O= option - Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with the latest LLVM version - Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed - Ignore more compiler-generated symbols for kallsyms - Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles - Enable more kernel-doc warnings with W=2 - Refactor <linux/export.h> by generating KSYMTAB data by modpost - Deprecate <asm/export.h> and <asm-generic/export.h> - Remove the EXPORT_DATA_SYMBOL macro - Move the check for static EXPORT_SYMBOL back to modpost, which makes the build faster - Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm - Warn missing MODULE_DESCRIPTION when building modules with W=1 - Make 'make clean' robust against too long argument error - Exclude more objects from GCOV to fix CFI failures with GCOV - Allow 'make modules_install' to install modules.builtin and modules.builtin.modinfo even when CONFIG_MODULES is disabled - Include modules.builtin and modules.builtin.modinfo in the linux-image Debian package even when CONFIG_MODULES is disabled - Revive "Entering directory" logging for the latest Make version -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmSf6B0VHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGS2wP/1izzNJ/64XmQoyBDhZCbuOl7ODF n4wgVJnsJmRnD/RxXR/AZ0JZwQHhzpGISWQM61rVIf/RVFOB7Apx1HpmomKUUjrL Yc53wLfhTEizGgwttP6tusLM3RO6jkuMKhjC4rllc0tDLJ3zCcwAjSyiOQQ9PBcH txwAb8r4/TZUzDDCJ0d98WdhIsNDca/ISeRXKHMiIkfvHe+6yizDKu25Y4B6BL5g 0VPJ9nVJZ+XVwRqdVR+UQoPYGZzZ/O2NqAtU7n4PpBKvFfLACILJW+aBDAz9SqN7 RSxn1ahxwq0vrhlB9bSrQRj3N0g8zsi7/xShEZSnGLCbyxYilr5Gq8C59+QxOIJf 5lGBwZlEgn5aWH+D9abwjEI/QOQbTI9kX09sVzweulGCN9iJlJqyIGsB0Ri0/S2R c/n7c8nLwnWnGF/+LXYvkrak8L9YRKori//YYf9zdvh4h1c2/0SS0nDoC29DhDru Am7YmhBAkJXXX3NUB2gLvtdp94GSumqefHeSJ5Sp9v/+f2Ft7ruY2ouJC81xDa4p nNpvolAq2txlZ9t5OU7x7DQiuCWYSws0W7PJ9FBhyHJchf21UHbcm97/HfDoU8rN ioLQGm+h+g6oZt8pArk45wccjkR3ydpEFDWenYbTEr2o3zLfeKigZps5uhCK3DW2 gnVk50VNagkzrzvA =Rc1z -----END PGP SIGNATURE----- Merge tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the deprecated rule to build *.dtbo from *.dts - Refactor section mismatch detection in modpost - Fix bogus ARM section mismatch detections - Fix error of 'make gtags' with O= option - Add Clang's target triple to KBUILD_CPPFLAGS to fix a build error with the latest LLVM version - Rebuild the built-in initrd when KBUILD_BUILD_TIMESTAMP is changed - Ignore more compiler-generated symbols for kallsyms - Fix 'make local*config' to handle the ${CONFIG_FOO} form in Makefiles - Enable more kernel-doc warnings with W=2 - Refactor <linux/export.h> by generating KSYMTAB data by modpost - Deprecate <asm/export.h> and <asm-generic/export.h> - Remove the EXPORT_DATA_SYMBOL macro - Move the check for static EXPORT_SYMBOL back to modpost, which makes the build faster - Re-implement CONFIG_TRIM_UNUSED_KSYMS with one-pass algorithm - Warn missing MODULE_DESCRIPTION when building modules with W=1 - Make 'make clean' robust against too long argument error - Exclude more objects from GCOV to fix CFI failures with GCOV - Allow 'make modules_install' to install modules.builtin and modules.builtin.modinfo even when CONFIG_MODULES is disabled - Include modules.builtin and modules.builtin.modinfo in the linux-image Debian package even when CONFIG_MODULES is disabled - Revive "Entering directory" logging for the latest Make version * tag 'kbuild-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (72 commits) modpost: define more R_ARM_* for old distributions kbuild: revive "Entering directory" for Make >= 4.4.1 kbuild: set correct abs_srctree and abs_objtree for package builds scripts/mksysmap: Ignore prefixed KCFI symbols kbuild: deb-pkg: remove the CONFIG_MODULES check in buildeb kbuild: builddeb: always make modules_install, to install modules.builtin* modpost: continue even with unknown relocation type modpost: factor out Elf_Sym pointer calculation to section_rel() modpost: factor out inst location calculation to section_rel() kbuild: Disable GCOV for *.mod.o kbuild: Fix CFI failures with GCOV kbuild: make clean rule robust against too long argument error script: modpost: emit a warning when the description is missing kbuild: make modules_install copy modules.builtin(.modinfo) linux/export.h: rename 'sec' argument to 'license' modpost: show offset from symbol for section mismatch warnings modpost: merge two similar section mismatch warnings kbuild: implement CONFIG_TRIM_UNUSED_KSYMS without recursion modpost: use null string instead of NULL pointer for default namespace modpost: squash sym_update_namespace() into sym_add_exported() ... |
||
Linus Torvalds
|
d25f002575 |
cxl for v6.5
- Add infrastructure for supporting background commands along with support for device sanitization and firmware update - Introduce a CXL performance monitoring unit driver based on the common definition in the specification. - Land some preparatory cleanup and refactoring for the anticipated arrival of CXL type-2 (accelerator devices) and CXL RCH (CXL-v1.1 topology) error handling. - Rework CPU cache management with respect to region configuration (device hotplug or other dynamic changes to memory interleaving) - Fix region reconfiguration vs CXL decoder ordering rules. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZJ9fkQAKCRDfioYZHlFs ZyWcAP9THJ6ZzX1mbAfHhPz9r+oxsrE3l1jQpNjNbh7MNW29MAEA36dmTE62JaHK lTPDgHxqBt1vrHPktYWOM9ZPHE2tLwA= =3fFL -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull CXL updates from Dan Williams: "The highlights in terms of new functionality are support for the standard CXL Performance Monitor definition that appeared in CXL 3.0, support for device sanitization (wiping all data from a device), secure-erase (re-keying encryption of user data), and support for firmware update. The firmware update support is notable as it reuses the simple sysfs_upload interface to just cat(1) a blob to a sysfs file and pipe that to the device. Additionally there are a substantial number of cleanups and reorganizations to get ready for RCH error handling (RCH == Restricted CXL Host == current shipping hardware generation / pre CXL-2.0 topologies) and type-2 (accelerator / vendor specific) devices. For vendor specific devices they implement a subset of what the generic type-3 (generic memory expander) driver expects. As a result the rework decouples optional infrastructure from the core driver context. For RCH topologies, where the specification working group did not want to confuse pre-CXL-aware operating systems, many of the standard registers are hidden which makes support standard bus features like AER (PCIe Advanced Error Reporting) difficult. The rework arranges for the driver to help the PCI-AER core. Bjorn is on board with this direction but a late regression disocvery means the completion of this functionality needs to cook a bit longer, so it is code reorganizations only for now. Summary: - Add infrastructure for supporting background commands along with support for device sanitization and firmware update - Introduce a CXL performance monitoring unit driver based on the common definition in the specification. - Land some preparatory cleanup and refactoring for the anticipated arrival of CXL type-2 (accelerator devices) and CXL RCH (CXL-v1.1 topology) error handling. - Rework CPU cache management with respect to region configuration (device hotplug or other dynamic changes to memory interleaving) - Fix region reconfiguration vs CXL decoder ordering rules" * tag 'cxl-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (51 commits) cxl: Fix one kernel-doc comment cxl/pci: Use correct flag for sanitize polling docs: perf: Minimal introduction the the CXL PMU device and driver perf: CXL Performance Monitoring Unit driver tools/testing/cxl: add firmware update emulation to CXL memdevs tools/testing/cxl: Use named effects for the Command Effect Log tools/testing/cxl: Fix command effects for inject/clear poison cxl: add a firmware update mechanism using the sysfs firmware loader cxl/test: Add Secure Erase opcode support cxl/mem: Support Secure Erase cxl/test: Add Sanitize opcode support cxl/mem: Wire up Sanitization support cxl/mbox: Add sanitization handling machinery cxl/mem: Introduce security state sysfs file cxl/mbox: Allow for IRQ_NONE case in the isr Revert "cxl/port: Enable the HDM decoder capability for switch ports" cxl/memdev: Formalize endpoint port linkage cxl/pci: Unconditionally unmask 256B Flit errors cxl/region: Manage decoder target_type at decoder-attach time cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM ... |
||
Christian Brauner
|
dd546618ba |
pid: use struct_size_t() helper
Before commit |
||
Linus Torvalds
|
f4ce392b03 |
Livepatching changes for 6.5
-----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmSeigUACgkQUqAMR0iA lPIMbxAAqnbFKTDZ0alQibPlPulEvDsdVnh0FYLpWRW6sZDG96Lr/s3i1ghWxIUI cJrePc7i8FJBz6cDMvT6e9TOqoP3iYU9AuQdWr1aoN37O3gJO/3AvHN89CFy6tE3 scV/Uo2sM5RCQWptutpd2nu1pFLKYWAKegwX0g0+KkyDINDzDA54iSd6HAwfDUeu DEO2cp+kUgBKDOcGWEHQHQX5DUs1P7lJDBt5aTBq2KzAYSaUls/HBt1L2T9JYV/D yHE3vjmvniug3sYDzXd5aXA+yvIjIsdjqqxMpN7uz8e/q+Mx3045WM1aPCevPq9x V7jkc61CQcDURjUdHaZw8jMCX7kDjNkTzkt0OvATqqoL4wqfcohVExI5KUtf6mOq NAhsaPL4E+pGOa71uXenhbLClYjfpLfpHRxNHT8kh7O+DTMprt7srp0174vnrZTQ Gsfh2BfCoojZxt1SPJWDl1rBaAtZ7EZTmt0oV1opbTgZS1drDKrlbmHQs44oBDeg Uqy0uAB6kI92BzPbS66ZwHup3Tg2uz98YwCTMQ1dTuLSFzQNOju6h38BKRzMKkkz nWF/AQA89TKOgaOZG3HGVadru7zk1P5EvnOD2+iMI3Arw9ClHMu9wAMoNxNcuYTO i+h/hrtenmeH3owA+HVsR/lMnJhnMlxRRfan0pIzUfICIiYlZaw= =macg -----END PGP SIGNATURE----- Merge tag 'livepatching-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching update from Petr Mladek: - Make a variable static to fix a sparse warning * tag 'livepatching-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: livepatch: Make 'klp_stack_entries' static |
||
Linus Torvalds
|
d2a6fd45c5 |
Probes updates for v6.5:
- fprobe: Pass return address to the fprobe entry/exit callbacks so that the callbacks don't need to analyze pt_regs/stack to find the function return address. - kprobe events: cleanup usage of TPARG_FL_FENTRY and TPARG_FL_RETURN flags so that those are not set at once. - fprobe events: . Add a new fprobe events for tracing arbitrary function entry and exit as a trace event. . Add a new tracepoint events for tracing raw tracepoint as a trace event. This allows user to trace non user-exposed tracepoints. . Move eprobe's event parser code into probe event common file. . Introduce BTF (BPF type format) support to kernel probe (kprobe, fprobe and tracepoint probe) events so that user can specify traced function arguments by name. This also applies the type of argument when fetching the argument. . Introduce '$arg*' wildcard support if BTF is available. This expands the '$arg*' meta argument to all function argument automatically. . Check the return value types by BTF. If the function returns 'void', '$retval' is rejected. . Add some selftest script for fprobe events, tracepoint events and BTF support. . Update documentation about the fprobe events. . Some fixes for above features, document and selftests. - selftests for ftrace (except for new fprobe events): . Add a test case for multiple consecutive probes in a function which checks if ftrace based kprobe, optimized kprobe and normal kprobe can be defined in the same target function. . Add a test case for optimized probe, which checks whether kprobe can be optimized or not. -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmSa+9MACgkQ2/sHvwUr PxsmOAgAmUOIWtvH5py7AZpIRhCj8B18F6KnT7w2hByCsRxf7SaCqMhpBCk9VnYv 9fJFBHpvYRJEmpHoH3o2ET5AGfKVNac9z96AGI2qJ4ECWITd6I5+WfTdZ5ueVn2d f6DQ10mHXDHSMFbuqfYWSHtkeivJpWpUNHhwzPb4doNOe06bZNfVuSgnksFg1at5 kq16HbvGnhPzdO4YHmvqwjmRHr5/nCI1KDE9xIBcqNtWFbiRigC11zaZEUkLX+vT F63ShyfCK718AiwDfnjXpGkXAiVOZuAIR8RELaSqQ92YHCFKq5k9K4++WllPR5f9 AxjVultFDiCd4oSPgYpQkjuZdFq9NA== =IhmY -----END PGP SIGNATURE----- Merge tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes updates from Masami Hiramatsu: - fprobe: Pass return address to the fprobe entry/exit callbacks so that the callbacks don't need to analyze pt_regs/stack to find the function return address. - kprobe events: cleanup usage of TPARG_FL_FENTRY and TPARG_FL_RETURN flags so that those are not set at once. - fprobe events: - Add a new fprobe events for tracing arbitrary function entry and exit as a trace event. - Add a new tracepoint events for tracing raw tracepoint as a trace event. This allows user to trace non user-exposed tracepoints. - Move eprobe's event parser code into probe event common file. - Introduce BTF (BPF type format) support to kernel probe (kprobe, fprobe and tracepoint probe) events so that user can specify traced function arguments by name. This also applies the type of argument when fetching the argument. - Introduce '$arg*' wildcard support if BTF is available. This expands the '$arg*' meta argument to all function argument automatically. - Check the return value types by BTF. If the function returns 'void', '$retval' is rejected. - Add some selftest script for fprobe events, tracepoint events and BTF support. - Update documentation about the fprobe events. - Some fixes for above features, document and selftests. - selftests for ftrace (in addition to the new fprobe events): - Add a test case for multiple consecutive probes in a function which checks if ftrace based kprobe, optimized kprobe and normal kprobe can be defined in the same target function. - Add a test case for optimized probe, which checks whether kprobe can be optimized or not. * tag 'probes-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing/probes: Fix tracepoint event with $arg* to fetch correct argument Documentation: Fix typo of reference file name tracing/probes: Fix to return NULL and keep using current argc selftests/ftrace: Add new test case which checks for optimized probes selftests/ftrace: Add new test case which adds multiple consecutive probes in a function Documentation: tracing/probes: Add fprobe event tracing document selftests/ftrace: Add BTF arguments test cases selftests/ftrace: Add tracepoint probe test case tracing/probes: Add BTF retval type support tracing/probes: Add $arg* meta argument for all function args tracing/probes: Support function parameters if BTF is available tracing/probes: Move event parameter fetching code to common parser tracing/probes: Add tracepoint support on fprobe_events selftests/ftrace: Add fprobe related testcases tracing/probes: Add fprobe events for tracing function entry and exit. tracing/probes: Avoid setting TPARG_FL_FENTRY and TPARG_FL_RETURN fprobe: Pass return address to the handlers |
||
Linus Torvalds
|
cccf0c2ee5 |
Tracing updates for 6.5:
- Add new feature to have function graph tracer record the return value. Adds a new option: funcgraph-retval ; when set, will show the return value of a function in the function graph tracer. - Also add the option: funcgraph-retval-hex where if it is not set, and the return value is an error code, then it will return the decimal of the error code, otherwise it still reports the hex value. - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd That when a application opens it, it becomes the task that the timer lat tracer traces. The application can also read this file to find out how it's being interrupted. - Add the file /sys/kernel/tracing/available_filter_functions_addrs that works just the same as available_filter_functions but also shows the addresses of the functions like kallsyms, except that it gives the address of where the fentry/mcount jump/nop is. This is used by BPF to make it easier to attach BPF programs to ftrace hooks. - Replace strlcpy with strscpy in the tracing boot code. -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZJy6ixQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qnzRAPsEI2YgjaJSHnuPoGRHbrNil6pq66wY LYaLizGI4Jv9BwEAqdSdcYcMiWo1SFBAO8QxEDM++BX3zrRyVgW8ahaTNgs= =TF0C -----END PGP SIGNATURE----- Merge tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Add new feature to have function graph tracer record the return value. Adds a new option: funcgraph-retval ; when set, will show the return value of a function in the function graph tracer. - Also add the option: funcgraph-retval-hex where if it is not set, and the return value is an error code, then it will return the decimal of the error code, otherwise it still reports the hex value. - Add the file /sys/kernel/tracing/osnoise/per_cpu/cpu<cpu>/timerlat_fd That when a application opens it, it becomes the task that the timer lat tracer traces. The application can also read this file to find out how it's being interrupted. - Add the file /sys/kernel/tracing/available_filter_functions_addrs that works just the same as available_filter_functions but also shows the addresses of the functions like kallsyms, except that it gives the address of where the fentry/mcount jump/nop is. This is used by BPF to make it easier to attach BPF programs to ftrace hooks. - Replace strlcpy with strscpy in the tracing boot code. * tag 'trace-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix warnings when building htmldocs for function graph retval riscv: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing/boot: Replace strlcpy with strscpy tracing/timerlat: Add user-space interface tracing/osnoise: Skip running osnoise if all instances are off tracing/osnoise: Switch from PF_NO_SETAFFINITY to migrate_disable ftrace: Show all functions with addresses in available_filter_functions_addrs selftests/ftrace: Add funcgraph-retval test case LoongArch: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL x86/ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL arm64: ftrace: Enable HAVE_FUNCTION_GRAPH_RETVAL tracing: Add documentation for funcgraph-retval and funcgraph-retval-hex function_graph: Support recording and printing the return value of function fgraph: Add declaration of "struct fgraph_ret_regs" |
||
Linus Torvalds
|
533925cb76 |
RISC-V Patches for the 6.5 Merge Window, Part 1
* Support for ACPI. * Various cleanups to the ISA string parsing, including making them case-insensitive * Support for the vector extension. * Support for independent irq/softirq stacks. * Our CPU DT binding now has "unevaluatedProperties: false" -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmSe70ATHHBhbG1lckBk YWJiZWx0LmNvbQAKCRAuExnzX7sYiWNPD/0ZfSdQ0A/gMVOzAD4zFKPEqQ6ffW2V Zy6Jo7UDNqKsiai7QA4XB1uyYIv/y1yUKJ0oeBVcA9Nzyq+TW9QDcApDBTabxAUI agY19YKw6VVZ+p7I9sMsf6EbdJdkNfSAzcQACPxb4ScEoaf9X+oAK5qgXuRuWluh qQuVkkJlgWc/t1cuUkrRdJmHQYvjP3zL7z4o344q2IVpXJkNNu0GeP+HbF8BYKcA +I/TTA5JY3kCIaxkpF2rU6pE6T5T9xrPmRYZ7bZoPUPnbL+M8As/jx3ym52Y4WGp kf8pgkxixOjU64kVJOH66CA8GaOiaAH/ptjQb0ZmCaGrHhr7aOT9HrkX4rU1lS8T stPphfM4gGPcCoPgRqSl+mEhBzjII8maOBLtbricAoQi6efRq8fzoOGaif/QpCbc 6n0LGS4nQPGVyD3rAPfHxxfrlGJR+SsgyDvjZoDhqauFglims14GnK+eBeO8zrui Aj/uuAS63VIYprJWC1NOBJlU2WKZiOGhCANpZ6W6SH21PYn2WjsVILqaGh+WN8ZO KOHxZNaN8fQag0Yg7oNAUb7l6S0DHYtJIksFnFW2Rf2+VT58RAMYRQbpbhr7Tqr+ jLgIR8PkFrBERHE49IqLGhAxGDnNzAUysMRw9pIk7WIre2Jt4wPqUdl+ee+5ErIX jiYfSFZw9q28UA== =Fpq8 -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-6.5-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for ACPI - Various cleanups to the ISA string parsing, including making them case-insensitive - Support for the vector extension - Support for independent irq/softirq stacks - Our CPU DT binding now has "unevaluatedProperties: false" * tag 'riscv-for-linus-6.5-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (78 commits) riscv: hibernate: remove WARN_ON in save_processor_state dt-bindings: riscv: cpus: switch to unevaluatedProperties: false dt-bindings: riscv: cpus: add a ref the common cpu schema riscv: stack: Add config of thread stack size riscv: stack: Support HAVE_SOFTIRQ_ON_OWN_STACK riscv: stack: Support HAVE_IRQ_EXIT_ON_IRQ_STACK RISC-V: always report presence of extensions formerly part of the base ISA dt-bindings: riscv: explicitly mention assumption of Zicntr & Zihpm support RISC-V: remove decrement/increment dance in ISA string parser RISC-V: rework comments in ISA string parser RISC-V: validate riscv,isa at boot, not during ISA string parsing RISC-V: split early & late of_node to hartid mapping RISC-V: simplify register width check in ISA string parsing perf: RISC-V: Limit the number of counters returned from SBI riscv: replace deprecated scall with ecall riscv: uprobes: Restore thread.bad_cause riscv: mm: try VMA lock-based page fault handling first riscv: mm: Pre-allocate PGD entries for vmalloc/modules area RISC-V: hwprobe: Expose Zba, Zbb, and Zbs RISC-V: Track ISA extensions per hart ... |
||
Linus Torvalds
|
d8b0bd57c2 |
powerpc updates for 6.5
- Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations. - Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and use the -mprofile-kernel option (kernel specific ftrace ABI) for big endian ELFv2 kernels. - Add initial Dynamic Execution Control Register (DEXCR) support, and allow the ROP protection instructions to be used on Power 10. - Various other small features and fixes. Thanks to: Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King, Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean Delvare, Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry, Nathan Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul Gortmaker, Randy Dunlap, Rob Herring, Rohan McLure, Russell Currey, Sachin Sant, Timothy Pearson, Tom Rix, Uwe Kleine-König. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmSeqQMTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgKukD/sGUceX6gIc7UcjWhL1ZCVco0bsgLjT JrY1NenisGKjwwRd/o+2+h3ziJDoO5AsQfT72EiNLhaYJhnlb1d0vXzsvN0THc+2 W5RrxAZUNhBy+c7gSSEJjy8+vBIwSQAliQLChHGOSejGCj94SxF5+zjUFvSX458I z0+ZQK+Fiw5NcpzEnBT0XPnLzap74a7TL0JcG1MLbj2QtHXhbfjIlkkPDX3kK0Gw xbelFy38X7KKbQsXXYSTCGqwRdJ3yqu21nEsjRuo2yT5H5rQbjCNggkMOL1DecDd ULGxit/z13Pt1Ad3oe+6FF17ggOiCG0F75DONASjFDthFYx6NQffkJS1n1VZauQj jU6LtWeD3HkgIYm6Udjq+LaSmkAmn5a+9tsElE/K+V1WG4rKyMeVmE3z/tCJG0l2 yhLKyFs+glXN/LiWHyX0mrQIIVZdRK237X1qXJuIvAuB7Drm5duXFAHR8pdJD0dg H23OhoO2FvLxb9GvnzGxqjdazzattctz31wU/1RgnPxumYkJ9PlBcXn9h1uXa8/K rDZFJADsQhEfRCjmLG3GIaFWqZdc4Cn+ZUk4iHkjPDFDL05Fq7JYHIuwteiN6/wP NHRvtKdJisu583NI9RN9300JykrEqjSRbMOWlc3vuFwbRLGioXvWhWlIZ3/t58jG R8s+f0nKSPr+fg== =ssit -----END PGP SIGNATURE----- Merge tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Extend KCSAN support to 32-bit and BookE. Add some KCSAN annotations - Make ELFv2 ABI the default for 64-bit big-endian kernel builds, and use the -mprofile-kernel option (kernel specific ftrace ABI) for big endian ELFv2 kernels - Add initial Dynamic Execution Control Register (DEXCR) support, and allow the ROP protection instructions to be used on Power 10 - Various other small features and fixes Thanks to Aditya Gupta, Aneesh Kumar K.V, Benjamin Gray, Brian King, Christophe Leroy, Colin Ian King, Dmitry Torokhov, Gaurav Batra, Jean Delvare, Joel Stanley, Marco Elver, Masahiro Yamada, Nageswara R Sastry, Nathan Chancellor, Naveen N Rao, Nayna Jain, Nicholas Piggin, Paul Gortmaker, Randy Dunlap, Rob Herring, Rohan McLure, Russell Currey, Sachin Sant, Timothy Pearson, Tom Rix, and Uwe Kleine-König. * tag 'powerpc-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (76 commits) powerpc: remove checks for binutils older than 2.25 powerpc: Fail build if using recordmcount with binutils v2.37 powerpc/iommu: TCEs are incorrectly manipulated with DLPAR add/remove of memory powerpc/iommu: Only build sPAPR access functions on pSeries powerpc: powernv: Annotate data races in opal events powerpc: Mark writes registering ipi to host cpu through kvm and polling powerpc: Annotate accesses to ipi message flags powerpc: powernv: Fix KCSAN datarace warnings on idle_state contention powerpc: Mark [h]ssr_valid accesses in check_return_regs_valid powerpc: qspinlock: Enforce qnode writes prior to publishing to queue powerpc: qspinlock: Mark accesses to qnode lock checks powerpc/powernv/pci: Remove last IODA1 defines powerpc/powernv/pci: Remove MVE code powerpc/powernv/pci: Remove ioda1 support powerpc: 52xx: Make immr_id DT match tables static powerpc: mpc512x: Remove open coded "ranges" parsing powerpc: fsl_soc: Use of_range_to_resource() for "ranges" parsing powerpc: fsl: Use of_property_read_reg() to parse "reg" powerpc: fsl_rio: Use of_range_to_resource() for "ranges" parsing macintosh: Use of_property_read_reg() to parse "reg" ... |
||
Kees Cook
|
b69f0aeb06 |
pid: Replace struct pid 1-element array with flex-array
For pid namespaces, struct pid uses a dynamically sized array member, "numbers". This was implemented using the ancient 1-element fake flexible array, which has been deprecated for decades. Replace it with a C99 flexible array, refactor the array size calculations to use struct_size(), and address elements via indexes. Note that the static initializer (which defines a single element) works as-is, and requires no special handling. Without this, CONFIG_UBSAN_BOUNDS (and potentially CONFIG_FORTIFY_SOURCE) will trigger bounds checks: https://lore.kernel.org/lkml/20230517-bushaltestelle-super-e223978c1ba6@brauner Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Xu <jeffxu@google.com> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Daniel Verkamp <dverkamp@chromium.org> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Jeff Xu <jeffxu@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reported-by: syzbot+ac3b41786a2d0565b6d5@syzkaller.appspotmail.com [brauner: dropped unrelated changes and remove 0 with NULL cast] Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Douglas Anderson
|
1ed0555850 |
kdb: Handle LF in the command parser
The main kdb command parser only handles CR (ASCII 13 AKA '\r') today, but not LF (ASCII 10 AKA '\n'). That means that the kdb command parser can handle terminals that send just CR or that send CR+LF but can't handle terminals that send just LF. The fact that kdb didn't handle LF in the command parser tripped up a tool I tried to use with it. Specifically, I was trying to send a command to my device to resume it from kdb using a ChromeOS tool like: dut-control cpu_uart_cmd:"g" That tool only terminates lines with LF, not CR+LF. Arguably the ChromeOS tool should be fixed. After all, officially kdb seems to be designed such that CR+LF is the official line ending transmitted over the wire and that internally a line ending is just '\n' (LF). Some evidence: * uart_poll_put_char(), which is used by kdb, notices a '\n' and converts it to '\r\n'. * kdb functions specifically use '\r' to get a carriage return without a newline. You can see this in the pager where kdb will write a '\r' and then write over the pager prompt. However, all that being said there's no real harm in accepting LF as a command terminator in the kdb parser and doing so seems like it would improve compatibility. After this, I'd expect that things would work OK-ish with a remote terminal that used any of CR, CR+LF, or LF as a line ending. Someone using CR as a line ending might get some ugliness where kdb wasn't able to overwrite the last line, but basic commands would work. Someone using just LF as a line ending would probably also work OK. A few other notes: - It can be noted that "bash" running on an "agetty" handles LF as a line termination with no complaints. - Historically, kdb's "pager" actually handled either CR or LF fine. A very quick inspection would make one think that kdb's pager actually could have paged down two lines instead of one for anyone using CR+LF, but this is generally avoided because of kdb_input_flush(). - Conceivably one could argue that some of this special case logic belongs in uart_poll_get_char() since uart_poll_put_char() handles the '\n' => '\r\n' conversion. I would argue that perhaps we should eventually do the opposite and move the '\n' => '\r\n' out of uart_poll_put_char(). Having that conversion at such a low level could interfere if we ever want to transfer binary data. In addition, if we truly made uart_poll_get_char() the inverse of uart_poll_put_char() it would convert back to '\n' and (ironically) kdb's parser currently only looks for '\r' to find the end of a command. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20230628125612.1.I5cc6c3d916195f5bcfdf5b75d823f2037707f5dc@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Andy Shevchenko
|
67a4e1a3bf |
irqdomain: Use return value of strreplace()
Since strreplace() returns the pointer to the string itself, use it directly. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230628150251.17832-1-andriy.shevchenko@linux.intel.com |
||
Linus Torvalds
|
1e6d5dea34 |
dma-mapping uodates for Linux 6.5
- swiotlb cleanups (Petr Tesarik) - use kvmalloc_array (gaoxu) - a small step towards removing is_swiotlb_active (Christoph Hellwig) - fix a Kconfig typo Sui Jingfeng) -----BEGIN PGP SIGNATURE----- iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmScHTELHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYOoGhAAuKdcWAumhEuKXu/vuA4/xH5Kvn2uhrAVs/RSKKh7 Wbh2WXh+9GOQs4dwpYI5/chRmCJgyC2ics/E7dny2ovJumM/RGOR9zff/+zJk5yh 3tpiZ4VW/2x/9qmoRpdQVTaj1uB2Ug8wpztUohyeEYPpjLX0/Vq/zTItAyvQs4kO vskg2u9EGwzgV4qmHWq2fjowljHS1sMZq2URKGX8JDHrwOh0+uCke626ze/WznSQ 5quWW4mE0IYq/nldIMTvJxUx6273zBMm+yoZSfFY6K0VrP3WDE5endP5oYR7d251 zNRPyWqW78HQT0z/gntaPOKDB0t7YxsWVLwam98Yv8wl1wLJQYe5sRoL7qXL0yBz cu4uTOGU6Kvw3ZHWQOk2qkGxYAxDLnaFejz8ZDCycWQA7T2+TNdaVXNnl/vE+aRN 25mnT0JtClfZOvHihhHKaiRXMKE0WgPy4vBZBT0SAUVuUR7JwtYDv/5giGf6Bi3x QwNeXAmFD75VELJoqX1KL/UiJh8FzyWZMI7l7XmDVy9lJsA1P3PO+npI8SOmA+e8 q4GSD36BxDoDE06Hg/BQcRqPohDCEsrRgfvOaa7xld9SCwe1lGaN3kDsyv6HoBzX NRS3aishdqHf21dNctjX9f+II+30NFkPbAa64B8L4tP5aQLy3bgjNLweLP9fhKgl KQs= =yYCS -----END PGP SIGNATURE----- Merge tag 'dma-mapping-6.5-2023-06-28' of git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - swiotlb cleanups (Petr Tesarik) - use kvmalloc_array (gaoxu) - a small step towards removing is_swiotlb_active (Christoph Hellwig) - fix a Kconfig typo Sui Jingfeng) * tag 'dma-mapping-6.5-2023-06-28' of git://git.infradead.org/users/hch/dma-mapping: drm/nouveau: stop using is_swiotlb_active swiotlb: use the atomic counter of total used slabs if available swiotlb: remove unused field "used" from struct io_tlb_mem dma-remap: use kvmalloc_array/kvfree for larger dma memory remap dma-mapping: fix a Kconfig typo |
||
Linus Torvalds
|
82a2a51055 |
sysctl-6.5-rc1-fixes
Linus, included in this pull request is just a minor fix which Matthieu Baerts noted I had not picked up. I adjusted the Fixes commit ID to reflect the latest tree. -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmSeA/ISHG1jZ3JvZkBr ZXJuZWwub3JnAAoJEM4jHQowkoinUq4QAIGJi5oRY+tAqzJ1VigkUgwheDeND3KL ixwn/9RoZ6Sk9hpXbFVdS4pUGueYmyCnbeKF48aTYKMSRV8Vm3E61RevYnpjSXBA D+zhJ80NS95MacrHM0cK77UD/A8T4Y57e/rUJfL/3RqWMPpJbPciqO/Y+9MZJRFT RCC8jbPUjdmvbccEMQhALSZb2nOUtHouUWgG4SgyiELjY5djNojINEOQ1xIh9W1G LWWCM/99NeDZfQtF+36k6F1uc568xrSunELXBKcQBk6PlETt4dtQta/PqEAQ+tHT qZDnwepw9RCNPxuGv0oialYe3weFB4zC2kr3ZYePs0qLfM4GGqXjDy7bIT5qa59V 9AsAhSEuLu/ZAnYgQNMcJZss/PGrxSDJbKOUEW3pJ00nlj9zwWXpGX4jJsBNKdoe smalelXUOR3Kr5BZm7cm0125kiQrvnMx8L/5waG81sCKsW+LcHQgTJu8RVD49f5H k/LCUlxk3jyXO3xlyYvQKzpA70P4vromIhWjBxcK2cAbPJL5N6ux4WX7L+m/niID cD0/74XJ+es1qAUHkl1on5s7U1DOryY4xJA+liQK/pKFIMyuKaUYlLmgD8lworI4 jqJ4BgtSASc/W81TX7x32CAbKWwjxKDQK3mLBUdVAJPtcr5vFOnmHh2vgZ0/A/yI eMgoAPB5E+UD =Tyyn -----END PGP SIGNATURE----- Merge tag 'sysctl-6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl fix from Luis Chamberlain: "A missed minor fix which Matthieu Baerts noted I had not picked up" * tag 'sysctl-6.5-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: sysctl: fix unused proc_cap_handler() function warning |
||
Linus Torvalds
|
3ad7b12c72 |
tracing: Fix user event write on buffer disabled
The user events write currently returns the size of what was suppose to be written when tracing is disabled and nothing was written. Instead, behave like trace_marker and return -EBADF, as that is what is returned if a file is opened for read only, and a write is performed on it. Writing to the buffer that is disabled is like trying to write to a file opened for read only, as the buffer still can be read, but just not written to. This also includes test cases for this use case -----BEGIN PGP SIGNATURE----- iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZJxLzRQccm9zdGVkdEBn b29kbWlzLm9yZwAKCRAp5XQQmuv6qs9eAP0ZRDempFNyMhi+pfENtwv65CHxRX/3 3s1Lmt04oqwXfgD/dCfojrVAd++kpq3p9cGxJYWuNiM4s47mlD8VLgQ7AAY= =aZ8/ -----END PGP SIGNATURE----- Merge tag 'trace-v6.4-rc7-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: "Fix user event write on buffer disabled. The user events write currently returns the size of what was supposed to be written when tracing is disabled and nothing was written. Instead, behave like trace_marker and return -EBADF, as that is what is returned if a file is opened for read only, and a write is performed on it. Writing to the buffer that is disabled is like trying to write to a file opened for read only, as the buffer still can be read, but just not written to. This also includes test cases for this use case" * tag 'trace-v6.4-rc7-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: selftests/user_events: Add test cases when event is disabled selftests/user_events: Enable the event before write_fault test in ftrace self-test tracing/user_events: Fix incorrect return value for writing operation when events are disabled |
||
Linus Torvalds
|
632f54b4d6 |
slab updates for 6.5
-----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmSZtjsACgkQu+CwddJF iJqCTwf/XVhmAD7zMOj6g1aak5oHNZDRG5jufM5UNXmiWjCWT3w4DpltrJkz0PPm mg3Ac5fjNUqesZ1SGtUbvoc363smroBrRudGEFrsUhqBcpR+S4fSneoDk+xqMypf VLXP/8kJlFEBGMiR7ouAWnR4+u6JgY4E8E8JIPNzao5KE/L1lD83nY+Usjc/01ek oqMyYVFRfncsGjGJXc5fOOTTCj768mRroF0sLmEegIonnwQkSHE7HWJ/nyaVraDV bomnTIgMdVIDqharin08ZPIM7qBIWM09Uifaf0lIs6fIA94pQP+5Ko3mum2P/S+U ON/qviSrlNgRXoHPJ3hvPHdfEU9cSg== =1d0v -----END PGP SIGNATURE----- Merge tag 'slab-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - SLAB deprecation: Following the discussion at LSF/MM 2023 [1] and no objections, the SLAB allocator is deprecated by renaming the config option (to make its users notice) to CONFIG_SLAB_DEPRECATED with updated help text. SLUB should be used instead. Existing defconfigs with CONFIG_SLAB are also updated. - SLAB_NO_MERGE kmem_cache flag (Jesper Dangaard Brouer): There are (very limited) cases where kmem_cache merging is undesirable, and existing ways to prevent it are hacky. Introduce a new flag to do that cleanly and convert the existing hacky users. Btrfs plans to use this for debug kernel builds (that use case is always fine), networking for performance reasons (that should be very rare). - Replace the usage of weak PRNGs (David Keisar Schmidt): In addition to using stronger RNGs for the security related features, the code is a bit cleaner. - Misc code cleanups (SeongJae Parki, Xiongwei Song, Zhen Lei, and zhaoxinchao) Link: https://lwn.net/Articles/932201/ [1] * tag 'slab-for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: mm/slab_common: use SLAB_NO_MERGE instead of negative refcount mm/slab: break up RCU readers on SLAB_TYPESAFE_BY_RCU example code mm/slab: add a missing semicolon on SLAB_TYPESAFE_BY_RCU example code mm/slab_common: reduce an if statement in create_cache() mm/slab: introduce kmem_cache flag SLAB_NO_MERGE mm/slab: rename CONFIG_SLAB to CONFIG_SLAB_DEPRECATED mm/slab: remove HAVE_HARDENED_USERCOPY_ALLOCATOR mm/slab_common: Replace invocation of weak PRNG mm/slab: Replace invocation of weak PRNG slub: Don't read nr_slabs and total_objects directly slub: Remove slabs_node() function slub: Remove CONFIG_SMP defined check slub: Put objects_show() into CONFIG_SLUB_DEBUG enabled block slub: Correct the error code when slab_kset is NULL mm/slab: correct return values in comment for _kmem_cache_create() |
||
Arnd Bergmann
|
554588e8e9 |
sysctl: fix unused proc_cap_handler() function warning
Since usermodehelper_table() is marked static now, we get a
warning about it being unused when SYSCTL is disabled:
kernel/umh.c:497:12: error: 'proc_cap_handler' defined but not used [-Werror=unused-function]
Just move it inside of the same #ifdef.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Fixes:
|
||
Arnd Bergmann
|
0914e4d3cd |
kdb: include kdb_private.h for function prototypes
The kdb_kbd_cleanup_state() is called from another file through the kdb_private.h file, but that is not included before the definition, causing a W=1 warning: kernel/debug/kdb/kdb_keyboard.c:198:6: error: no previous prototype for 'kdb_kbd_cleanup_state' [-Werror=missing-prototypes] Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20230517124802.929751-1-arnd@kernel.org Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> |
||
Petr Tesarik
|
8ac0406335 |
swiotlb: reduce the number of areas to match actual memory pool size
Although the desired size of the SWIOTLB memory pool is increased in
swiotlb_adjust_nareas() to match the number of areas, the actual allocation
may be smaller, which may require reducing the number of areas.
For example, Xen uses swiotlb_init_late(), which in turn uses the page
allocator. On x86, page size is 4 KiB and MAX_ORDER is 10 (1024 pages),
resulting in a maximum memory pool size of 4 MiB. This corresponds to 2048
slots of 2 KiB each. The minimum area size is 128 (IO_TLB_SEGSIZE),
allowing at most 2048 / 128 = 16 areas.
If num_possible_cpus() is greater than the maximum number of areas, areas
are smaller than IO_TLB_SEGSIZE and contiguous groups of free slots will
span multiple areas. When allocating and freeing slots, only one area will
be properly locked, causing race conditions on the unlocked slots and
ultimately data corruption, kernel hangs and crashes.
Fixes:
|
||
Petr Tesarik
|
aabd12609f |
swiotlb: always set the number of areas before allocating the pool
The number of areas defaults to the number of possible CPUs. However, the
total number of slots may have to be increased after adjusting the number
of areas. Consequently, the number of areas must be determined before
allocating the memory pool. This is even explained with a comment in
swiotlb_init_remap(), but swiotlb_init_late() adjusts the number of areas
after slots are already allocated. The areas may end up being smaller than
IO_TLB_SEGSIZE, which breaks per-area locking.
While fixing swiotlb_init_late(), move all relevant comments before the
definition of swiotlb_adjust_nareas() and convert them to kernel-doc.
Fixes:
|
||
Linus Torvalds
|
3a8a670eee |
Networking changes for 6.5.
Core ---- - Rework the sendpage & splice implementations. Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES. Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is. Remove the MSG_SENDPAGE_NOTLAST flag completely. - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid. - Enable socket busy polling with CONFIG_RT. - Improve reliability and efficiency of reporting for ref_tracker. - Auto-generate a user space C library for various Netlink families. Protocols --------- - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2]. - Use per-VMA locking for "page-flipping" TCP receive zerocopy. - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags. - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative. - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO). - Avoid waking up applications using TLS sockets until we have a full record. - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring. - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address. - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch. - PPPoE: make number of hash bits configurable. - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig). - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge). - Support matching on Connectivity Fault Management (CFM) packets. - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug. - HSR: don't enable promiscuous mode if device offloads the proto. - Support active scanning in IEEE 802.15.4. - Continue work on Multi-Link Operation for WiFi 7. BPF --- - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators. - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer *should* be, without writing anything. - Accept dynptr memory as memory arguments passed to helpers. - Add routing table ID to bpf_fib_lookup BPF helper. - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands. - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only). - Show target_{obj,btf}_id in tracing link fdinfo. - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs Netfilter --------- - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value. - Increase ip_vs_conn_tab_bits range for 64BIT builds. - Allow updating size of a set. - Improve NAT tuple selection when connection is closing. Driver API ---------- - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out). - Support configuring rate selection pins of SFP modules. - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines. - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer. - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio). - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message. - Split devlink instance and devlink port operations. New hardware / drivers ---------------------- - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers ------- - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmSbJM4ACgkQMUZtbf5S IrtoDhAAhEim1+LBIKf4lhPcVdZ2p/TkpnwTz5jsTwSeRBAxTwuNJ2fQhFXg13E3 MnRq6QaEp8G4/tA/gynLvQop+FEZEnv+horP0zf/XLcC8euU7UrKdrpt/4xxdP07 IL/fFWsoUGNO+L9LNaHwBo8g7nHvOkPscHEBHc2Xrvzab56TJk6vPySfLqcpKlNZ CHWDwTpgRqNZzSKiSpoMVd9OVMKUXcPYHpDmfEJ5l+e8vTXmZzOLHrSELHU5nP5f mHV7gxkDCTshoGcaed7UTiOvgu1p6E5EchDJxiLaSUbgsd8SZ3u4oXwRxgj33RK/ fB2+UaLrRt/DdlHvT/Ph8e8Ygu77yIXMjT49jsfur/zVA0HEA2dFb7V6QlsYRmQp J25pnrdXmE15llgqsC0/UOW5J1laTjII+T2T70UOAqQl4LWYAQDG4WwsAqTzU0KY dueydDouTp9XC2WYrRUEQxJUzxaOaazskDUHc5c8oHp/zVBT+djdgtvVR9+gi6+7 yy4elI77FlEEqL0ItdU/lSWINayAlPLsIHkMyhSGKX0XDpKjeycPqkNx4UterXB/ JKIR5RBWllRft+igIngIkKX0tJGMU0whngiw7d1WLw25wgu4sB53hiWWoSba14hv tXMxwZs5iGaPcT38oRVMZz8I1kJM4Dz3SyI7twVvi4RUut64EG4= =9i4I -----END PGP SIGNATURE----- Merge tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer *should* be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" * tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ... |
||
Linus Torvalds
|
6a8cbd9253 |
v6.5-rc1-sysctl-next
The changes queued up for v6.5-rc1 for sysctl are in line with prior efforts to stop usage of deprecated routines which incur recursion and also make it hard to remove the empty array element in each sysctl array declaration. The most difficult user to modify was parport which required a bit of re-thinking of how to declare shared sysctls there, Joel Granados has stepped up to the plate to do most of this work and eventual removal of register_sysctl_table(). That work ended up saving us about 1465 bytes according to bloat-o-meter. Since we gained a few bloat-o-meter karma points I moved two rather small sysctl arrays from kernel/sysctl.c leaving us only two more sysctl arrays to move left. Most changes have been tested on linux-next for about a month. The last straggler patches are a minor parport fix, changes to the sysctl kernel selftest so to verify correctness and prevent regressions for the future change he made to provide an alternative solution for the special sysctl mount point target which was using the now deprecated sysctl child element. This is all prep work to now finally be able to remove the empty array element in all sysctl declarations / registrations which is expected to save us a bit of bytes all over the kernel. That work will be tested early after v6.5-rc1 is out. -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmSceh0SHG1jZ3JvZkBr ZXJuZWwub3JnAAoJEM4jHQowkoinBFQQAK9WdpcU8ODoDzoSls4jsCQpZUCfZ+ED pbCgQqUqu9VPs6bnJ+aXVa6Fh3uCr6+TIfNFM55qI/Sbo2issZ7bm0nvKmGgc6/m giqDP7btvHqiAsEootci8DVdbBXKkdH4dx3pSwleyN8pdinewH0hrKImaPpahyo6 1mB1du0iI89yjsZmheHVVSyfXXYAnP0PqRVy5Y+qxY7yYlIegQ5uAZmwRE62lfTf TuiV7OFuDZ2DBYOmqIhfGKGRnfOL5ZVF3iHCrfUpX3p+fEFzDmwvm3vr73PTSrFw /aRRLa/hOWr5ilw1bvnMcazgQzFEOlQb3DMhBKH7gLl3XHVrM+TaaqYHjUia1+6Y e2axz/duA2q9uLMW81daRApvHMCgy0exkpC7prfOxF5bgTe4TjA7ZWvGpqG1kPKT PPSxw80XvG5hLZm4tB0ZWJ5rOfFpiUGGneSeRQwyuClBt73SIO+F03jyGpt83slU jFE50ac14Zwh1oxpCQtYoR1+bXWdq1QwM5vQBNEuaoTSnJfVjrXqBz/BnqJChtjr m1vA27+4/dfki2P3gVWF1lGx43ir3uJvqk+BjWXm2CDDJqpRi3N0qcUwZwLuqAAz /LEgFqK61bpHi/C8c2NWAxIoeWRU4NUOaoiKmZwyt0sKAWU1Yzg70xssYeg7VYqZ 3pvFNVBqkV+F =sXUU -----END PGP SIGNATURE----- Merge tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull sysctl updates from Luis Chamberlain: "The changes for sysctl are in line with prior efforts to stop usage of deprecated routines which incur recursion and also make it hard to remove the empty array element in each sysctl array declaration. The most difficult user to modify was parport which required a bit of re-thinking of how to declare shared sysctls there, Joel Granados has stepped up to the plate to do most of this work and eventual removal of register_sysctl_table(). That work ended up saving us about 1465 bytes according to bloat-o-meter. Since we gained a few bloat-o-meter karma points I moved two rather small sysctl arrays from kernel/sysctl.c leaving us only two more sysctl arrays to move left. Most changes have been tested on linux-next for about a month. The last straggler patches are a minor parport fix, changes to the sysctl kernel selftest so to verify correctness and prevent regressions for the future change he made to provide an alternative solution for the special sysctl mount point target which was using the now deprecated sysctl child element. This is all prep work to now finally be able to remove the empty array element in all sysctl declarations / registrations which is expected to save us a bit of bytes all over the kernel. That work will be tested early after v6.5-rc1 is out" * tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: sysctl: replace child with an enumeration sysctl: Remove debugging dump_stack test_sysclt: Test for registering a mount point test_sysctl: Add an option to prevent test skip test_sysctl: Add an unregister sysctl test test_sysctl: Group node sysctl test under one func test_sysctl: Fix test metadata getters parport: plug a sysctl register leak sysctl: move security keys sysctl registration to its own file sysctl: move umh sysctl registration to its own file signal: move show_unhandled_signals sysctl to its own file sysctl: remove empty dev table sysctl: Remove register_sysctl_table sysctl: Refactor base paths registrations sysctl: stop exporting register_sysctl_table parport: Removed sysctl related defines parport: Remove register_sysctl_table from parport_default_proc_register parport: Remove register_sysctl_table from parport_device_proc_register parport: Remove register_sysctl_table from parport_proc_register parport: Move magic number "15" to a define |