linux-next

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git synced 2024-12-28 00:32:00 +00:00

Author	SHA1	Message	Date
Stephen Rothwell	24896579b9	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git	2024-12-20 15:11:58 +11:00
Stephen Rothwell	5e76f6a874	Merge branch 'for-next/kspp' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git	2024-12-20 15:11:33 +11:00
Stephen Rothwell	e570a07187	Merge branch 'for-next/execve' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git	2024-12-20 15:11:26 +11:00
Stephen Rothwell	7b5a9f355d	Merge branch 'slab/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git	2024-12-20 15:11:19 +11:00
Stephen Rothwell	e25b845fe9	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching	2024-12-20 14:48:21 +11:00
Stephen Rothwell	c7dd1920cd	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git	2024-12-20 14:39:25 +11:00
Stephen Rothwell	f44b7127cb	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git	2024-12-20 14:16:41 +11:00
Stephen Rothwell	3feff3acfa	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git	2024-12-20 14:16:39 +11:00
Stephen Rothwell	2d119f6afa	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux.git # Conflicts: # kernel/rcu/tree.c	2024-12-20 13:35:44 +11:00
Stephen Rothwell	ae884f86d7	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git	2024-12-20 13:32:54 +11:00
Stephen Rothwell	ea2356f802	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git	2024-12-20 13:32:50 +11:00
Stephen Rothwell	70d508f98a	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit.git	2024-12-20 13:26:08 +11:00
Stephen Rothwell	f5cf996f5c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm.git	2024-12-20 13:16:33 +11:00
Stephen Rothwell	8eaf5c20fb	Merge branch 'for-next' of git://git.kernel.dk/linux-block.git	2024-12-20 13:10:04 +11:00
Stephen Rothwell	bfb2b1bbdb	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux.git	2024-12-20 12:26:56 +11:00
Stephen Rothwell	49e3ee1413	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git	2024-12-20 11:48:40 +11:00
Stephen Rothwell	0b744f9e35	Merge branch 'main' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git	2024-12-20 11:48:38 +11:00
Stephen Rothwell	1dae1421fa	Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git	2024-12-20 11:29:38 +11:00
Stephen Rothwell	23b66e8e8b	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux.git	2024-12-20 11:07:32 +11:00
Stephen Rothwell	07ccd1271c	Merge branch 'fs-next' of linux-next	2024-12-20 10:45:33 +11:00
Stephen Rothwell	b86e29c311	Merge branch 'mm-everything' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm	2024-12-20 10:23:48 +11:00
Stephen Rothwell	6488329e36	Merge branch 'tip/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git	2024-12-20 09:42:15 +11:00
Stephen Rothwell	bf034d3155	Merge branch 'ring-buffer/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git	2024-12-20 09:42:14 +11:00
Stephen Rothwell	412ef23451	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git	2024-12-20 09:41:34 +11:00
Stephen Rothwell	7743f57150	Merge branch 'mm-hotfixes-unstable' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm	2024-12-20 09:41:31 +11:00
Stephen Rothwell	cd07c43f9b	Merge branch 'vfs.all' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git	2024-12-20 09:19:26 +11:00
Stephen Rothwell	8dad5129f0	Merge branch 'for_next' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git	2024-12-20 09:19:18 +11:00
Jakub Kicinski	07e5c4eb94	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.13-rc4). No conflicts. Adjacent changes: drivers/net/ethernet/renesas/rswitch.h `32fd46f5b6` ("net: renesas: rswitch: remove speed from gwca structure") `922b4b955a` ("net: renesas: rswitch: rework ts tags management") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-12-19 11:35:07 -08:00
Ingo Molnar	c779bc69c8	Merge branch into tip/master: 'sched/core' # New commits in sched/core: `af98d8a36a` ("sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug") `7675361ff9` ("sched: deadline: Cleanup goto label in pick_earliest_pushable_dl_task") `7d5265ffcd` ("rseq: Validate read-only fields under DEBUG_RSEQ config") `2a77e4be12` ("sched/fair: Untangle NEXT_BUDDY and pick_next_task()") `95d9fed3a2` ("sched/fair: Mark m*_vruntime() with __maybe_unused") `0429489e09` ("sched/fair: Fix variable declaration position") `61b82dfb6b` ("sched/fair: Do not try to migrate delayed dequeue task") `736c55a02c` ("sched/fair: Rename cfs_rq.nr_running into nr_queued") `43eef7c3a4` ("sched/fair: Remove unused cfs_rq.idle_nr_running") `31898e7b87` ("sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle") `9216582b0b` ("sched/fair: Removed unsued cfs_rq.h_nr_delayed") `1a49104496` ("sched/fair: Use the new cfs_rq.h_nr_runnable") `c2a295bffe` ("sched/fair: Add new cfs_rq.h_nr_runnable") `7b8a702d94` ("sched/fair: Rename h_nr_running into h_nr_queued") `c907cd44a1` ("sched: Unify HK_TYPE_{TIMER\|TICK\|MISC} to HK_TYPE_KERNEL_NOISE") `6010d245dd` ("sched/isolation: Consolidate housekeeping cpumasks that are always identical") `1174b9344b` ("sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"") `ae5c677729` ("sched/core: Remove HK_TYPE_SCHED") `a76328d44c` ("sched/fair: Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()") `3a181f20fb` ("sched/deadline: Consolidate Timer Cancellation") `53916d5fd3` ("sched/deadline: Check bandwidth overflow earlier for hotplug") `d4742f6ed7` ("sched/deadline: Correctly account for allocated bandwidth during hotplug") `41d4200b71` ("sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes") `59297e2093` ("sched: add READ_ONCE to task_on_rq_queued") `108ad09990` ("sched: Don't try to catch up excess steal time.") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-12-19 20:24:25 +01:00
Ingo Molnar	08eccca432	Merge branch into tip/master: 'perf/core' # New commits in perf/core: `02c56362a7` ("uprobes: Guard against kmemdup() failing in dup_return_instance()") `d29e744c71` ("perf/x86: Relax privilege filter restriction on AMD IBS") `6057b90ecc` ("perf/core: Export perf_exclude_event()") `8622e45b5d` ("uprobes: Reuse return_instances between multiple uretprobes within task") `0cf981de76` ("uprobes: Ensure return_instance is detached from the list before freeing") `636666a1c7` ("uprobes: Decouple return_instance list traversal and freeing") `2ff913ab3f` ("uprobes: Simplify session consumer tracking") `e0925f2dc4` ("uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution") `83e3dc9a5d` ("uprobes: simplify find_active_uprobe_rcu() VMA checks") `03a001b156` ("mm: introduce mmap_lock_speculate_{try_begin\|retry}") `eb449bd969` ("mm: convert mm_lock_seq to a proper seqcount") `7528585290` ("mm/gup: Use raw_seqcount_try_begin()") `96450ead16` ("seqlock: add raw_seqcount_try_begin") `b4943b8bfc` ("perf/x86/rapl: Add core energy counter support for AMD CPUs") `54d2759778` ("perf/x86/rapl: Move the cntr_mask to rapl_pmus struct") `bdc57ec705` ("perf/x86/rapl: Remove the global variable rapl_msrs") `abf03d9bd2` ("perf/x86/rapl: Modify the generic variable names to _pkg") `eeca4c6b25` ("perf/x86/rapl: Add arguments to the init and cleanup functions") `cd29d83a6d` ("perf/x86/rapl: Make rapl_model struct global") `8bf1c86e5a` ("perf/x86/rapl: Rename rapl_pmu variables") `1d5e2f637a` ("perf/x86/rapl: Remove the cpu_to_rapl_pmu() function") `e4b4443477` ("x86/topology: Introduce topology_logical_core_id()") `2f2db34707` ("perf/x86/rapl: Remove the unused get_rapl_pmu_cpumask() function") `ae55e308bd` ("perf/x86/intel/ds: Simplify the PEBS records processing for adaptive PEBS") `3c00ed344c` ("perf/x86/intel/ds: Factor out functions for PEBS records processing") `7087bfb0ad` ("perf/x86/intel/ds: Clarify adaptive PEBS processing") `faac6f105e` ("perf/core: Check sample_type in perf_sample_save_brstack") `f226805bc5` ("perf/core: Check sample_type in perf_sample_save_callchain") `b9c44b9147` ("perf/core: Save raw sample data conditionally based on sample type") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-12-19 20:24:25 +01:00
Ingo Molnar	c46f39a3e7	Merge branch into tip/master: 'locking/core' # New commits in locking/core: `63a48181fb` ("smp/scf: Evaluate local cond_func() before IPI side-effects") `d387ceb171` ("locking/lockdep: Enforce PROVE_RAW_LOCK_NESTING only if ARCH_SUPPORTS_RT") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-12-19 20:24:24 +01:00
Ingo Molnar	f391ba1ed5	Merge branch into tip/master: 'irq/core' # New commits in irq/core: `b4706d8149` ("genirq/kexec: Prevent redundant IRQ masking by checking state before shutdown") `bad6722e47` ("kexec: Consolidate machine_kexec_mask_interrupts() implementation") `429f49ad36` ("genirq: Reuse irq_thread_fn() for forced thread case") `6f8b79683d` ("genirq: Move irq_thread_fn() further up in the code") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-12-19 20:24:23 +01:00
Ingo Molnar	6371c819b1	Merge branch into tip/master: 'locking/urgent' # New commits in locking/urgent: `4a07791457` ("locking/rtmutex: Make sure we wake anything on the wake_q when we release the lock->wait_lock") Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-12-19 20:24:22 +01:00
Tvrtko Ursulin	de35994ecd	workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker After commit `746ae46c11` ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") amdgpu started seeing the following warning: [ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu] ... [ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched] ... [ ] Call Trace: [ ] <TASK> ... [ ] ? check_flush_dependency+0xf5/0x110 ... [ ] cancel_delayed_work_sync+0x6e/0x80 [ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu] [ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu] [ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu] [ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched] [ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu] [ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched] [ ] process_one_work+0x217/0x720 ... [ ] </TASK> The intent of the verification done in check_flush_dependency() is to ensure forward progress during memory reclaim, by flagging cases when either a memory reclaim process, or a memory reclaim work item is flushed from a context not marked as memory reclaim safe. This is correct when flushing, but when called from the cancel(_delayed)_work_sync() paths it is a false positive because work is either already running, or will not be running at all. Therefore cancelling it is safe and we can relax the warning criteria by letting the helper know of the calling context. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `fca839c00a` ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue") References: `746ae46c11` ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM") Cc: Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Tejun Heo <tj@kernel.org>	2024-12-19 06:15:35 -10:00
Rafael J. Wysocki	432f1f00f7	Merge branches 'pm-em', 'pm-sleep' and 'pm-cpufreq' into linux-next * pm-em: PM: EM: Move sched domains rebuild function from schedutil to EM * pm-sleep: PM: wakeup: implement devm_device_init_wakeup() helper * pm-cpufreq: cpufreq: schedutil: Fix superfluous updates caused by need_freq_update cpufreq: intel_pstate: Use CPUFREQ_POLICY_UNKNOWN	2024-12-19 12:36:59 +01:00
Andrew Morton	45f41efd96	foo	2024-12-18 19:51:48 -08:00
Yunhui Cui	04f910643d	watchdog: output this_cpu when printing hard LOCKUP When printing "Watchdog detected hard LOCKUP on cpu", also output the detecting CPU. It's more intuitive. Link: https://lkml.kernel.org/r/20241210095238.63444-1-cuiyunhui@bytedance.com Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Reviewed-by: Douglas Anderson <dianders@chromium.org> Cc: Bitao Hu <yaoma@linux.alibaba.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: John Ogness <john.ogness@linutronix.de> Cc: Liu Song <liusong@linux.alibaba.com> Cc: Song Liu <song@kernel.org> Cc: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:38 -08:00
MengEn Sun	2b05eacc98	ucounts: move kfree() out of critical zone protected by ucounts_lock Although kfree is a non-sleep function, it is possible to enter a long chain of calls probabilistically, so it looks better to move kfree from alloc_ucounts() out of the critical zone of ucounts_lock. Link: https://lkml.kernel.org/r/1733458427-11794-1-git-send-email-mengensun@tencent.com Signed-off-by: MengEn Sun <mengensun@tencent.com> Reviewed-by: YueHong Wu <yuehongwu@tencent.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Andrei Vagin <avagin@google.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:32 -08:00
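The pattern described in the ucounts entry above (deciding under the lock, freeing after it is dropped) can be sketched in userspace C. This is only an illustration under stated assumptions: the spinlock, the `entry` table, `get_entry()` and `release_entry()` are hypothetical stand-ins, not the kernel's `ucounts_lock` or `alloc_ucounts()` code, and `frees_under_lock` is instrumentation added for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a spinlock such as ucounts_lock. */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;
static atomic_bool lock_held;        /* instrumentation only */
static int frees_under_lock;         /* counts frees done while locked */

static void lock(void)
{
    while (atomic_flag_test_and_set_explicit(&table_lock, memory_order_acquire))
        ;                            /* spin until the flag is clear */
    atomic_store(&lock_held, true);
}

static void unlock(void)
{
    atomic_store(&lock_held, false);
    atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

struct entry {
    int key;
    int refs;
    struct entry *next;
};

static struct entry *table_head;

/* Caller must hold the lock. */
static struct entry *find_locked(int key)
{
    for (struct entry *e = table_head; e; e = e->next)
        if (e->key == key)
            return e;
    return NULL;
}

static void release_entry(struct entry *e)
{
    if (atomic_load(&lock_held))
        frees_under_lock++;          /* would indicate a free inside the critical section */
    free(e);                         /* free(NULL) is a no-op */
}

/* Look up or insert an entry; a losing allocation is freed after unlock. */
struct entry *get_entry(int key)
{
    struct entry *unused = NULL;
    struct entry *fresh = calloc(1, sizeof(*fresh));

    if (!fresh)
        return NULL;
    fresh->key = key;

    lock();
    struct entry *e = find_locked(key);
    if (e) {
        unused = fresh;              /* remember it, do not free here */
    } else {
        fresh->next = table_head;
        table_head = fresh;
        e = fresh;
    }
    e->refs++;
    unlock();

    release_entry(unused);           /* outside the critical section */
    return e;
}
```

The critical section only records which allocation lost the race; the potentially long call chain inside the allocator's free path runs with the lock already released.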
Yaxin Wang	e1698e9117	delayacct: update docs and fix some spelling errors Update delay-accounting.rst to include the 'delay max' in the output of getdelays, and fix some spelling errors before. Link: https://lkml.kernel.org/r/20241213192700771XKZ8H30OtHSeziGqRVMs0@zte.com.cn Signed-off-by: Yaxin Wang <wang.yaxin@zte.com.cn> Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Peilin He <he.peilin@zte.com.cn> Cc: tuqiang <tu.qiang35@zte.com.cn> Cc: Wang Yong <wang.yong12@zte.com.cn> Cc: xu xin <xu.xin16@zte.com.cn> Cc: ye xingchen <ye.xingchen@zte.com.cn> Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:30 -08:00
Wang Yaxin	036e1b3af4	delayacct: add delay max to record delay peak Introduce the use cases of delay max, which can help quickly detect potential abnormal delays in the system and record the types and specific details of delay spikes. Problem ======== Delay accounting can track the average delay of processes to show system workload. However, when a process experiences a significant delay, maybe a delay spike, which adversely affects performance, getdelays can only display the average system delay over a period of time. Yet, average delay is unhelpful for diagnosing a delay peak. It is not even possible to determine which type of delay has spiked, as this information might be masked by the average delay. Solution ========= The 'delay max' can display the delay peak since system startup, which can record potential abnormal delays over time, including the type of delay and the maximum delay. This is helpful for quickly identifying crashes caused by delay. Use case ========= bash# ./getdelays -d -p 244 print delayacct stats ON PID 244 CPU count real total virtual total delay total delay average delay max 68 192000000 213676651 705643 0.010ms 0.306381ms IO count delay total delay average delay max 0 0 0.000ms 0.000000ms SWAP count delay total delay average delay max 0 0 0.000ms 0.000000ms RECLAIM count delay total delay average delay max 0 0 0.000ms 0.000000ms THRASHING count delay total delay average delay max 0 0 0.000ms 0.000000ms COMPACT count delay total delay average delay max 0 0 0.000ms 0.000000ms WPCOPY count delay total delay average delay max 235 15648284 0.067ms 0.263842ms IRQ count delay total delay average delay max 0 0 0.000ms 0.000000ms Link: https://lkml.kernel.org/r/20241203164848805CS62CQPQWG9GLdQj2_BxS@zte.com.cn Co-developed-by: Wang Yong <wang.yong12@zte.com.cn> Signed-off-by: Wang Yong <wang.yong12@zte.com.cn> Co-developed-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Co-developed-by: Wang Yaxin <wang.yaxin@zte.com.cn> Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn> Signed-off-by: Kun Jiang <jiang.kun2@zte.com.cn> Cc: Balbir Singh <bsingharora@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Fan Yu <fan.yu9@zte.com.cn> Cc: Peilin He <he.peilin@zte.com.cn> Cc: tuqiang <tu.qiang35@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: ye xingchen <ye.xingchen@zte.com.cn> Cc: Yunkai Zhang <zhang.yunkai@zte.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:30 -08:00
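The point of the 'delay max' entry above, that an average can hide a single spike while a running maximum keeps it, can be shown with a small C sketch. The struct and function names here (`delay_stat`, `delay_record()`, `delay_average_ns()`) are hypothetical and chosen for illustration; they are not the fields or helpers the delayacct patch adds.

```c
#include <stdint.h>

/* Hypothetical per-category delay record (illustrative field names). */
struct delay_stat {
    uint64_t count;     /* number of delays observed */
    uint64_t total_ns;  /* sum of all delays, in nanoseconds */
    uint64_t max_ns;    /* largest single delay seen so far */
};

/* Account one delay: update the count, the total and the running maximum. */
static void delay_record(struct delay_stat *s, uint64_t delay_ns)
{
    s->count++;
    s->total_ns += delay_ns;
    if (delay_ns > s->max_ns)
        s->max_ns = delay_ns;
}

/* Average delay; returns 0 when nothing has been recorded. */
static uint64_t delay_average_ns(const struct delay_stat *s)
{
    return s->count ? s->total_ns / s->count : 0;
}
```

With three delays of 10 ns and one of 970 ns, the average is 250 ns, which matches none of the individual samples, while `max_ns` reports the 970 ns spike directly.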
Zijun Hu	bcaadbb2ee	kernel/resource: simplify API __devm_release_region() implementation Simplify the __devm_release_region() implementation by using the dedicated API devres_release(), which has the following advantages over the current __release_region() + devres_destroy(): It is simpler if __devm_release_region() is undoing what __devm_request_region() did; otherwise, it avoids a wrong and undesired __release_region(). Link: https://lkml.kernel.org/r/20241017-release_region_fix-v1-1-84a3e8441284@quicinc.com Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:30 -08:00
Mateusz Guzik	7bd439f024	get_task_exe_file: check PF_KTHREAD locklessly Same thing as `8ac5dc6659` ("get_task_mm: check PF_KTHREAD lockless") Nowadays PF_KTHREAD is sticky and it was never protected by ->alloc_lock. Move the PF_KTHREAD check outside of task_lock() section to make this code more understandable. Link: https://lkml.kernel.org/r/20241119143526.704986-1-mjguzik@gmail.com Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:25 -08:00
Uros Bizjak	9c4ba50565	percpu: use TYPEOF_UNQUAL() in variable declarations Use TYPEOF_UNQUAL() to declare variables as a corresponding type without named address space qualifier to avoid "`__seg_gs' specified for auto variable `var'" errors. Link: https://lkml.kernel.org/r/20241208204708.3742696-4-ubizjak@gmail.com Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Acked-by: Nadav Amit <nadav.amit@gmail.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Arnd Bergmann <arnd@arndb.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Waiman Long <longman@redhat.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:51:04 -08:00
Suren Baghdasaryan	19fbad905e	mm: convert mm_lock_seq to a proper seqcount Convert mm_lock_seq to be seqcount_t and change all mmap_write_lock variants to increment it, in line with the usual seqcount usage pattern. This lets us check whether the mmap_lock is write-locked by checking mm_lock_seq.sequence counter (odd=locked, even=unlocked). This will be used when implementing mmap_lock speculation functions. As a result vm_lock_seq is also changed to be unsigned to match the type of mm_lock_seq.sequence. Link: https://lkml.kernel.org/r/20241122174416.1367052-2-surenb@google.com Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Howells <dhowells@redhat.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Hillf Danton <hdanton@sina.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Michal Hocko <mhocko@suse.com> Cc: Minchan Kim <minchan@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Sourav Panda <souravpanda@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:50:50 -08:00
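The odd/even convention mentioned in the mm_lock_seq entry above can be sketched with a plain C11 atomic counter. This is a simplified userspace illustration, not the kernel's seqcount_t: the names `seq_lock_state`, `seq_write_begin()`, `seq_write_end()`, `seq_is_write_locked()`, `seq_try_begin()` and `seq_retry()` are hypothetical, and default sequentially consistent atomics stand in for the kernel's explicit barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical sequence counter: odd while a writer holds the lock. */
struct seq_lock_state {
    atomic_uint sequence;
};

/* Writer side: one increment on lock (value becomes odd) ... */
static void seq_write_begin(struct seq_lock_state *s)
{
    atomic_fetch_add(&s->sequence, 1);
}

/* ... and one increment on unlock (value becomes even again). */
static void seq_write_end(struct seq_lock_state *s)
{
    atomic_fetch_add(&s->sequence, 1);
}

/* odd = write-locked, even = unlocked */
static bool seq_is_write_locked(struct seq_lock_state *s)
{
    return atomic_load(&s->sequence) & 1u;
}

/* Reader side: start speculating only if no writer is active. */
static bool seq_try_begin(struct seq_lock_state *s, unsigned int *seq)
{
    unsigned int v = atomic_load(&s->sequence);

    if (v & 1u)
        return false;   /* writer in progress, do not speculate */
    *seq = v;
    return true;
}

/* Returns true if a writer ran since seq_try_begin(), so the reader must retry. */
static bool seq_retry(struct seq_lock_state *s, unsigned int seq)
{
    return atomic_load(&s->sequence) != seq;
}
```

A reader that passes `seq_try_begin()` does its lockless work and then calls `seq_retry()`; any intervening write changes the counter, so a stale result is detected rather than used.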
Nicholas Piggin	9664c5b908	lazy tlb: fix hotplug exit race with MMU_LAZY_TLB_SHOOTDOWN CPU unplug first calls __cpu_disable(), and that's where powerpc calls cleanup_cpu_mmu_context(), which clears this CPU from mm_cpumask() of all mms in the system. However this CPU may still be using a lazy tlb mm, and its mm_cpumask bit will be cleared from it. The CPU does not switch away from the lazy tlb mm until arch_cpu_idle_dead() calls idle_task_exit(). If that user mm exits in this window, it will not be subject to the lazy tlb mm shootdown and may be freed while in use as a lazy mm by the CPU that is being unplugged. cleanup_cpu_mmu_context() could be moved later, but it looks better to move the lazy tlb mm switching earlier. The problem with doing the lazy mm switching in idle_task_exit() is explained in commit `bf2c59fce4` ("sched/core: Fix illegal RCU from offline CPUs"), which added a wart to switch away from the mm but leave it set in active_mm to be cleaned up later. So instead, switch away from the lazy tlb mm at sched_cpu_wait_empty(), which is the last hotplug state before teardown (CPUHP_AP_SCHED_WAIT_EMPTY). This CPU will never switch to a user thread from this point, so it has no chance to pick up a new lazy tlb mm. This removes the lazy tlb mm handling wart in CPU unplug. With this, idle_task_exit() is not needed anymore and can be cleaned up. This leaves the prototype alone, to be cleaned after this change. herton: took the suggestions from https://lore.kernel.org/all/87jzvyprsw.ffs@tglx/ and made adjustments on the initial patch proposed by Nicholas. Link: https://lkml.kernel.org/r/20230524060455.147699-1-npiggin@gmail.com Link: https://lore.kernel.org/all/20230525205253.E2FAEC433EF@smtp.kernel.org/ Link: https://lkml.kernel.org/r/20241104142318.3295663-1-herton@redhat.com Fixes: `2655421ae6` ("lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Suggested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:50:37 -08:00
Peter Zijlstra	bf8f464ee2	kasan: make kasan_record_aux_stack_noalloc() the default behaviour kasan_record_aux_stack_noalloc() was introduced to record a stack trace without allocating memory in the process. It has been added to callers which were invoked while a raw_spinlock_t was held. More and more callers were identified and changed over time. Is it a good thing to have this while functions try their best to do a lockless setup? The only downside of having kasan_record_aux_stack() not allocate any memory is that we end up without a stacktrace if stackdepot runs out of memory and at the same time the stacktrace was not recorded before. To quote Marco Elver from https://lore.kernel.org/all/CANpmjNPmQYJ7pv1N3cuU8cP18u7PP_uoZD8YxwZd4jtbof9nVQ@mail.gmail.com/ \| I'd be in favor, it simplifies things. And stack depot should be \| able to replenish its pool sufficiently in the "non-aux" cases \| i.e. regular allocations. Worst case we fail to record some \| aux stacks, but I think that's only really bad if there's a bug \| around one of these allocations. In general the probabilities \| of this being a regression are extremely small [...] Make the kasan_record_aux_stack_noalloc() behaviour default as kasan_record_aux_stack(). [bigeasy@linutronix.de: dressed the diff as patch] Link: https://lkml.kernel.org/r/20241122155451.Mb2pmeyJ@linutronix.de Fixes: `7cb3007ce2` ("kasan: generic: introduce kasan_record_aux_stack_noalloc()") Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: syzbot+39f85d612b7c20d8db48@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67275485.050a0220.3c8d68.0a37.GAE@google.com Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Reviewed-by: Marco Elver <elver@google.com> Reviewed-by: Waiman Long <longman@redhat.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Ben Segall <bsegall@google.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Frederic Weisbecker <frederic@kernel.org> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: <kasan-dev@googlegroups.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Neeraj Upadhyay <neeraj.upadhyay@kernel.org> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Pekka Enberg <penberg@kernel.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: syzkaller-bugs@googlegroups.com Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zqiang <qiang.zhang1211@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:50:31 -08:00
Arnd Bergmann	71497ff8f3	kcov: mark in_softirq_really() as __always_inline If gcc decides not to inline in_softirq_really(), objtool warns about a function call with UACCESS enabled: kernel/kcov.o: warning: objtool: __sanitizer_cov_trace_pc+0x1e: call to in_softirq_really() with UACCESS enabled kernel/kcov.o: warning: objtool: check_kcov_mode+0x11: call to in_softirq_really() with UACCESS enabled Mark this as __always_inline to avoid the problem. Link: https://lkml.kernel.org/r/20241217071814.2261620-1-arnd@kernel.org Fixes: `7d4df2dad3` ("kcov: properly check for softirq context") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Marco Elver <elver@google.com> Cc: Aleksandr Nogikh <nogikh@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:49:58 -08:00
Lorenzo Stoakes	8ac662f5da	fork: avoid inappropriate uprobe access to invalid mm If dup_mmap() encounters an issue, currently uprobe is able to access the relevant mm via the reverse mapping (in build_map_info()), and if we are very unlucky with a race window, observe invalid XA_ZERO_ENTRY state which we establish as part of the fork error path. This occurs because uprobe_write_opcode() invokes anon_vma_prepare() which in turn invokes find_mergeable_anon_vma() that uses a VMA iterator, invoking vma_iter_load() which uses the advanced maple tree API and thus is able to observe XA_ZERO_ENTRY entries added to dup_mmap() in commit `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()"). This change was made on the assumption that only process tear-down code would actually observe (and make use of) these values. However this very unlikely but still possible edge case with uprobes exists and unfortunately does make these observable. The uprobe operation prevents races against the dup_mmap() operation via the dup_mmap_sem semaphore, which is acquired via uprobe_start_dup_mmap() and dropped via uprobe_end_dup_mmap(), and held across register_for_each_vma() prior to invoking build_map_info() which does the reverse mapping lookup. Currently these are acquired and dropped within dup_mmap(), which exposes the race window prior to error handling in the invoking dup_mm() which tears down the mm. We can avoid all this by just moving the invocation of uprobe_start_dup_mmap() and uprobe_end_dup_mmap() up a level to dup_mm() and only release this lock once the dup_mmap() operation succeeds or clean up is done. This means that the uprobe code can never observe an incompletely constructed mm and resolves the issue in this case. 
Link: https://lkml.kernel.org/r/20241210172412.52995-1-lorenzo.stoakes@oracle.com Fixes: `d240629148` ("fork: use __mt_dup() to duplicate maple tree in dup_mmap()") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: syzbot+2d788f4f7cb660dac4b7@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/6756d273.050a0220.2477f.003d.GAE@google.com/ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-12-18 19:04:44 -08:00
Kees Cook	c7c1167fcb	Merge branch 'for-next/topic/execve/core' into for-next/execve	2024-12-18 17:01:53 -08:00
Rafael J. Wysocki	ebeeee390b	PM: EM: Move sched domains rebuild function from schedutil to EM Function sugov_eas_rebuild_sd() defined in the schedutil cpufreq governor implements generic functionality that may be useful in other places. In particular, there is a plan to use it in the intel_pstate driver in the future. For this reason, move it from schedutil to the energy model code and rename it to em_rebuild_sched_domains(). This also helps to get rid of some #ifdeffery in schedutil which is a plus. No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com>	2024-12-18 20:32:13 +01:00


46783 Commits