linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-17 18:36:00 +00:00

Author	SHA1	Message	Date
Linus Torvalds	12cc5240f4	This pull request contains the following changes for UML: - Removal of dead code (TT mode leftovers, etc.) - Fixes for the network vector driver - Fixes for time-travel mode -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmb3Bf8WHHJpY2hhcmRA c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wZLED/9xfTCB2l8gbiw+vslfXNRKdt6A jcqC4MfZ/dxkt5X2iy9pGROBl0dRe2WAbdSJIBIQikVDaTWg4Vix7WcnMs+FTXxk gGlJU8xx69mh7MWH0DopJrCHYX1SxA4bxve4J7iljTzuMleYhUUJ6bdqmG9XBRBN M1hYWpAodR/BPaDRZtLpzdYuuo3cZPM3ESpke5GupsFEWxqRZqvdlgwdsb9RfrOe 3HvWZUMrw+hWJg1NkQ7+ljqPjoaafGu8/ic89r44bNtNQqN+o5b1v1E792e9qCJD 1jRXTQMtTDH+ETgYskzWEIFyMQ4WlRy/N+mKKUHUJrTwAm76zdSpxmvU2Fh8cvLy ofWdbtqR127WnKii7UpZLf6kXzmC6pcmLbHU78PohoOnEjk4TeMjEKw6FrcSZY51 wGgz29mLJOZ33mh7So37bRU/x5OKkq0u+BHyrhZYiHXdcBN8R5KBSJlWvl+A+A7y F5VpUvqAazc6H0HazZDWtoPDJ4HpbSEbH/8G4rR3IlZ4DmqyRuOr6f4AeiEPFz2n VNQVivgFL59zPflo8eWsLQvK8ZaZSop05RDYRk53uMooUZUNKvhTFmIRCb6bNFT/ c4Ycoi3qa+YQQxSUEKmrE71dOYxdT1nHvl7YgR0BGABWYt2G1j/UyTjJMJ7L49Ws HGpAnI4FdPELu3qIVA== =DHl2 -----END PGP SIGNATURE----- Merge tag 'uml-for-linus-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Removal of dead code (TT mode leftovers, etc) - Fixes for the network vector driver - Fixes for time-travel mode * tag 'uml-for-linus-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: fix time-travel syscall scheduling hack um: Remove outdated asm/sysrq.h header um: Remove the declaration of user_thread function um: Remove the call to SUBARCH_EXECVE1 macro um: Remove unused mm_fd field from mm_id um: Remove unused fields from thread_struct um: Remove the redundant newpage check in update_pte_range um: Remove unused kpte_clear_flush macro um: Remove obsoleted declaration for execute_syscall_skas user_mode_linux_howto_v2: add VDE vector support in doc vector_user: add VDE support um: remove ARCH_NO_PREEMPT_DYNAMIC um: vector: Fix NAPI budget handling um: vector: Replace locks guarding queue depth with atomics um: remove variable stack array in os_rcv_fd_msg()	2024-09-27 12:48:48 -07:00
Johannes Berg	381d2f95c8	um: fix time-travel syscall scheduling hack The schedule() call there really never did anything at least since the introduction of the EEVDF scheduler, but now I found a case where we permanently hang in a loop of -ERESTARTNOINTR (due to locking.) Work around it by making any syscalls with error return take time (and then schedule after) so we cannot hang in such a loop forever. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:46:23 +02:00
Tiwei Bie	ae0dc67c25	um: Remove outdated asm/sysrq.h header This header no longer serves a purpose after show_trace was removed by commit 9d1ee8ce92e1 ("um: Rewrite show_stack()"). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:44:11 +02:00
Tiwei Bie	bf67dbf4f7	um: Remove the call to SUBARCH_EXECVE1 macro This macro has never been defined by any supported sub-architectures in tree since it was introduced by commit 1d3468a6643a ("[PATCH uml: move _kern.c files"). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:42:22 +02:00
Tiwei Bie	59376fb2a7	um: Remove unused mm_fd field from mm_id It's no longer used since the removal of the SKAS3/4 support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:36:22 +02:00
Tiwei Bie	94090f418f	um: Remove unused fields from thread_struct These fields are no longer used since the removal of tt mode. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:35:35 +02:00
Tiwei Bie	669afa4e87	um: Remove the redundant newpage check in update_pte_range The two checks have been identical since commit ef714f15027c ("um: remove force_flush_all from fork_handler"). And the inner one isn't necessary anymore. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-09-12 20:34:39 +02:00
Daniel Vetter	91dae758bd	drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK intterupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmareygACgkQaA3BHVML eiO2vwf9FirbMiq4lfHzgcbNIU1dTUtjRAZjrlwGmqk5cb9lUshAMCMBMOEQBDdg XMQQj/RMBvRUuxzsPGk78ObSz5FBaBLgKwFprer0V6uslQaJxj4YRsnkp0l2n+0k +ebhfo2rUgZOdgNOkXH326w9UhqiydIa7GaA2aq1vUzXKFDfvGXtSN75BMlEWlKP rTft56AiwjwcKu7zYFHGlFUMSNpKAQy7lnV3+dBXAfFNHu4zVNoI/yWGEOdR7eVo WhiEcpvismsOh+BfUvMNPP3RKwjXHdwMlJYb+v9XGgH27hqc50lSceWydHtoJTto DTXF9WQhJ+/GQR9ZGmBjos9GVbECDA== =L/1W -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2024-08-01' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-next for v6.12: UAPI Changes: virtio: - Define DRM capset Cross-subsystem Changes: dma-buf: - heaps: Clean up documentation printk: - Pass description to kmsg_dump() Core Changes: CI: - Update IGT tests - Point upstream repo to GitLab instance modesetting: - Introduce Power Saving Policy property for connectors - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support panic: - Avoid build-time interference with framebuffer console docs: - Document Colorspace property scheduler: - Remove full_recover from drm_sched_start TTM: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory Driver Changes: amdgpu: - Support Power Saving Policy connector property ast: - astdp: Support AST2600 with VGA; Clean up HPD bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable gma500: - Update i2c terminology ivpu: - Add MODULE_FIRMWARE() lcdif: - Fix pixel clock loongson: - Use GEM refcount over TTM's mgag200: - Improve BMC handling - Support VBLANK intterupts nouveau: - Refactor and clean up internals - Use GEM refcount over TTM's panel: - Shutdown fixes plus documentation - Refactor several drivers for better code sharing - boe-th101mb31ig002: Support for starry-er88577 MIPI-DSI panel plus DT; Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: Support Melfas lmfbx101117480 MIPI-DSI panel plus DT; Refactor for code sharing sti: - Fix module owner stm: - Avoid UAF wih managed plane and CRTC helpers - Fix module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: Fix transparency after disabling plane; Remove unused interrupt tegra: - Call drm_atomic_helper_shutdown() v3d: - Clean up perfmon vkms: - Clean up Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240801121406.GA102996@linux.fritz.box	2024-08-08 18:58:46 +02:00
Jocelyn Falempe	e1a261ba59	printk: Add a short description string to kmsg_dump() kmsg_dump doesn't forward the panic reason string to the kmsg_dumper callback. This patch adds a new struct kmsg_dump_detail, that will hold the reason and description, and pass it to the dump() callback. To avoid updating all kmsg_dump() call, it adds a kmsg_dump_desc() function and a macro for backward compatibility. I've written this for drm_panic, but it can be useful for other kmsg_dumper. It allows to see the panic reason, like "sysrq triggered crash" or "VFS: Unable to mount root fs on xxxx" on the drm panic screen. v2: * Use a struct kmsg_dump_detail to hold the reason and description pointer, for more flexibility if we want to add other parameters. (Kees Cook) * Fix powerpc/nvram_64 build, as I didn't update the forward declaration of oops_to_nvram() Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Acked-by: Petr Mladek <pmladek@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Kees Cook <kees@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240702122639.248110-1-jfalempe@redhat.com	2024-07-17 12:35:24 +02:00
Johannes Berg	86abcd6eeb	um: register power-off handler Otherwise we always get reboot: Power off not available: System halted instead which is really quite pointless. Link: https://patch.msgid.link/20240703173839.fcbb538c6686.I3d333f4773cff93c4337c4d128ee0b1b501b3dfa@changeid Acked-By: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-04 12:03:20 +02:00
Anton Ivanov	cd01672d64	um: Enable preemption in UML Since userspace state is saved in the MM process, kernel using FPU still doesn't really need to do anything, so this really is as simple as enabling preemption. The irq critical section in sigio_handler() needs preempt_disable()/preempt_enable(). Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Link: https://patch.msgid.link/20240702102549.d2fcea450854.I12f5a53d80ec1e425e66ef272b1e95cb523b608e@changeid [rebase, remove FPU save/restore, fix x86/um Makefile, rewrite commit message] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:10:43 +02:00
Benjamin Berg	bcf3d957c6	um: refactor TLB update handling Conceptually, we want the memory mappings to always be up to date and represent whatever is in the TLB. To ensure that, we need to sync them over in the userspace case and for the kernel we need to process the mappings. The kernel will call flush_tlb_* if page table entries that were valid before become invalid. Unfortunately, this is not the case if entries are added. As such, change both flush_tlb_* and set_ptes to track the memory range that has to be synchronized. For the kernel, we need to execute a flush_tlb_kern_* immediately but we can wait for the first page fault in case of set_ptes. For userspace in contrast we only store that a range of memory needs to be synced and do so whenever we switch to that process. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-13-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:50 +02:00
Benjamin Berg	573a446fc8	um: simplify and consolidate TLB updates The HVC update was mostly used to compress consecutive calls into one. This is mostly relevant for userspace where it is already handled by the syscall stub code. Simplify the whole logic and consolidate it for both kernel and userspace. This does remove the sequential syscall compression for the kernel, however that shouldn't be the main factor in most runs. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-12-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:50 +02:00
Benjamin Berg	ef714f1502	um: remove force_flush_all from fork_handler There should be no need for this. It may be that this used to work around another issue where after a clone the MM was in a bad state. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-11-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:50 +02:00
Benjamin Berg	5168f6b4a4	um: Do not flush MM in flush_thread There should be no need to flush the memory in flush_thread. Doing this likely worked around some issue where memory was still incorrectly mapped when creating or cloning an MM. With the removal of the special clone path, that isn't relevant anymore. However, add the flush into MM initialization so that any new userspace MM is guaranteed to be clean. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-10-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:49 +02:00
Benjamin Berg	3c83170d7c	um: Delay flushing syscalls until the thread is restarted As running the syscalls is expensive due to context switches, we should do so as late as possible in case more syscalls need to be queued later on. This will also benefit a later move to a SECCOMP enabled userspace as in that case the need for extra context switches is removed entirely. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Link: https://patch.msgid.link/20240703134536.1161108-9-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:49 +02:00
Benjamin Berg	a5d2cfe749	um: remove copy_context_skas0 The kernel flushes the memory ranges anyway for CoW and does not assume that the userspace process has anything set up already. So, start with a fresh process for the new mm context. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-8-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:49 +02:00
Benjamin Berg	7911b650a0	um: remove LDT support The current LDT code has a few issues that mean it should be redone in a different way once we always start with a fresh MM even when cloning. In a new and better world, the kernel would just ensure its own LDT is clear at startup. At that point, all that is needed is a simple function to populate the LDT from another MM in arch_dup_mmap combined with some tracking of the installed LDT entries for each MM. Note that the old implementation was even incorrect with regard to reading, as it copied out the LDT entries in the internal format rather than converting them to the userspace structure. Removal should be fine as the LDT is not used for thread-local storage anymore. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20240703134536.1161108-7-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:49 +02:00
Benjamin Berg	76ed9158e1	um: Rework syscall handling Rework syscall handling to be platform independent. Also create a clean split between queueing of syscalls and flushing them out, removing the need to keep state in the code that triggers the syscalls. The code adds syscall_data_len to the global mm_id structure. This will be used later to allow surrounding code to track whether syscalls still need to run and if errors occurred. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Link: https://patch.msgid.link/20240703134536.1161108-5-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:49 +02:00
Benjamin Berg	dc26184a9d	um: Create signal stack memory assignment in stub_data When we switch to use seccomp, we need both the signal stack and other data (i.e. syscall information) to co-exist in the stub data. To facilitate this, start by defining separate memory areas for the stack and syscall data. This moves the signal stack onto a new page as the memory area is not sufficient to hold both signal stack and syscall information. Only change the signal stack setup for now, as the syscall code will be reworked later. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Link: https://patch.msgid.link/20240703134536.1161108-3-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:48 +02:00
Johannes Berg	4561022588	um: time-travel: remove time_exit() This function is unused and unneeded, remove it. Link: https://patch.msgid.link/20240703130105.02b3a974acb7.I7264821f7cfa17ea713b7a3e4787aa41a3107d01@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 17:09:08 +02:00
Johannes Berg	bfb80d8bc9	um: add shared memory optimisation for time-travel=ext With external time travel, a LOT of message can end up being exchanged on the socket, taking a significant amount of time just to do that. Add a new shared memory optimisation to that, where a number of changes are made: - the controller sends a client ID and a shared memory FD (and a logging FD we don't use) in the ACK message to the initial START - the shared memory holds the current time and the free_until value, so that there's no need to exchange messages for that - if the client that's running has shared memory support, any client (the running one included) can request the next time it wants to run inside the shared memory, rather than sending a message, by also updating the free_until value - when shared memory is enabled, RUN/WAIT messages no longer have an ACK, further cutting down on messages Together, this can reduce the number of messages very significantly, and reduce overall test/simulation run time. Co-developed-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://patch.msgid.link/20240702192118.6ad0a083f574.Ie41206c8ce4507fe26b991937f47e86c24ca7a31@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:24:54 +02:00
Johannes Berg	5cde6096a4	um: generalize os_rcv_fd Change os_rcv_fd() to os_rcv_fd_msg() that can more generally receive any number of FDs in any kind of message. Link: https://patch.msgid.link/20240702192118.40b78b2bfe4e.Ic6ec12d72630e5bcae1e597d6bd5c6f29f441563@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:24:25 +02:00
Mordechay Goodstein	6555acdefc	um: time-travel: support time-travel protocol broadcast messages Add a message type to the time-travel protocol to broadcast a small (64-bit) value to all participants in a simulation. The main use case is to have an identical message come to all participants in a simulation, e.g. to separate out logs for different tests running in a single simulation. Down in the guts of time_travel_handle_message() we can't use printk() and not even printk_deferred(), so just store the message and print it at the start of the userspace() function. Unfortunately this means that other prints in the kernel can actually bypass the message, but in most cases where this is used, for example to separate test logs, userspace will be involved. Also, even if we could use printk_deferred(), we'd still need to flush it out in the userspace() function since otherwise userspace messages might cross it. As a result, this is a reasonable compromise, there's no need to have any core changes and it solves the main use case we have for it. Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://patch.msgid.link/20240702192118.c4093bc5b15e.I2ca8d006b67feeb866ac2017af7b741c9e06445a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:24:22 +02:00
Wei Yang	5cd93c7532	um/mm: remove redundant assignment of max_low_pfn Current calculation of max_low_pfn is introduced in commit af84eab20891 ("[PATCH] uml: fix LVM crash"). It is intended to set max_low_pfn to the same value as max_pfn. But I am not sure why the max_pfn is set to totalram_pages, which represents the number of usable pages in system instead of an absolute page frame number. (The change history stops there.) While we have already calculate it in setup_physmem(), so not necessary to do it again. Also this would help changing totalram_pages accounting, since we plan to move the accounting into __free_pages_core(). With this change, totalram_pages may not represent the total usable pages at this point, since some pages would be deferred initialized. Signed-off-by: Wei Yang <richard.weiyang@gmail.com> CC: Jeff Dike <jdike@linux.intel.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Alasdair G Kergon <agk@redhat.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Mike Rapoport (IBM) <rppt@kernel.org> CC: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://patch.msgid.link/20240615034150.2958-1-richard.weiyang@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:22:39 +02:00
Tiwei Bie	cb2759431a	um: Remove /proc/sysemu support code Currently /proc/sysemu will never be registered, as sysemu_supported is initialized to zero implicitly and no code updates it. And there is also nothing to configure via sysemu in UML anymore. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20240527134024.1539848-3-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:21:57 +02:00
Tiwei Bie	6fdae1da76	um: Remove unused ncpus variable It's no longer used. And uml_ncpus_setup doesn't exist anymore. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20240527134024.1539848-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:21:57 +02:00
Johannes Berg	7d0a8a490a	um: time-travel: fix time-travel-start option We need to have the = as part of the option so that the value can be parsed properly. Also document that it must be given in nanoseconds, not seconds. Fixes: 065038706f77 ("um: Support time travel mode") Link: https://patch.msgid.link/20240417102744.14b9a9d4eba0.Ib22e9136513126b2099d932650f55f193120cd97@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:21:16 +02:00
Benjamin Berg	c140a5bd5d	um: irqs: process outstanding IRQs when unblocking signals When in time-travel mode, the eventfd events are read even when signals are blocked as SIGIO still needs to be processed. In this case, the event is cleared on the eventfd but the IRQ still needs to be fired later. We did already ensure that the SIGIO handler is run again. However, the FDs are configured to be level triggered, so that eventfd will not notify again. As such, add some logic to mark the IRQ as pending and process it at the next opportunity. To avoid duplication, reuse the logic used for the suspend/resume case. This does not really change anything except for delaying running the IRQs with timetravel_handler at a slightly later point in time (and possibly running non-timetravel IRQs that shouldn't happen earlier). While at it, move marking as pending into irq_event_handler as that is the more logical place for it to happen. Signed-off-by: Benjamin Berg <benjamin.berg@intel.com> Link: https://patch.msgid.link/20231018123643.1255813-1-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-07-03 12:18:01 +02:00
Linus Torvalds	2313022ec5	This pull request contains the following changes for UML: - Fixes for -Wmissing-prototypes warnings and further cleanup - Remove callback returning void from rtc and virtio drivers - Fix bash location -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmZQ/FYWHHJpY2hhcmRA c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wfbLEAC1X55jjignxMIt4gEbtOXL2Pgn Md3z8sr5QhyQeLEkoYEhAqYHcKYY8A9ZshfNS4RNTbhU6qaFQBNwbBuFnJ1MsllC 236EKgy0xFChgqH0bszGW97VRcIs79qauDt0mE0AXQGpuW7AjJX9chT2ikp9Sr5z P2Gnp7+l/OaAH7UXFpaYYOWOzRAQCbA67hN3nRcSBCPq+Plw2bQCCKKK0g4UwqmI vukAguO3eGZ0B4oQEsPX/krM0IigM01l5pJVhkdNzJgMOfd7eWb3o3juE35f4KPx vSd8LPmoBvDJt9dKbZE38fC58+U9qWDcBDLfDlf7F0dGtWQi6QeZmrmQSteQUAFF YWHllQ+P6xdh1kdSXWk8IesVINydMAc79DpqmKkEUgmCGVX+grt40aOTnOIUuzjq 9lMcfKgjjBz6qsC3fWyGMvjaPpRRbe4G1wnAOij+hdBNR2fEFaqv8Dx9Zx42G3lm oYDylqjP73SbtOKbTCdHTqOfTSC83KYmo6w5ttwnFZcDVtbXRY8NejIX08Go8KIn OXeZ8Pxf3DmQ4yuhE3mWOoT/eFiZnXpoNiteQZ/8RhyPMJllVijtSIlnLteuah4d Z68Nh9/P52VcjMH0wS1eTKrkUAgfGBQ3kIOZqbU8UMSeq8vTB2kx++HwAtmUNi07 pDaNOQVtW5m4HMhVlw== =umlG -----END PGP SIGNATURE----- Merge tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Fixes for -Wmissing-prototypes warnings and further cleanup - Remove callback returning void from rtc and virtio drivers - Fix bash location * tag 'uml-for-linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (26 commits) um: virtio_uml: Convert to platform remove callback returning void um: rtc: Convert to platform remove callback returning void um: Remove unused do_get_thread_area function um: Fix -Wmissing-prototypes warnings for __vdso_* um: Add an internal header shared among the user code um: Fix the declaration of kasan_map_memory um: Fix the -Wmissing-prototypes warning for get_thread_reg um: Fix the -Wmissing-prototypes warning for __switch_mm um: Fix -Wmissing-prototypes warnings for (rt_)sigreturn um: Stop tracking host PID in cpu_tasks um: process: remove unused 'n' variable um: vector: remove unused len variable/calculation um: vector: fix bpfflash parameter evaluation um: slirp: remove set but unused variable 'pid' um: signal: move pid variable where needed um: Makefile: use bash from the environment um: Add winch to winch_handlers before registering winch IRQ um: Fix -Wmissing-prototypes warnings for __warp_* and foo um: Fix -Wmissing-prototypes warnings for text_poke* um: Move declarations to proper headers ...	2024-05-25 13:17:48 -07:00
Masahiro Yamada	b1992c3772	kbuild: use $(src) instead of $(srctree)/$(src) for source directory Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for checked-in source files. It is merely a convention without any functional difference. In fact, $(obj) and $(src) are exactly the same, as defined in scripts/Makefile.build: src := $(obj) When the kernel is built in a separate output directory, $(src) does not accurately reflect the source directory location. While Kbuild resolves this discrepancy by specifying VPATH=$(srctree) to search for source files, it does not cover all cases. For example, when adding a header search path for local headers, -I$(srctree)/$(src) is typically passed to the compiler. This introduces inconsistency between upstream and downstream Makefiles because $(src) is used instead of $(srctree)/$(src) for the latter. To address this inconsistency, this commit changes the semantics of $(src) so that it always points to the directory in the source tree. Going forward, the variables used in Makefiles will have the following meanings: $(obj) - directory in the object tree $(src) - directory in the source tree (changed by this commit) $(objtree) - the top of the kernel object tree $(srctree) - the top of the kernel source tree Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced with $(src). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>	2024-05-10 04:34:52 +09:00
Tiwei Bie	f95bab8610	um: Stop tracking host PID in cpu_tasks The host PID tracked in 'cpu_tasks' is no longer used. Stopping tracking it will also save some cycles. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-30 14:11:30 +02:00
Johannes Berg	dac847ae2b	um: process: remove unused 'n' variable The return value of fn() wasn't used for a long time, so no need to assign it to a variable, addressing a W=1 warning. This seems to be - with patches from others posted to the list before - the last W=1 warning in arch/um/. Fixes: 22e2430d60db ("x86, um: convert to saner kernel_execve() semantics") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 22:30:11 +02:00
Tiwei Bie	19cf791573	um: Fix -Wmissing-prototypes warnings for text_poke* The prototypes for text_poke* are declared in asm/text-patching.h under arch/x86/include/. It's safe to include this header, as it's UML-aware (by checking CONFIG_UML_X86). This will address below -Wmissing-prototypes warnings: arch/um/kernel/um_arch.c:461:7: warning: no previous prototype for ‘text_poke’ [-Wmissing-prototypes] arch/um/kernel/um_arch.c:473:6: warning: no previous prototype for ‘text_poke_sync’ [-Wmissing-prototypes] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:58:48 +02:00
Tiwei Bie	a4b4382f3e	um: Move declarations to proper headers This will address below -Wmissing-prototypes warnings: arch/um/kernel/initrd.c:18:12: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes] arch/um/kernel/um_arch.c:408:19: warning: no previous prototype for ‘read_initrd’ [-Wmissing-prototypes] arch/um/os-Linux/start_up.c:301:12: warning: no previous prototype for ‘parse_iomem’ [-Wmissing-prototypes] arch/x86/um/ptrace_32.c:15:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes] arch/x86/um/ptrace_32.c:101:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes] arch/x86/um/ptrace_32.c:153:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes] arch/x86/um/ptrace_64.c:111:5: warning: no previous prototype for ‘poke_user’ [-Wmissing-prototypes] arch/x86/um/ptrace_64.c:171:5: warning: no previous prototype for ‘peek_user’ [-Wmissing-prototypes] arch/x86/um/syscalls_64.c:48:6: warning: no previous prototype for ‘arch_switch_to’ [-Wmissing-prototypes] arch/x86/um/tls_32.c:184:5: warning: no previous prototype for ‘arch_switch_tls’ [-Wmissing-prototypes] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:58:48 +02:00
Tiwei Bie	9ffc6724a3	um: Add missing headers This will address below -Wmissing-prototypes warnings: arch/um/kernel/mem.c:202:8: warning: no previous prototype for ‘pgd_alloc’ [-Wmissing-prototypes] arch/um/kernel/mem.c:215:7: warning: no previous prototype for ‘uml_kmalloc’ [-Wmissing-prototypes] arch/um/kernel/process.c:207:6: warning: no previous prototype for ‘arch_cpu_idle’ [-Wmissing-prototypes] arch/um/kernel/process.c:328:15: warning: no previous prototype for ‘arch_align_stack’ [-Wmissing-prototypes] arch/um/kernel/reboot.c:45:6: warning: no previous prototype for ‘machine_restart’ [-Wmissing-prototypes] arch/um/kernel/reboot.c:51:6: warning: no previous prototype for ‘machine_power_off’ [-Wmissing-prototypes] arch/um/kernel/reboot.c:57:6: warning: no previous prototype for ‘machine_halt’ [-Wmissing-prototypes] arch/um/kernel/skas/mmu.c:17:5: warning: no previous prototype for ‘init_new_context’ [-Wmissing-prototypes] arch/um/kernel/skas/mmu.c:60:6: warning: no previous prototype for ‘destroy_context’ [-Wmissing-prototypes] arch/um/kernel/skas/process.c:36:12: warning: no previous prototype for ‘start_uml’ [-Wmissing-prototypes] arch/um/kernel/time.c:807:15: warning: no previous prototype for ‘calibrate_delay_is_known’ [-Wmissing-prototypes] arch/um/kernel/tlb.c:594:6: warning: no previous prototype for ‘force_flush_all’ [-Wmissing-prototypes] arch/x86/um/bugs_32.c:22:6: warning: no previous prototype for ‘arch_check_bugs’ [-Wmissing-prototypes] arch/x86/um/bugs_32.c:44:6: warning: no previous prototype for ‘arch_examine_signal’ [-Wmissing-prototypes] arch/x86/um/bugs_64.c:9:6: warning: no previous prototype for ‘arch_check_bugs’ [-Wmissing-prototypes] arch/x86/um/bugs_64.c:13:6: warning: no previous prototype for ‘arch_examine_signal’ [-Wmissing-prototypes] arch/x86/um/elfcore.c:10:12: warning: no previous prototype for ‘elf_core_extra_phdrs’ [-Wmissing-prototypes] arch/x86/um/elfcore.c:15:5: warning: no previous prototype for ‘elf_core_write_extra_phdrs’ [-Wmissing-prototypes] arch/x86/um/elfcore.c:42:5: warning: no previous prototype for ‘elf_core_write_extra_data’ [-Wmissing-prototypes] arch/x86/um/elfcore.c:63:8: warning: no previous prototype for ‘elf_core_extra_data_size’ [-Wmissing-prototypes] arch/x86/um/fault.c:18:5: warning: no previous prototype for ‘arch_fixup’ [-Wmissing-prototypes] arch/x86/um/os-Linux/mcontext.c:7:6: warning: no previous prototype for ‘get_regs_from_mc’ [-Wmissing-prototypes] arch/x86/um/os-Linux/tls.c:22:6: warning: no previous prototype for ‘check_host_supports_tls’ [-Wmissing-prototypes] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:46:20 +02:00
Tiwei Bie	179d83d89c	um: Fix the return type of __switch_to Make it match the declaration in asm-generic/switch_to.h. And also include the header to allow the compiler to check it. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:45:41 +02:00
Tiwei Bie	b5e0950fd6	um: Remove unused functions These functions are not used anymore. Removing them will also address below -Wmissing-prototypes warnings: arch/um/kernel/process.c:51:5: warning: no previous prototype for ‘pid_to_processor_id’ [-Wmissing-prototypes] arch/um/kernel/process.c:253:5: warning: no previous prototype for ‘copy_to_user_proc’ [-Wmissing-prototypes] arch/um/kernel/process.c:263:5: warning: no previous prototype for ‘clear_user_proc’ [-Wmissing-prototypes] arch/um/kernel/tlb.c:579:6: warning: no previous prototype for ‘flush_tlb_mm_range’ [-Wmissing-prototypes] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:44:52 +02:00
Tiwei Bie	53471c5749	um: Make local functions and variables static This will also fix the warnings like: warning: no previous prototype for ‘fork_handler’ [-Wmissing-prototypes] 140 \| void fork_handler(void) \| ^~~~~~~~~~~~ Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-04-22 21:43:03 +02:00
Stephen Boyd	221a819aa3	um: Unconditionally call unflatten_device_tree() Call this function unconditionally so that we can populate an empty DTB on platforms that don't boot with a command line provided DTB. There's no harm in calling unflatten_device_tree() unconditionally. If there isn't a valid initial_boot_params dtb then unflatten_device_tree() returns early. Cc: Rob Herring <robh+dt@kernel.org> Cc: Frank Rowand <frowand.list@gmail.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-um@lists.infradead.org Signed-off-by: Stephen Boyd <sboyd@kernel.org> Link: https://lore.kernel.org/r/20240217010557.2381548-4-sboyd@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>	2024-03-08 12:50:39 -06:00
Linus Torvalds	6cff79f4b9	This pull request contains the following changes for UML: - Clang coverage support - Many cleanups from Benjamin Berg - Various minor fixes -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmWi86sWHHJpY2hhcmRA c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wWB9D/0XLw6RzLkav+fZzaNhNsP1jzGn cQ3Yl29WcCsL+7iMVy/fLPZns8SlgBtVDZ6YZMNxrjjOBLu1VTOv30dNwYpvKUDj ERakIpZeqg8+w6+pOK3lLWRkjjBVDrFM+e6vgZ+2iEFkvHhyHk7LhcfAanC3CTUO qU4dLqZDpO4H98xhkbksc847I8Yg8mx9vLrqZBxQ8tfT2tLHMfzuR7kCvRfxOFso 4QDe4NImgEF5WLlK3Aj92NWxbMUZvw1f0bEuJGQJPr6GsLbNcsrlQCZIis/mov82 6176c3Jl1lHyBYsYyPAagVpqVBN+Q1gL1+YGz4Nv+wZFTqk5RvxMzVDdkpP6QEAV ZJjdJdBJHrAqrzIdiY8XfPYBsu5iZYCLr8q3wJxQYYBrV3s8eXsFXMqoLHKBVOe6 OSgTv2OJrSvXZzXgI+5nysVVqOeoImcRzQXuVHha5jhrf6sPeGxB+hOuTccPLOG4 kIEoC+8CaR+vlHBspg2qU+wN0QAqvIFPID05p0GM37UnskxeOxSksMSkwPAVdhqD 6Mq+L5TDx6ShRfQ0KC1PGmByv312bRLudZ34BBAhoEspixI1AISbDn2wX7pKqERq yC9h/49XmjqcCjmE2TsQ1k4zhUSo8CT/Z8SUmrmqnbrUmchdRCEBplolZpBbKfGO U+l4hGt7O5BrPO0Tzw== =oH6k -----END PGP SIGNATURE----- Merge tag 'uml-for-linus-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Richard Weinberger: - Clang coverage support - Many cleanups from Benjamin Berg - Various minor fixes * tag 'uml-for-linus-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: um: Mark 32bit syscall helpers as clobbering memory um: Remove unused register save/restore functions um: Rely on PTRACE_SETREGSET to set FS/GS base registers Documentation: kunit: Add clang UML coverage example arch: um: Add Clang coverage support um: time-travel: fix time corruption um: net: Fix return type of uml_net_start_xmit() um: Always inline stub functions um: Do not use printk in userspace trampoline um: Reap winch thread if it fails um: Do not use printk in SIGWINCH helper thread um: Don't use vfprintf() for os_info() um: Make errors to stop ptraced child fatal during startup um: Drop NULL check from start_userspace um: Drop support for hosts without SYSEMU_SINGLESTEP support um: document arch_futex_atomic_op_inuser um: mmu: remove stub_pages um: Fix naming clash between UML and scheduler um: virt-pci: fix platform map offset	2024-01-17 10:44:34 -08:00
Kirill A. Shutemov	5e0a760b44	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER commit 23baf831a32c ("mm, treewide: redefine MAX_ORDER sanely") has changed the definition of MAX_ORDER to be inclusive. This has caused issues with code that was not yet upstream and depended on the previous definition. To draw attention to the altered meaning of the define, rename MAX_ORDER to MAX_PAGE_ORDER. Link: https://lkml.kernel.org/r/20231228144704.14033-2-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-01-08 15:27:15 -08:00
Johannes Berg	abe4eaa861	um: time-travel: fix time corruption In 'basic' time-travel mode (without =inf-cpu or =ext), we still get timer interrupts. These can happen at arbitrary points in time, i.e. while in timer_read(), which pushes time forward just a little bit. Then, if we happen to get the interrupt after calculating the new time to push to, but before actually finishing that, the interrupt will set the time to a value that's incompatible with the forward, and we'll crash because time goes backwards when we do the forwarding. Fix this by reading the time_travel_time, calculating the adjustment, and doing the adjustment all with interrupts disabled. Reported-by: Vincent Whitchurch <Vincent.Whitchurch@axis.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-01-05 00:29:51 +01:00
Benjamin Berg	a55719847d	um: Drop support for hosts without SYSEMU_SINGLESTEP support These features have existed since Linux 2.6.14 and can be considered widely available at this point. Also drop the backward compatibility code for PTRACE_SETOPTIONS. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> ---- v2: * Continue to define PTRACE_SYSEMU_SINGLESTEP as glibc only added it in version 2.27. Signed-off-by: Richard Weinberger <richard@nod.at>	2024-01-04 23:29:11 +01:00
Anton Ivanov	a8e75902f4	um: document arch_futex_atomic_op_inuser arch_futex_atomic_op_inuser was not documented correctly resulting in build time warnings. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-01-04 22:08:57 +01:00
Anton Ivanov	541d4e4d43	um: Fix naming clash between UML and scheduler __cant_sleep was already used and exported by the scheduler. The name had to be changed to a UML specific one. Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com> Reviewed-by: Peter Lafreniere <peter@n8pjl.ca> Signed-off-by: Richard Weinberger <richard@nod.at>	2024-01-04 21:22:27 +01:00
Nick Desaulniers	ab7ca2eb63	um: fix 3 instances of -Wmissing-prototypes Fixes the following build errors observed from W=1 builds: arch/um/drivers/xterm_kern.c:35:5: warning: no previous prototype for function 'xterm_fd' [-Wmissing-prototypes] 35 \| int xterm_fd(int socket, int pid_out) \| ^ arch/um/drivers/xterm_kern.c:35:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 35 \| int xterm_fd(int socket, int pid_out) \| ^ \| static arch/um/drivers/chan_kern.c:183:6: warning: no previous prototype for function 'free_irqs' [-Wmissing-prototypes] 183 \| void free_irqs(void) \| ^ arch/um/drivers/chan_kern.c:183:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 183 \| void free_irqs(void) \| ^ \| static arch/um/drivers/slirp_kern.c:18:6: warning: no previous prototype for function 'slirp_init' [-Wmissing-prototypes] 18 \| void slirp_init(struct net_device dev, void data) \| ^ arch/um/drivers/slirp_kern.c:18:1: note: declare 'static' if the function is not intended to be used outside of this translation unit 18 \| void slirp_init(struct net_device dev, void data) \| ^ \| static Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202308081050.sZEw4cQ5-lkp@intel.com/ Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2023-08-26 22:45:05 +02:00
Peter Zijlstra	be0fffa5ca	x86/alternative: Rename apply_ibt_endbr() The current name doesn't reflect what it does very well. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lkml.kernel.org/r/20230622144321.427441595%40infradead.org	2023-07-10 09:52:23 +02:00
Linus Torvalds	9471f1f2f5	Merge branch 'expand-stack' This modifies our user mode stack expansion code to always take the mmap_lock for writing before modifying the VM layout. It's actually something we always technically should have done, but because we didn't strictly need it, we were being lazy ("opportunistic" sounds so much better, doesn't it?) about things, and had this hack in place where we would extend the stack vma in-place without doing the proper locking. And it worked fine. We just needed to change vm_start (or, in the case of grow-up stacks, vm_end) and together with some special ad-hoc locking using the anon_vma lock and the mm->page_table_lock, it all was fairly straightforward. That is, it was all fine until Ruihan Li pointed out that now that the vma layout uses the maple tree code, we really don't just change vm_start and vm_end any more, and the locking really is broken. Oops. It's not actually all _that_ horrible to fix this once and for all, and do proper locking, but it's a bit painful. We have basically three different cases of stack expansion, and they all work just a bit differently: - the common and obvious case is the page fault handling. It's actually fairly simple and straightforward, except for the fact that we have something like 24 different versions of it, and you end up in a maze of twisty little passages, all alike. - the simplest case is the execve() code that creates a new stack. There are no real locking concerns because it's all in a private new VM that hasn't been exposed to anybody, but lockdep still can end up unhappy if you get it wrong. - and finally, we have GUP and page pinning, which shouldn't really be expanding the stack in the first place, but in addition to execve() we also use it for ptrace(). And debuggers do want to possibly access memory under the stack pointer and thus need to be able to expand the stack as a special case. None of these cases are exactly complicated, but the page fault case in particular is just repeated slightly differently many many times. And ia64 in particular has a fairly complicated situation where you can have both a regular grow-down stack _and_ a special grow-up stack for the register backing store. So to make this slightly more manageable, the bulk of this series is to first create a helper function for the most common page fault case, and convert all the straightforward architectures to it. Thus the new 'lock_mm_and_find_vma()' helper function, which ends up being used by x86, arm, powerpc, mips, riscv, alpha, arc, csky, hexagon, loongarch, nios2, sh, sparc32, and xtensa. So we not only convert more than half the architectures, we now have more shared code and avoid some of those twisty little passages. And largely due to this common helper function, the full diffstat of this series ends up deleting more lines than it adds. That still leaves eight architectures (ia64, m68k, microblaze, openrisc, parisc, s390, sparc64 and um) that end up doing 'expand_stack()' manually because they are doing something slightly different from the normal pattern. Along with the couple of special cases in execve() and GUP. So there's a couple of patches that first create 'locked' helper versions of the stack expansion functions, so that there's a obvious path forward in the conversion. The execve() case is then actually pretty simple, and is a nice cleanup from our old "grow-up stackls are special, because at execve time even they grow down". The #ifdef CONFIG_STACK_GROWSUP in that code just goes away, because it's just more straightforward to write out the stack expansion there manually, instead od having get_user_pages_remote() do it for us in some situations but not others and have to worry about locking rules for GUP. And the final step is then to just convert the remaining odd cases to a new world order where 'expand_stack()' is called with the mmap_lock held for reading, but where it might drop it and upgrade it to a write, only to return with it held for reading (in the success case) or with it completely dropped (in the failure case). In the process, we remove all the stack expansion from GUP (where dropping the lock wouldn't be ok without special rules anyway), and add it in manually to __access_remote_vm() for ptrace(). Thanks to Adrian Glaubitz and Frank Scheiner who tested the ia64 cases. Everything else here felt pretty straightforward, but the ia64 rules for stack expansion are really quite odd and very different from everything else. Also thanks to Vegard Nossum who caught me getting one of those odd conditions entirely the wrong way around. Anyway, I think I want to actually move all the stack expansion code to a whole new file of its own, rather than have it split up between mm/mmap.c and mm/memory.c, but since this will have to be backported to the initial maple tree vma introduction anyway, I tried to keep the patches _fairly_ minimal. Also, while I don't think it's valid to expand the stack from GUP, the final patch in here is a "warn if some crazy GUP user wants to try to expand the stack" patch. That one will be reverted before the final release, but it's left to catch any odd cases during the merge window and release candidates. Reported-by: Ruihan Li <lrh2000@pku.edu.cn> * branch 'expand-stack': gup: add warning if some caller would seem to want stack expansion mm: always expand the stack with the mmap write lock held execve: expand new process stack manually ahead of time mm: make find_extend_vma() fail if write lock not held powerpc/mm: convert coprocessor fault to lock_mm_and_find_vma() mm/fault: convert remaining simple cases to lock_mm_and_find_vma() arm/mm: Convert to using lock_mm_and_find_vma() riscv/mm: Convert to using lock_mm_and_find_vma() mips/mm: Convert to using lock_mm_and_find_vma() powerpc/mm: Convert to using lock_mm_and_find_vma() arm64/mm: Convert to using lock_mm_and_find_vma() mm: make the page fault mmap locking killable mm: introduce new 'lock_mm_and_find_vma()' page fault helper	2023-06-28 20:35:21 -07:00
Linus Torvalds	8d7071af89	mm: always expand the stack with the mmap write lock held This finishes the job of always holding the mmap write lock when extending the user stack vma, and removes the 'write_locked' argument from the vm helper functions again. For some cases, we just avoid expanding the stack at all: drivers and page pinning really shouldn't be extending any stacks. Let's see if any strange users really wanted that. It's worth noting that architectures that weren't converted to the new lock_mm_and_find_vma() helper function are left using the legacy "expand_stack()" function, but it has been changed to drop the mmap_lock and take it for writing while expanding the vma. This makes it fairly straightforward to convert the remaining architectures. As a result of dropping and re-taking the lock, the calling conventions for this function have also changed, since the old vma may no longer be valid. So it will now return the new vma if successful, and NULL - and the lock dropped - if the area could not be extended. Tested-by: Vegard Nossum <vegard.nossum@oracle.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> # ia64 Tested-by: Frank Scheiner <frank.scheiner@web.de> # ia64 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2023-06-27 09:41:30 -07:00

1 2 3 4 5 ...

1000 Commits