mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-07 14:32:23 +00:00
module: warn about excessively long module waits
Russell King reported that the arm cbc(aes) crypto module hangs when loaded, and Herbert Xu bisected it to commit 9b9879fc03 ("modules: catch concurrent module loads, treat them as idempotent"), and noted:

 "So what's happening here is that the first modprobe tries to load a fallback CBC implementation, in doing so it triggers a load of the exact same module due to module aliases.

  IOW we're loading aes-arm-bs which provides cbc(aes). However, this needs a fallback of cbc(aes) to operate, which is made out of the generic cbc module + any implementation of aes, or ecb(aes). The latter happens to also be provided by aes-arm-cb so that's why it tries to load the same module again"

So loading the aes-arm-bs module ends up wanting to recursively load itself, and the recursive load then ends up waiting for the original module load to complete.

This is a regression, in that it used to be that we just tried to load the module multiple times, and then as we went on to install it the second time we would instead just error out because the module name already existed.

That is actually also exactly what the original "catch concurrent loads" patch did in commit 9828ed3f69 ("module: error out early on concurrent load of the same module file"), but it turns out that it ends up being racy, in that erroring out before the module has been fully initialized will cause failures in dependent module loading.

See commit ac2263b588 (which was the revert of that "error out early" commit) for details about why erroring out before the module has been initialized is actually fundamentally racy.

Now, for the actual recursive module load (as opposed to just concurrently loading the same module twice), the race is not an issue.

At the same time it's hard for the kernel to see that this is recursion, because the module load is always done from a usermode helper, so the recursion is not some simple callchain within the kernel.

End result: this is not the real fix, but this at least adds a warning for the situation (admittedly much too late for all the debugging pain that Russell and Herbert went through) and if we can come to a resolution on how to detect the recursion properly, this re-organizes the code to make that easier.

Link: https://lore.kernel.org/all/ZrFHLqvFqhzykuYw@shell.armlinux.org.uk/
Reported-by: Russell King <linux@armlinux.org.uk>
Debugged-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent cf6d429eb6
commit cb5b81bc9a
@@ -3183,15 +3183,28 @@ static int idempotent_init_module(struct file *f, const char __user * uargs, int
 	if (!f || !(f->f_mode & FMODE_READ))
 		return -EBADF;
 
-	/* See if somebody else is doing the operation? */
-	if (idempotent(&idem, file_inode(f))) {
-		wait_for_completion(&idem.complete);
-		return idem.ret;
+	/* Are we the winners of the race and get to do this? */
+	if (!idempotent(&idem, file_inode(f))) {
+		int ret = init_module_from_file(f, uargs, flags);
+		return idempotent_complete(&idem, ret);
 	}
 
-	/* Otherwise, we'll do it and complete others */
-	return idempotent_complete(&idem,
-		init_module_from_file(f, uargs, flags));
+	/*
+	 * Somebody else won the race and is loading the module.
+	 *
+	 * We have to wait for it forever, since our 'idem' is
+	 * on the stack and the list entry stays there until
+	 * completed (but we could fix it under the idem_lock)
+	 *
+	 * It's also unclear what a real timeout might be,
+	 * but we could maybe at least make this killable
+	 * and remove the idem entry in that case?
+	 */
+	for (;;) {
+		if (wait_for_completion_timeout(&idem.complete, 10*HZ))
+			return idem.ret;
+		pr_warn_once("module '%pD' taking a long time to load", f);
+	}
 }
 
 SYSCALL_DEFINE3(finit_module, int, fd, const char __user *, uargs, int, flags)
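The comment in the hunk says the waiter must wait forever because its 'idem' entry lives on the stack and stays on a list until completed. The following is a rough, single-threaded sketch of why that is so, not the kernel's implementation: `idem_sketch`, `idem_register` and `idem_complete` are hypothetical names, a plain linked list stands in for whatever structure the kernel uses, and the locking the patch refers to as idem_lock is omitted entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the caller's on-stack idempotent entry. */
struct idem_sketch {
	const void *cookie;		/* identifies the file, e.g. its inode */
	struct idem_sketch *next;
	int ret;			/* result published by the winner */
	bool done;
};

/* Assumption: one global list; the real code would hold a lock around it. */
static struct idem_sketch *idem_list;

/*
 * Register 'u' for 'cookie'. Returns true if another entry with the same
 * cookie was already registered, meaning somebody else won the race.
 */
static bool idem_register(struct idem_sketch *u, const void *cookie)
{
	struct idem_sketch *e;
	bool existing = false;

	u->cookie = cookie;
	u->ret = 0;
	u->done = false;

	for (e = idem_list; e; e = e->next) {
		if (e->cookie == cookie) {
			existing = true;
			break;
		}
	}

	/* Winner and losers alike are linked in until completion. */
	u->next = idem_list;
	idem_list = u;
	return existing;
}

/*
 * Called by the winner: unlink every entry that shares its cookie and
 * hand each one the result. Only now may the waiters' stack frames go away.
 */
static int idem_complete(struct idem_sketch *u, int ret)
{
	const void *cookie = u->cookie;
	struct idem_sketch **pp = &idem_list;

	while (*pp) {
		struct idem_sketch *e = *pp;

		if (e->cookie == cookie) {
			*pp = e->next;
			e->ret = ret;
			e->done = true;
		} else {
			pp = &e->next;
		}
	}
	return ret;
}
```

In this sketch a waiter that returned early would leave a dangling pointer to its stack frame on the list, which is the hazard the comment alludes to when it suggests removing the entry under the lock before making the wait killable.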