Oleg Nesterov d80e731eca epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()
This patch is intentionally incomplete to simplify the review.
It ignores ep_unregister_pollwait() which plays with the same wqh.
See the next change.

epoll assumes that the EPOLL_CTL_ADD'ed file controls everything
f_op->poll() needs. In particular it assumes that the wait queue
can't go away until eventpoll_release(). This is not true in case
of signalfd, the task which does EPOLL_CTL_ADD uses its ->sighand
which is not connected to the file.

This patch adds the special event, POLLFREE, currently only for
epoll. It expects that init_poll_funcptr()'ed hook should do the
necessary cleanup. Perhaps it should be defined as EPOLLFREE in
eventpoll.

__cleanup_sighand() is changed to do wake_up_poll(POLLFREE) if
->signalfd_wqh is not empty, we add the new signalfd_cleanup()
helper.

ep_poll_callback(POLLFREE) simply does list_del_init(task_list).
This make this poll entry inconsistent, but we don't care. If you
share epoll fd which contains our sigfd with another process you
should blame yourself. signalfd is "really special". I simply do
not know how we can define the "right" semantics if it used with
epoll.

The main problem is, epoll calls signalfd_poll() once to establish
the connection with the wait queue, after that signalfd_poll(NULL)
returns the different/inconsistent results depending on who does
EPOLL_CTL_MOD/signalfd_read/etc. IOW: apart from sigmask, signalfd
has nothing to do with the file, it works with the current thread.

In short: this patch is the hack which tries to fix the symptoms.
It also assumes that nobody can take tasklist_lock under epoll
locks, this seems to be true.

Note:

	- we do not have wake_up_all_poll() but wake_up_poll()
	  is fine, poll/epoll doesn't use WQ_FLAG_EXCLUSIVE.

	- signalfd_cleanup() uses POLLHUP along with POLLFREE,
	  we need a couple of simple changes in eventpoll.c to
	  make sure it can't be "lost".

Reported-by: Maxime Bizon <mbizon@freebox.fr>
Cc: <stable@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-24 11:42:50 -08:00

77 lines
1.6 KiB
C

/*
* include/linux/signalfd.h
*
* Copyright (C) 2007 Davide Libenzi <davidel@xmailserver.org>
*
*/
#ifndef _LINUX_SIGNALFD_H
#define _LINUX_SIGNALFD_H
#include <linux/types.h>
/* For O_CLOEXEC and O_NONBLOCK */
#include <linux/fcntl.h>
/* Flags for signalfd4. */
#define SFD_CLOEXEC O_CLOEXEC
#define SFD_NONBLOCK O_NONBLOCK
struct signalfd_siginfo {
__u32 ssi_signo;
__s32 ssi_errno;
__s32 ssi_code;
__u32 ssi_pid;
__u32 ssi_uid;
__s32 ssi_fd;
__u32 ssi_tid;
__u32 ssi_band;
__u32 ssi_overrun;
__u32 ssi_trapno;
__s32 ssi_status;
__s32 ssi_int;
__u64 ssi_ptr;
__u64 ssi_utime;
__u64 ssi_stime;
__u64 ssi_addr;
__u16 ssi_addr_lsb;
/*
* Pad strcture to 128 bytes. Remember to update the
* pad size when you add new members. We use a fixed
* size structure to avoid compatibility problems with
* future versions, and we leave extra space for additional
* members. We use fixed size members because this strcture
* comes out of a read(2) and we really don't want to have
* a compat on read(2).
*/
__u8 __pad[46];
};
#ifdef __KERNEL__
#ifdef CONFIG_SIGNALFD
/*
* Deliver the signal to listening signalfd.
*/
static inline void signalfd_notify(struct task_struct *tsk, int sig)
{
if (unlikely(waitqueue_active(&tsk->sighand->signalfd_wqh)))
wake_up(&tsk->sighand->signalfd_wqh);
}
extern void signalfd_cleanup(struct sighand_struct *sighand);
#else /* CONFIG_SIGNALFD */
static inline void signalfd_notify(struct task_struct *tsk, int sig) { }
static inline void signalfd_cleanup(struct sighand_struct *sighand) { }
#endif /* CONFIG_SIGNALFD */
#endif /* __KERNEL__ */
#endif /* _LINUX_SIGNALFD_H */