2005-04-16 15:20:36 -07:00
/*
 * linux/fs/compat.c
 *
 * Kernel compatibility routines for e.g. 32 bit syscall support
 * on 64 bit kernels.
 *
 * Copyright (C) 2002 Stephen Rothwell, IBM Corporation
 * Copyright (C) 1997-2000 Jakub Jelinek (jakub@redhat.com)
 * Copyright (C) 1998 Eddie C. Dost (ecd@skynet.be)
 * Copyright (C) 2001,2002 Andi Kleen, SuSE Labs
2010-07-18 14:27:13 +02:00
 * Copyright (C) 2003 Pavel Machek (pavel@ucw.cz)
2005-04-16 15:20:36 -07:00
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License version 2 as
 * published by the Free Software Foundation.
 */

2010-08-09 17:20:22 -07:00
#include <linux/stddef.h>
2007-05-08 00:29:02 -07:00
#include <linux/kernel.h>
2005-04-16 15:20:36 -07:00
#include <linux/linkage.h>
#include <linux/compat.h>
#include <linux/errno.h>
#include <linux/time.h>
#include <linux/fs.h>
#include <linux/fcntl.h>
#include <linux/namei.h>
#include <linux/file.h>
2008-04-24 07:44:08 -04:00
#include <linux/fdtable.h>
2005-04-16 15:20:36 -07:00
#include <linux/vfs.h>
#include <linux/ioctl.h>
#include <linux/init.h>
#include <linux/ncp_mount.h>
2005-04-18 10:54:51 -07:00
#include <linux/nfs4_mount.h>
2005-04-16 15:20:36 -07:00
#include <linux/syscalls.h>
#include <linux/ctype.h>
#include <linux/module.h>
#include <linux/dirent.h>
[PATCH] inotify
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:
* dnotify requires the opening of one fd per each directory
that you intend to watch. This quickly results in too many
open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to
directories. Sure, a change to a file in a directory affects
the directory, but you are then forced to keep a cache of
stat structures.
* dnotify's interface to user-space is awful. Signals?
inotify provides a more usable, simple, powerful solution to file change
notification:
* inotify's interface is a system call that returns a fd, not SIGIO.
You get a single fd, which is select()-able.
* inotify has an event that says "the filesystem that the item
you were watching is on was unmounted."
* inotify can watch directories or files.
Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.
See Documentation/filesystems/inotify.txt.
Signed-off-by: Robert Love <rml@novell.com>
Cc: John McCutchan <ttb@tentacle.dhs.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-07-12 17:06:03 -04:00
#include <linux/fsnotify.h>
2005-04-16 15:20:36 -07:00
#include <linux/highuid.h>
#include <linux/personality.h>
#include <linux/rwsem.h>
2006-09-30 23:28:59 -07:00
#include <linux/tsacct_kern.h>
2007-05-08 00:29:21 -07:00
#include <linux/security.h>
2006-10-19 16:08:53 -04:00
#include <linux/highmem.h>
2007-05-10 22:23:15 -07:00
#include <linux/signal.h>
2006-10-19 17:23:57 -04:00
#include <linux/poll.h>
2005-09-14 21:40:00 -07:00
#include <linux/mm.h>
2007-03-07 20:41:21 -08:00
#include <linux/eventpoll.h>
2009-03-30 07:20:30 -04:00
#include <linux/fs_struct.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 17:04:11 +09:00
#include <linux/slab.h>
2010-10-30 08:19:35 -07:00
#include <linux/pagemap.h>
2005-04-16 15:20:36 -07:00

#include <asm/uaccess.h>
#include <asm/mmu_context.h>
#include <asm/ioctls.h>
2006-09-30 20:52:18 +02:00
#include "internal.h"
[PATCH] Add pselect/ppoll system call implementation
The following implementation of ppoll() and pselect() system calls
depends on the architecture providing a TIF_RESTORE_SIGMASK flag in the
thread_info.
These system calls have to change the signal mask during their
operation, and signal handlers must be invoked using the new, temporary
signal mask. The old signal mask must be restored either upon successful
exit from the system call, or upon returning from the invoked signal
handler if the system call is interrupted. We can't simply restore the
original signal mask and return to userspace, since the restored signal
mask may actually block the signal which interrupted the system call.
The TIF_RESTORE_SIGMASK flag deals with this by causing the syscall exit
path to trap into do_signal() just as TIF_SIGPENDING does, and by
causing do_signal() to use the saved signal mask instead of the current
signal mask when setting up the stack frame for the signal handler -- or
by causing do_signal() to simply restore the saved signal mask in the
case where there is no handler to be invoked.
The first patch implements the sys_pselect() and sys_ppoll() system
calls, which are present only if TIF_RESTORE_SIGMASK is defined. That
#ifdef should go away in time when all architectures have implemented
it. The second patch implements TIF_RESTORE_SIGMASK for the PowerPC
kernel (in the -mm tree), and the third patch then removes the
arch-specific implementations of sys_rt_sigsuspend() and replaces them
with generic versions using the same trick.
The fourth and fifth patches, provided by David Howells, implement
TIF_RESTORE_SIGMASK for FR-V and i386 respectively, and the sixth patch
adds the syscalls to the i386 syscall table.
This patch:
Add the pselect() and ppoll() system calls, providing core routines usable by
the original select() and poll() system calls and also the new calls (with
their semantics w.r.t timeouts).
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-18 17:44:05 -08:00

2006-06-26 13:56:52 +02:00
int compat_log = 1;

int compat_printk(const char *fmt, ...)
{
	va_list ap;
	int ret;
	if (!compat_log)
		return 0;
	va_start(ap, fmt);
	ret = vprintk(fmt, ap);
	va_end(ap);
	return ret;
}

2006-09-30 23:28:47 -07:00
#include "read_write.h"

2005-04-16 15:20:36 -07:00
/*
 * Not all architectures have sys_utime, so implement this in terms
 * of sys_utimes.
 */
2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_utime(const char __user *filename,
				 struct compat_utimbuf __user *t)
2005-04-16 15:20:36 -07:00
{
utimensat implementation
Implement utimensat(2), which is an extension to futimesat(2) in that it
a) supports nanosecond resolution for the timestamps
b) allows selectively ignoring the atime/mtime value
c) allows selectively using the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
of the BSD lutimes(3) functions
For this change the internally used do_utimes() function was changed to
accept a timespec time value and an additional flags parameter.
Additionally the sys_utime function was changed to match compat_sys_utime,
which already uses do_utimes instead of duplicating the work.
Also, the completely missing futimensat() functionality is added. We have
such a function in glibc but we have to resort to using /proc/self/fd/*, which
not everybody likes (chroot etc).
Test application (the syscall number will need per-arch editing):
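#define _GNU_SOURCE
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <stddef.h>
#include <syscall.h>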
#define __NR_utimensat 280
#define UTIME_NOW ((1l << 30) - 1l)
#define UTIME_OMIT ((1l << 30) - 2l)
int
main(void)
{
int status = 0;
int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
if (fd == -1)
error (1, errno, "failed to create test file \"ttt\"");
struct stat64 st1;
if (fstat64 (fd, &st1) != 0)
error (1, errno, "fstat failed");
struct timespec t[2];
t[0].tv_sec = 0;
t[0].tv_nsec = 0;
t[1].tv_sec = 0;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
struct stat64 st2;
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
{
puts ("atim not reset to zero");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim not reset to zero");
status = 1;
}
if (status != 0)
goto out;
t[0] = st1.st_atim;
t[1].tv_sec = 0;
t[1].tv_nsec = UTIME_OMIT;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
|| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
{
puts ("atim not set");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim changed from zero");
status = 1;
}
if (status != 0)
goto out;
t[0].tv_sec = 0;
t[0].tv_nsec = UTIME_OMIT;
t[1] = st1.st_mtim;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
|| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
{
puts ("atim changed from original time");
status = 1;
}
if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
|| st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
{
puts ("mtim not set");
status = 1;
}
if (status != 0)
goto out;
sleep (2);
t[0].tv_sec = 0;
t[0].tv_nsec = UTIME_NOW;
t[1].tv_sec = 0;
t[1].tv_nsec = UTIME_NOW;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
struct timeval tv;
gettimeofday(&tv,NULL);
if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
|| st2.st_atim.tv_sec > tv.tv_sec)
{
puts ("atim not set to NOW");
status = 1;
}
if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
|| st2.st_mtim.tv_sec > tv.tv_sec)
{
puts ("mtim not set to NOW");
status = 1;
}
if (symlink ("ttt", "tttsym") != 0)
error (1, errno, "cannot create symlink");
t[0].tv_sec = 0;
t[0].tv_nsec = 0;
t[1].tv_sec = 0;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
error (1, errno, "utimensat failed");
if (lstat64 ("tttsym", &st2) != 0)
error (1, errno, "lstat failed");
if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
{
puts ("symlink atim not reset to zero");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("symlink mtim not reset to zero");
status = 1;
}
if (status != 0)
goto out;
t[0].tv_sec = 1;
t[0].tv_nsec = 0;
t[1].tv_sec = 1;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
{
puts ("atim not reset to one");
status = 1;
}
if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim not reset to one");
status = 1;
}
if (status == 0)
puts ("all OK");
out:
close (fd);
unlink ("ttt");
unlink ("tttsym");
return status;
}
[akpm@linux-foundation.org: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 00:33:25 -07:00
	struct timespec tv[2];
2005-04-16 15:20:36 -07:00

	if (t) {
		if (get_user(tv[0].tv_sec, &t->actime) ||
		    get_user(tv[1].tv_sec, &t->modtime))
			return -EFAULT;
2007-05-08 00:33:25 -07:00
		tv[0].tv_nsec = 0;
		tv[1].tv_nsec = 0;
2005-04-16 15:20:36 -07:00
	}
2007-05-08 00:33:25 -07:00
	return do_utimes(AT_FDCWD, filename, t ? tv : NULL, 0);
}

2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_utimensat(unsigned int dfd, const char __user *filename, struct compat_timespec __user *t, int flags)
2007-05-08 00:33:25 -07:00
{
	struct timespec tv[2];

	if (t) {
		if (get_compat_timespec(&tv[0], &t[0]) ||
		    get_compat_timespec(&tv[1], &t[1]))
			return -EFAULT;

		if (tv[0].tv_nsec == UTIME_OMIT && tv[1].tv_nsec == UTIME_OMIT)
			return 0;
	}
	return do_utimes(dfd, filename, t ? tv : NULL, flags);
2005-04-16 15:20:36 -07:00
}

2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_futimesat(unsigned int dfd, const char __user *filename, struct compat_timeval __user *t)
2005-04-16 15:20:36 -07:00
{
utimensat implementation
Implement utimensat(2), which is an extension to futimesat(2) in that it
a) supports nanosecond resolution for the timestamps
b) allows selectively ignoring the atime/mtime value
c) allows selectively using the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
of the BSD lutimes(3) functions
For this change, the internally used do_utimes() function was changed to
accept a timespec time value and an additional flags parameter.
Additionally, the sys_utime function was changed to match compat_sys_utime,
which already uses do_utimes instead of duplicating the work.
Also, the completely missing futimensat() functionality is added. We have
such a function in glibc, but we have to resort to using /proc/self/fd/*, which
not everybody likes (chroot etc).
Test application (the syscall number will need per-arch editing):
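#define _GNU_SOURCE
#include <errno.h>
#include <error.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <stddef.h>
#include <syscall.h>
#include <unistd.h>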
#define __NR_utimensat 280
#define UTIME_NOW ((1l << 30) - 1l)
#define UTIME_OMIT ((1l << 30) - 2l)
int
main(void)
{
int status = 0;
int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
if (fd == -1)
error (1, errno, "failed to create test file \"ttt\"");
struct stat64 st1;
if (fstat64 (fd, &st1) != 0)
error (1, errno, "fstat failed");
struct timespec t[2];
t[0].tv_sec = 0;
t[0].tv_nsec = 0;
t[1].tv_sec = 0;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
struct stat64 st2;
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
{
puts ("atim not reset to zero");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim not reset to zero");
status = 1;
}
if (status != 0)
goto out;
t[0] = st1.st_atim;
t[1].tv_sec = 0;
t[1].tv_nsec = UTIME_OMIT;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
|| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
{
puts ("atim not set");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim changed from zero");
status = 1;
}
if (status != 0)
goto out;
t[0].tv_sec = 0;
t[0].tv_nsec = UTIME_OMIT;
t[1] = st1.st_mtim;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
|| st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
{
puts ("atim changed from original time");
status = 1;
}
if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
|| st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
{
puts ("mtim not set");
status = 1;
}
if (status != 0)
goto out;
sleep (2);
t[0].tv_sec = 0;
t[0].tv_nsec = UTIME_NOW;
t[1].tv_sec = 0;
t[1].tv_nsec = UTIME_NOW;
if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
struct timeval tv;
gettimeofday(&tv,NULL);
if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
|| st2.st_atim.tv_sec > tv.tv_sec)
{
puts ("atim not set to NOW");
status = 1;
}
if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
|| st2.st_mtim.tv_sec > tv.tv_sec)
{
puts ("mtim not set to NOW");
status = 1;
}
if (symlink ("ttt", "tttsym") != 0)
error (1, errno, "cannot create symlink");
t[0].tv_sec = 0;
t[0].tv_nsec = 0;
t[1].tv_sec = 0;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
error (1, errno, "utimensat failed");
if (lstat64 ("tttsym", &st2) != 0)
error (1, errno, "lstat failed");
if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
{
puts ("symlink atim not reset to zero");
status = 1;
}
if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
{
puts ("symlink mtim not reset to zero");
status = 1;
}
if (status != 0)
goto out;
t[0].tv_sec = 1;
t[0].tv_nsec = 0;
t[1].tv_sec = 1;
t[1].tv_nsec = 0;
if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
error (1, errno, "utimensat failed");
if (fstat64 (fd, &st2) != 0)
error (1, errno, "fstat failed");
if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
{
puts ("atim not reset to one");
status = 1;
}
if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
{
puts ("mtim not reset to one");
status = 1;
}
if (status == 0)
puts ("all OK");
out:
close (fd);
unlink ("ttt");
unlink ("tttsym");
return status;
}
[akpm@linux-foundation.org: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Cc: Alexey Dobriyan <adobriyan@openvz.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-08 00:33:25 -07:00
	struct timespec tv[2];
2005-04-16 15:20:36 -07:00

2006-02-02 16:11:51 +11:00
	if (t) {
2005-04-16 15:20:36 -07:00
		if (get_user(tv[0].tv_sec, &t[0].tv_sec) ||
2007-05-08 00:33:25 -07:00
		    get_user(tv[0].tv_nsec, &t[0].tv_usec) ||
2005-04-16 15:20:36 -07:00
		    get_user(tv[1].tv_sec, &t[1].tv_sec) ||
2007-05-08 00:33:25 -07:00
		    get_user(tv[1].tv_nsec, &t[1].tv_usec))
2006-02-02 16:11:51 +11:00
			return -EFAULT;
2007-05-08 00:33:25 -07:00
		if (tv[0].tv_nsec >= 1000000 || tv[0].tv_nsec < 0 ||
		    tv[1].tv_nsec >= 1000000 || tv[1].tv_nsec < 0)
			return -EINVAL;
		tv[0].tv_nsec *= 1000;
		tv[1].tv_nsec *= 1000;
2006-02-02 16:11:51 +11:00
	}
2007-05-08 00:33:25 -07:00
	return do_utimes(dfd, filename, t ? tv : NULL, 0);
2006-01-18 17:43:53 -08:00
}

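compat_sys_futimesat() reads microsecond values into the `tv_nsec` fields, rejects anything outside [0, 1000000), and then scales by 1000 to get nanoseconds. The following user-space sketch isolates that validate-then-scale step. The function name `usec_to_timespec_checked` is hypothetical, and plain `long` parameters stand in for the compat types, so this is only an illustration of the arithmetic, not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: convert a (seconds, microseconds) pair into a
 * struct timespec. Returns 0 on success, or -EINVAL when usec is
 * outside [0, 1000000), mirroring the range check in the compat code.
 */
static int usec_to_timespec_checked(long sec, long usec, struct timespec *out)
{
	assert(out != NULL);
	if (usec < 0 || usec >= 1000000)
		return -EINVAL;
	out->tv_sec = sec;
	out->tv_nsec = usec * 1000;	/* microseconds to nanoseconds */
	return 0;
}
```

Checking the range before multiplying keeps the result below one second's worth of nanoseconds, so the scaled value cannot overflow a `long`.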
2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_utimes(const char __user *filename, struct compat_timeval __user *t)
2006-01-18 17:43:53 -08:00
{
	return compat_sys_futimesat(AT_FDCWD, filename, t);
2005-04-16 15:20:36 -07:00
}

2008-10-15 22:02:05 -07:00
static int cp_compat_stat(struct kstat *stat, struct compat_stat __user *ubuf)
{
	compat_ino_t ino = stat->ino;
	typeof(ubuf->st_uid) uid = 0;
	typeof(ubuf->st_gid) gid = 0;
	int err;

	SET_UID(uid, stat->uid);
	SET_GID(gid, stat->gid);

	if ((u64) stat->size > MAX_NON_LFS ||
	    !old_valid_dev(stat->dev) ||
	    !old_valid_dev(stat->rdev))
		return -EOVERFLOW;
	if (sizeof(ino) < sizeof(stat->ino) && ino != stat->ino)
		return -EOVERFLOW;

	if (clear_user(ubuf, sizeof(*ubuf)))
		return -EFAULT;

	err = __put_user(old_encode_dev(stat->dev), &ubuf->st_dev);
	err |= __put_user(ino, &ubuf->st_ino);
	err |= __put_user(stat->mode, &ubuf->st_mode);
	err |= __put_user(stat->nlink, &ubuf->st_nlink);
	err |= __put_user(uid, &ubuf->st_uid);
	err |= __put_user(gid, &ubuf->st_gid);
	err |= __put_user(old_encode_dev(stat->rdev), &ubuf->st_rdev);
	err |= __put_user(stat->size, &ubuf->st_size);
	err |= __put_user(stat->atime.tv_sec, &ubuf->st_atime);
	err |= __put_user(stat->atime.tv_nsec, &ubuf->st_atime_nsec);
	err |= __put_user(stat->mtime.tv_sec, &ubuf->st_mtime);
	err |= __put_user(stat->mtime.tv_nsec, &ubuf->st_mtime_nsec);
	err |= __put_user(stat->ctime.tv_sec, &ubuf->st_ctime);
	err |= __put_user(stat->ctime.tv_nsec, &ubuf->st_ctime_nsec);
	err |= __put_user(stat->blksize, &ubuf->st_blksize);
	err |= __put_user(stat->blocks, &ubuf->st_blocks);
	return err;
}

2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_newstat(const char __user * filename,
2005-04-16 15:20:36 -07:00
		struct compat_stat __user *statbuf)
{
	struct kstat stat;
2009-04-08 16:34:03 -04:00
	int error;
2005-04-16 15:20:36 -07:00

2009-04-08 16:34:03 -04:00
	error = vfs_stat(filename, &stat);
	if (error)
		return error;
	return cp_compat_stat(&stat, statbuf);
2005-04-16 15:20:36 -07:00
}

2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_newlstat(const char __user * filename,
2005-04-16 15:20:36 -07:00
		struct compat_stat __user *statbuf)
{
	struct kstat stat;
2009-04-08 16:34:03 -04:00
	int error;
2005-04-16 15:20:36 -07:00

2009-04-08 16:34:03 -04:00
	error = vfs_lstat(filename, &stat);
	if (error)
		return error;
	return cp_compat_stat(&stat, statbuf);
2005-04-16 15:20:36 -07:00
}

2006-03-24 03:18:20 -08:00
#ifndef __ARCH_WANT_STAT64
2010-08-11 11:26:22 +01:00
asmlinkage long compat_sys_newfstatat(unsigned int dfd,
		const char __user *filename,
2006-01-18 17:43:53 -08:00
		struct compat_stat __user *statbuf, int flag)
{
	struct kstat stat;
2009-04-08 20:05:42 +04:00
	int error;
2006-01-18 17:43:53 -08:00

2009-04-08 20:05:42 +04:00
	error = vfs_fstatat(dfd, filename, &stat, flag);
	if (error)
		return error;
	return cp_compat_stat(&stat, statbuf);
2006-01-18 17:43:53 -08:00
}
2006-03-24 03:18:20 -08:00
#endif
2006-01-18 17:43:53 -08:00

2005-04-16 15:20:36 -07:00
asmlinkage long compat_sys_newfstat(unsigned int fd,
		struct compat_stat __user * statbuf)
{
	struct kstat stat;
	int error = vfs_fstat(fd, &stat);

	if (!error)
		error = cp_compat_stat(&stat, statbuf);
	return error;
}

static int put_compat_statfs(struct compat_statfs __user *ubuf, struct kstatfs *kbuf)
{

	if (sizeof ubuf->f_blocks == 4) {
2008-07-23 21:27:55 -07:00
		if ((kbuf->f_blocks | kbuf->f_bfree | kbuf->f_bavail |
		     kbuf->f_bsize | kbuf->f_frsize) & 0xffffffff00000000ULL)
2005-04-16 15:20:36 -07:00
			return -EOVERFLOW;
		/* f_files and f_ffree may be -1; it's okay
		 * to stuff that into 32 bits */
		if (kbuf->f_files != 0xffffffffffffffffULL
		 && (kbuf->f_files & 0xffffffff00000000ULL))
			return -EOVERFLOW;
		if (kbuf->f_ffree != 0xffffffffffffffffULL
		 && (kbuf->f_ffree & 0xffffffff00000000ULL))
			return -EOVERFLOW;
	}
	if (!access_ok(VERIFY_WRITE, ubuf, sizeof(*ubuf)) ||
	    __put_user(kbuf->f_type, &ubuf->f_type) ||
	    __put_user(kbuf->f_bsize, &ubuf->f_bsize) ||
	    __put_user(kbuf->f_blocks, &ubuf->f_blocks) ||
	    __put_user(kbuf->f_bfree, &ubuf->f_bfree) ||
	    __put_user(kbuf->f_bavail, &ubuf->f_bavail) ||
	    __put_user(kbuf->f_files, &ubuf->f_files) ||
	    __put_user(kbuf->f_ffree, &ubuf->f_ffree) ||
	    __put_user(kbuf->f_namelen, &ubuf->f_namelen) ||
	    __put_user(kbuf->f_fsid.val[0], &ubuf->f_fsid.val[0]) ||
	    __put_user(kbuf->f_fsid.val[1], &ubuf->f_fsid.val[1]) ||
	    __put_user(kbuf->f_frsize, &ubuf->f_frsize) ||
2011-10-17 13:40:02 -07:00
	    __put_user(kbuf->f_flags, &ubuf->f_flags) ||
	    __clear_user(ubuf->f_spare, sizeof(ubuf->f_spare)))
2005-04-16 15:20:36 -07:00
		return -EFAULT;
	return 0;
}

/*
2010-12-27 01:41:53 +09:00
 * The following statfs calls are copies of code from fs/statfs.c and
2005-04-16 15:20:36 -07:00
 * should be checked against those from time to time
 */
2008-07-22 09:59:21 -04:00
asmlinkage long compat_sys_statfs(const char __user *pathname, struct compat_statfs __user *buf)
2005-04-16 15:20:36 -07:00
{
2011-03-12 10:41:39 -05:00
	struct kstatfs tmp;
	int error = user_statfs(pathname, &tmp);
	if (!error)
		error = put_compat_statfs(buf, &tmp);
2005-04-16 15:20:36 -07:00
	return error;
}

asmlinkage long compat_sys_fstatfs(unsigned int fd, struct compat_statfs __user *buf)
{
	struct kstatfs tmp;
2011-03-12 10:41:39 -05:00
	int error = fd_statfs(fd, &tmp);
2005-11-21 21:32:23 -08:00
	if (!error)
		error = put_compat_statfs(buf, &tmp);
2005-04-16 15:20:36 -07:00
	return error;
}

|
|
|
static int put_compat_statfs64(struct compat_statfs64 __user *ubuf, struct kstatfs *kbuf)
|
|
|
|
{
|
|
|
|
if (sizeof ubuf->f_blocks == 4) {
|
2008-07-23 21:27:55 -07:00
|
|
|
if ((kbuf->f_blocks | kbuf->f_bfree | kbuf->f_bavail |
|
|
|
|
kbuf->f_bsize | kbuf->f_frsize) & 0xffffffff00000000ULL)
|
2005-04-16 15:20:36 -07:00
|
|
|
return -EOVERFLOW;
|
|
|
|
/* f_files and f_ffree may be -1; it's okay
|
|
|
|
* to stuff that into 32 bits */
|
|
|
|
if (kbuf->f_files != 0xffffffffffffffffULL
|
|
|
|
&& (kbuf->f_files & 0xffffffff00000000ULL))
|
|
|
|
return -EOVERFLOW;
|
|
|
|
if (kbuf->f_ffree != 0xffffffffffffffffULL
|
|
|
|
&& (kbuf->f_ffree & 0xffffffff00000000ULL))
|
|
|
|
return -EOVERFLOW;
|
|
|
|
}
|
|
|
|
if (!access_ok(VERIFY_WRITE, ubuf, sizeof(*ubuf)) ||
|
|
|
|
__put_user(kbuf->f_type, &ubuf->f_type) ||
|
|
|
|
__put_user(kbuf->f_bsize, &ubuf->f_bsize) ||
|
|
|
|
__put_user(kbuf->f_blocks, &ubuf->f_blocks) ||
|
|
|
|
__put_user(kbuf->f_bfree, &ubuf->f_bfree) ||
|
|
|
|
__put_user(kbuf->f_bavail, &ubuf->f_bavail) ||
|
|
|
|
__put_user(kbuf->f_files, &ubuf->f_files) ||
|
|
|
|
__put_user(kbuf->f_ffree, &ubuf->f_ffree) ||
|
|
|
|
__put_user(kbuf->f_namelen, &ubuf->f_namelen) ||
|
|
|
|
__put_user(kbuf->f_fsid.val[0], &ubuf->f_fsid.val[0]) ||
|
|
|
|
__put_user(kbuf->f_fsid.val[1], &ubuf->f_fsid.val[1]) ||
|
2010-12-27 01:41:54 +09:00
|
|
|
__put_user(kbuf->f_frsize, &ubuf->f_frsize) ||
|
|
|
|
__put_user(kbuf->f_flags, &ubuf->f_flags) ||
|
|
|
|
__clear_user(ubuf->f_spare, sizeof(ubuf->f_spare)))
|
2005-04-16 15:20:36 -07:00
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
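/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of the range check performed above when the compat structure has
 * 32-bit fields. Block counts and sizes must fit in 32 bits, while the
 * inode counts may also be all-ones (-1, meaning "unknown"). The names
 * high_bits_set() and check_statfs_fits32() are hypothetical.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if any of the upper 32 bits of v are set. */
static int high_bits_set(uint64_t v)
{
	return (v & 0xffffffff00000000ULL) != 0;
}

/*
 * Hypothetical model of the check: returns 0 if every value can be stored
 * in a 32-bit field, -EOVERFLOW otherwise. An all-ones inode count is
 * accepted because truncating it to 32 bits still reads as -1.
 */
static int check_statfs_fits32(uint64_t blocks, uint64_t bfree,
			       uint64_t bavail, uint64_t bsize,
			       uint64_t frsize, uint64_t files,
			       uint64_t ffree)
{
	if (high_bits_set(blocks | bfree | bavail | bsize | frsize))
		return -EOVERFLOW;
	if (files != UINT64_MAX && high_bits_set(files))
		return -EOVERFLOW;
	if (ffree != UINT64_MAX && high_bits_set(ffree))
		return -EOVERFLOW;
	return 0;
}
```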
|
|
|
|
|
2008-07-22 09:59:21 -04:00
|
|
|
asmlinkage long compat_sys_statfs64(const char __user *pathname, compat_size_t sz, struct compat_statfs64 __user *buf)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
2011-03-12 10:41:39 -05:00
|
|
|
struct kstatfs tmp;
|
2005-04-16 15:20:36 -07:00
|
|
|
int error;
|
|
|
|
|
|
|
|
if (sz != sizeof(*buf))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2011-03-12 10:41:39 -05:00
|
|
|
error = user_statfs(pathname, &tmp);
|
|
|
|
if (!error)
|
|
|
|
error = put_compat_statfs64(buf, &tmp);
|
2005-04-16 15:20:36 -07:00
|
|
|
return error;
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_fstatfs64(unsigned int fd, compat_size_t sz, struct compat_statfs64 __user *buf)
|
|
|
|
{
|
|
|
|
struct kstatfs tmp;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
if (sz != sizeof(*buf))
|
|
|
|
return -EINVAL;
|
|
|
|
|
2011-03-12 10:41:39 -05:00
|
|
|
error = fd_statfs(fd, &tmp);
|
2005-11-21 21:32:23 -08:00
|
|
|
if (!error)
|
|
|
|
error = put_compat_statfs64(buf, &tmp);
|
2005-04-16 15:20:36 -07:00
|
|
|
return error;
|
|
|
|
}
|
|
|
|
|
2008-11-28 10:09:09 +01:00
|
|
|
/*
|
|
|
|
* This is a copy of sys_ustat, just dealing with the compat structure layout.
|
|
|
|
* Given how simple this syscall is, that approach is more maintainable
|
|
|
|
* than the various conversion hacks.
|
|
|
|
*/
|
|
|
|
asmlinkage long compat_sys_ustat(unsigned dev, struct compat_ustat __user *u)
|
|
|
|
{
|
|
|
|
struct super_block *sb;
|
|
|
|
struct compat_ustat tmp;
|
|
|
|
struct kstatfs sbuf;
|
|
|
|
int err;
|
|
|
|
|
|
|
|
sb = user_get_super(new_decode_dev(dev));
|
|
|
|
if (!sb)
|
|
|
|
return -EINVAL;
|
2010-07-07 18:53:11 +02:00
|
|
|
err = statfs_by_dentry(sb->s_root, &sbuf);
|
2008-11-28 10:09:09 +01:00
|
|
|
drop_super(sb);
|
|
|
|
if (err)
|
|
|
|
return err;
|
|
|
|
|
|
|
|
memset(&tmp, 0, sizeof(struct compat_ustat));
|
|
|
|
tmp.f_tfree = sbuf.f_bfree;
|
|
|
|
tmp.f_tinode = sbuf.f_ffree;
|
|
|
|
if (copy_to_user(u, &tmp, sizeof(struct compat_ustat)))
|
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
static int get_compat_flock(struct flock *kfl, struct compat_flock __user *ufl)
|
|
|
|
{
|
|
|
|
if (!access_ok(VERIFY_READ, ufl, sizeof(*ufl)) ||
|
|
|
|
__get_user(kfl->l_type, &ufl->l_type) ||
|
|
|
|
__get_user(kfl->l_whence, &ufl->l_whence) ||
|
|
|
|
__get_user(kfl->l_start, &ufl->l_start) ||
|
|
|
|
__get_user(kfl->l_len, &ufl->l_len) ||
|
|
|
|
__get_user(kfl->l_pid, &ufl->l_pid))
|
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int put_compat_flock(struct flock *kfl, struct compat_flock __user *ufl)
|
|
|
|
{
|
|
|
|
if (!access_ok(VERIFY_WRITE, ufl, sizeof(*ufl)) ||
|
|
|
|
__put_user(kfl->l_type, &ufl->l_type) ||
|
|
|
|
__put_user(kfl->l_whence, &ufl->l_whence) ||
|
|
|
|
__put_user(kfl->l_start, &ufl->l_start) ||
|
|
|
|
__put_user(kfl->l_len, &ufl->l_len) ||
|
|
|
|
__put_user(kfl->l_pid, &ufl->l_pid))
|
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#ifndef HAVE_ARCH_GET_COMPAT_FLOCK64
|
|
|
|
static int get_compat_flock64(struct flock *kfl, struct compat_flock64 __user *ufl)
|
|
|
|
{
|
|
|
|
if (!access_ok(VERIFY_READ, ufl, sizeof(*ufl)) ||
|
|
|
|
__get_user(kfl->l_type, &ufl->l_type) ||
|
|
|
|
__get_user(kfl->l_whence, &ufl->l_whence) ||
|
|
|
|
__get_user(kfl->l_start, &ufl->l_start) ||
|
|
|
|
__get_user(kfl->l_len, &ufl->l_len) ||
|
|
|
|
__get_user(kfl->l_pid, &ufl->l_pid))
|
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#ifndef HAVE_ARCH_PUT_COMPAT_FLOCK64
|
|
|
|
static int put_compat_flock64(struct flock *kfl, struct compat_flock64 __user *ufl)
|
|
|
|
{
|
|
|
|
if (!access_ok(VERIFY_WRITE, ufl, sizeof(*ufl)) ||
|
|
|
|
__put_user(kfl->l_type, &ufl->l_type) ||
|
|
|
|
__put_user(kfl->l_whence, &ufl->l_whence) ||
|
|
|
|
__put_user(kfl->l_start, &ufl->l_start) ||
|
|
|
|
__put_user(kfl->l_len, &ufl->l_len) ||
|
|
|
|
__put_user(kfl->l_pid, &ufl->l_pid))
|
|
|
|
return -EFAULT;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_fcntl64(unsigned int fd, unsigned int cmd,
|
|
|
|
unsigned long arg)
|
|
|
|
{
|
|
|
|
mm_segment_t old_fs;
|
|
|
|
struct flock f;
|
|
|
|
long ret;
|
|
|
|
|
|
|
|
switch (cmd) {
|
|
|
|
case F_GETLK:
|
|
|
|
case F_SETLK:
|
|
|
|
case F_SETLKW:
|
|
|
|
ret = get_compat_flock(&f, compat_ptr(arg));
|
|
|
|
if (ret != 0)
|
|
|
|
break;
|
|
|
|
old_fs = get_fs();
|
|
|
|
set_fs(KERNEL_DS);
|
|
|
|
ret = sys_fcntl(fd, cmd, (unsigned long)&f);
|
|
|
|
set_fs(old_fs);
|
|
|
|
if (cmd == F_GETLK && ret == 0) {
|
2009-04-01 14:40:51 +05:30
|
|
|
/* GETLK was successful and we need to return the data...
|
2006-01-08 01:02:40 -08:00
|
|
|
* but it needs to fit in the compat structure.
|
|
|
|
* l_start shouldn't be too big, unless the original
|
|
|
|
* start + end is greater than COMPAT_OFF_T_MAX, in which
|
|
|
|
* case the app was asking for trouble, so we return
|
|
|
|
* -EOVERFLOW in that case.
|
|
|
|
* l_len could be too big, in which case we just truncate it,
|
|
|
|
* and only allow the app to see that part of the conflicting
|
|
|
|
* lock that might make sense to it anyway
|
|
|
|
*/
|
|
|
|
|
|
|
|
if (f.l_start > COMPAT_OFF_T_MAX)
|
2005-04-16 15:20:36 -07:00
|
|
|
ret = -EOVERFLOW;
|
2006-01-08 01:02:40 -08:00
|
|
|
if (f.l_len > COMPAT_OFF_T_MAX)
|
|
|
|
f.l_len = COMPAT_OFF_T_MAX;
|
2005-04-16 15:20:36 -07:00
|
|
|
if (ret == 0)
|
|
|
|
ret = put_compat_flock(&f, compat_ptr(arg));
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case F_GETLK64:
|
|
|
|
case F_SETLK64:
|
|
|
|
case F_SETLKW64:
|
|
|
|
ret = get_compat_flock64(&f, compat_ptr(arg));
|
|
|
|
if (ret != 0)
|
|
|
|
break;
|
|
|
|
old_fs = get_fs();
|
|
|
|
set_fs(KERNEL_DS);
|
|
|
|
ret = sys_fcntl(fd, (cmd == F_GETLK64) ? F_GETLK :
|
|
|
|
((cmd == F_SETLK64) ? F_SETLK : F_SETLKW),
|
|
|
|
(unsigned long)&f);
|
|
|
|
set_fs(old_fs);
|
|
|
|
if (cmd == F_GETLK64 && ret == 0) {
|
2006-01-08 01:02:40 -08:00
|
|
|
/* need to return lock information - see above for commentary */
|
|
|
|
if (f.l_start > COMPAT_LOFF_T_MAX)
|
2005-04-16 15:20:36 -07:00
|
|
|
ret = -EOVERFLOW;
|
2006-01-08 01:02:40 -08:00
|
|
|
if (f.l_len > COMPAT_LOFF_T_MAX)
|
|
|
|
f.l_len = COMPAT_LOFF_T_MAX;
|
2005-04-16 15:20:36 -07:00
|
|
|
if (ret == 0)
|
|
|
|
ret = put_compat_flock64(&f, compat_ptr(arg));
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
default:
|
|
|
|
ret = sys_fcntl(fd, cmd, arg);
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
return ret;
|
|
|
|
}
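/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of the F_GETLK fixup described in the comment above. A start
 * offset that cannot be represented is an error; an oversized length is
 * clamped. It assumes a 32-bit compat off_t; struct model_lock,
 * MODEL_COMPAT_OFF_T_MAX and fixup_getlk_result() are hypothetical names.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: the compat off_t is 32 bits wide, so its maximum is INT32_MAX. */
#define MODEL_COMPAT_OFF_T_MAX ((int64_t)INT32_MAX)

/* Hypothetical stand-in for the two struct flock fields that matter here. */
struct model_lock {
	int64_t start;
	int64_t len;
};

/*
 * Returns -EOVERFLOW if the start offset does not fit, otherwise 0.
 * The length is clamped in place either way, mirroring the order of the
 * two checks in the function above.
 */
static int fixup_getlk_result(struct model_lock *lk)
{
	int ret = 0;

	if (lk->start > MODEL_COMPAT_OFF_T_MAX)
		ret = -EOVERFLOW;
	if (lk->len > MODEL_COMPAT_OFF_T_MAX)
		lk->len = MODEL_COMPAT_OFF_T_MAX;
	return ret;
}
```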
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_fcntl(unsigned int fd, unsigned int cmd,
|
|
|
|
unsigned long arg)
|
|
|
|
{
|
|
|
|
if ((cmd == F_GETLK64) || (cmd == F_SETLK64) || (cmd == F_SETLKW64))
|
|
|
|
return -EINVAL;
|
|
|
|
return compat_sys_fcntl64(fd, cmd, arg);
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage long
|
|
|
|
compat_sys_io_setup(unsigned nr_reqs, u32 __user *ctx32p)
|
|
|
|
{
|
|
|
|
long ret;
|
|
|
|
aio_context_t ctx64;
|
|
|
|
|
|
|
|
mm_segment_t oldfs = get_fs();
|
|
|
|
if (unlikely(get_user(ctx64, ctx32p)))
|
|
|
|
return -EFAULT;
|
|
|
|
|
|
|
|
set_fs(KERNEL_DS);
|
|
|
|
/* The __user pointer cast is valid because of the set_fs() */
|
|
|
|
ret = sys_io_setup(nr_reqs, (aio_context_t __user *) &ctx64);
|
|
|
|
set_fs(oldfs);
|
|
|
|
/* truncating is ok because it's a user address */
|
|
|
|
if (!ret)
|
|
|
|
ret = put_user((u32) ctx64, ctx32p);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage long
|
|
|
|
compat_sys_io_getevents(aio_context_t ctx_id,
|
|
|
|
unsigned long min_nr,
|
|
|
|
unsigned long nr,
|
|
|
|
struct io_event __user *events,
|
|
|
|
struct compat_timespec __user *timeout)
|
|
|
|
{
|
|
|
|
long ret;
|
|
|
|
struct timespec t;
|
|
|
|
struct timespec __user *ut = NULL;
|
|
|
|
|
|
|
|
ret = -EFAULT;
|
|
|
|
if (unlikely(!access_ok(VERIFY_WRITE, events,
|
|
|
|
nr * sizeof(struct io_event))))
|
|
|
|
goto out;
|
|
|
|
if (timeout) {
|
|
|
|
if (get_compat_timespec(&t, timeout))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
ut = compat_alloc_user_space(sizeof(*ut));
|
|
|
|
if (copy_to_user(ut, &t, sizeof(t)))
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
ret = sys_io_getevents(ctx_id, min_nr, nr, events, ut);
|
|
|
|
out:
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2010-05-26 14:44:25 -07:00
|
|
|
/* A write operation does a read from user space and vice versa */
|
|
|
|
#define vrfy_dir(type) ((type) == READ ? VERIFY_WRITE : VERIFY_READ)
|
|
|
|
|
|
|
|
ssize_t compat_rw_copy_check_uvector(int type,
|
|
|
|
const struct compat_iovec __user *uvector, unsigned long nr_segs,
|
|
|
|
unsigned long fast_segs, struct iovec *fast_pointer,
|
2011-10-31 17:06:39 -07:00
|
|
|
struct iovec **ret_pointer, int check_access)
|
2010-05-26 14:44:25 -07:00
|
|
|
{
|
|
|
|
compat_ssize_t tot_len;
|
|
|
|
struct iovec *iov = *ret_pointer = fast_pointer;
|
|
|
|
ssize_t ret = 0;
|
|
|
|
int seg;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* SuS says "The readv() function *may* fail if the iovcnt argument
|
|
|
|
* was less than or equal to 0, or greater than {IOV_MAX}."  Linux has
|
|
|
|
* traditionally returned zero for zero segments, so...
|
|
|
|
*/
|
|
|
|
if (nr_segs == 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
ret = -EINVAL;
|
|
|
|
if (nr_segs > UIO_MAXIOV)	/* nr_segs is unsigned, so it cannot be negative */
|
|
|
|
goto out;
|
|
|
|
if (nr_segs > fast_segs) {
|
|
|
|
ret = -ENOMEM;
|
|
|
|
iov = kmalloc(nr_segs*sizeof(struct iovec), GFP_KERNEL);
|
2010-12-27 01:41:52 +09:00
|
|
|
if (iov == NULL)
|
2010-05-26 14:44:25 -07:00
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
*ret_pointer = iov;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Single UNIX Specification:
|
|
|
|
* We should -EINVAL if an element length is not >= 0 and fitting an
|
2010-10-29 10:36:49 -07:00
|
|
|
* ssize_t.
|
2010-05-26 14:44:25 -07:00
|
|
|
*
|
2010-10-29 10:36:49 -07:00
|
|
|
* In Linux, the total length is limited to MAX_RW_COUNT, so there is
|
|
|
|
* no possibility of overflow.
|
2010-05-26 14:44:25 -07:00
|
|
|
*/
|
|
|
|
tot_len = 0;
|
|
|
|
ret = -EINVAL;
|
|
|
|
for (seg = 0; seg < nr_segs; seg++) {
|
|
|
|
compat_uptr_t buf;
|
|
|
|
compat_ssize_t len;
|
|
|
|
|
|
|
|
if (__get_user(len, &uvector->iov_len) ||
|
|
|
|
__get_user(buf, &uvector->iov_base)) {
|
|
|
|
ret = -EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
if (len < 0) /* size_t not fitting in compat_ssize_t .. */
|
|
|
|
goto out;
|
2011-10-31 17:06:39 -07:00
|
|
|
if (check_access &&
|
|
|
|
!access_ok(vrfy_dir(type), compat_ptr(buf), len)) {
|
2010-05-26 14:44:25 -07:00
|
|
|
ret = -EFAULT;
|
|
|
|
goto out;
|
|
|
|
}
|
2010-10-29 10:36:49 -07:00
|
|
|
if (len > MAX_RW_COUNT - tot_len)
|
|
|
|
len = MAX_RW_COUNT - tot_len;
|
|
|
|
tot_len += len;
|
2010-05-26 14:44:25 -07:00
|
|
|
iov->iov_base = compat_ptr(buf);
|
|
|
|
iov->iov_len = (compat_size_t) len;
|
|
|
|
uvector++;
|
|
|
|
iov++;
|
|
|
|
}
|
|
|
|
ret = tot_len;
|
|
|
|
|
|
|
|
out:
|
|
|
|
return ret;
|
|
|
|
}
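/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of the length handling in the loop above. Negative lengths are
 * rejected, and each segment is shortened so the running total never
 * exceeds the limit. MODEL_MAX_RW_COUNT is a small stand-in value chosen
 * for readability, not the kernel's MAX_RW_COUNT; clamp_segments() is a
 * hypothetical name.
 */

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: an arbitrary small limit standing in for MAX_RW_COUNT. */
#define MODEL_MAX_RW_COUNT 1000

/*
 * Clamp each segment length in place so that the sum of all lengths stays
 * at or below MODEL_MAX_RW_COUNT. Returns the total, or -EINVAL if a
 * negative length is found.
 */
static int32_t clamp_segments(int32_t *lens, size_t nr)
{
	int32_t tot = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (lens[i] < 0)
			return -EINVAL;
		/* Only the room left under the limit may be used. */
		if (lens[i] > MODEL_MAX_RW_COUNT - tot)
			lens[i] = MODEL_MAX_RW_COUNT - tot;
		tot += lens[i];
	}
	return tot;
}
```

/*
 * Because the remaining room is computed before each addition, the total
 * cannot overflow; segments past the limit simply end up with length 0.
 */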
|
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
static inline long
|
|
|
|
copy_iocb(long nr, u32 __user *ptr32, struct iocb __user * __user *ptr64)
|
|
|
|
{
|
|
|
|
compat_uptr_t uptr;
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < nr; ++i) {
|
|
|
|
if (get_user(uptr, ptr32 + i))
|
|
|
|
return -EFAULT;
|
|
|
|
if (put_user(compat_ptr(uptr), ptr64 + i))
|
|
|
|
return -EFAULT;
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
#define MAX_AIO_SUBMITS (PAGE_SIZE/sizeof(struct iocb *))
|
|
|
|
|
|
|
|
asmlinkage long
|
|
|
|
compat_sys_io_submit(aio_context_t ctx_id, int nr, u32 __user *iocb)
|
|
|
|
{
|
|
|
|
struct iocb __user * __user *iocb64;
|
|
|
|
long ret;
|
|
|
|
|
|
|
|
if (unlikely(nr < 0))
|
|
|
|
return -EINVAL;
|
|
|
|
|
|
|
|
if (nr > MAX_AIO_SUBMITS)
|
|
|
|
nr = MAX_AIO_SUBMITS;
|
|
|
|
|
|
|
|
iocb64 = compat_alloc_user_space(nr * sizeof(*iocb64));
|
|
|
|
ret = copy_iocb(nr, iocb, iocb64);
|
|
|
|
if (!ret)
|
2010-05-26 14:44:26 -07:00
|
|
|
ret = do_io_submit(ctx_id, nr, iocb64, 1);
|
2005-04-16 15:20:36 -07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct compat_ncp_mount_data {
|
|
|
|
compat_int_t version;
|
|
|
|
compat_uint_t ncp_fd;
|
2005-09-06 15:16:40 -07:00
|
|
|
__compat_uid_t mounted_uid;
|
2005-04-16 15:20:36 -07:00
|
|
|
compat_pid_t wdog_pid;
|
|
|
|
unsigned char mounted_vol[NCP_VOLNAME_LEN + 1];
|
|
|
|
compat_uint_t time_out;
|
|
|
|
compat_uint_t retry_count;
|
|
|
|
compat_uint_t flags;
|
2005-09-06 15:16:40 -07:00
|
|
|
__compat_uid_t uid;
|
|
|
|
__compat_gid_t gid;
|
2005-04-16 15:20:36 -07:00
|
|
|
compat_mode_t file_mode;
|
|
|
|
compat_mode_t dir_mode;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct compat_ncp_mount_data_v4 {
|
|
|
|
compat_int_t version;
|
|
|
|
compat_ulong_t flags;
|
|
|
|
compat_ulong_t mounted_uid;
|
|
|
|
compat_long_t wdog_pid;
|
|
|
|
compat_uint_t ncp_fd;
|
|
|
|
compat_uint_t time_out;
|
|
|
|
compat_uint_t retry_count;
|
|
|
|
compat_ulong_t uid;
|
|
|
|
compat_ulong_t gid;
|
|
|
|
compat_ulong_t file_mode;
|
|
|
|
compat_ulong_t dir_mode;
|
|
|
|
};
|
|
|
|
|
|
|
|
static void *do_ncp_super_data_conv(void *raw_data)
|
|
|
|
{
|
|
|
|
int version = *(unsigned int *)raw_data;
|
|
|
|
|
|
|
|
if (version == 3) {
|
|
|
|
struct compat_ncp_mount_data *c_n = raw_data;
|
|
|
|
struct ncp_mount_data *n = raw_data;
|
|
|
|
|
|
|
|
n->dir_mode = c_n->dir_mode;
|
|
|
|
n->file_mode = c_n->file_mode;
|
|
|
|
n->gid = c_n->gid;
|
|
|
|
n->uid = c_n->uid;
|
|
|
|
memmove(n->mounted_vol, c_n->mounted_vol,
	sizeof(c_n->mounted_vol) + 3 * sizeof(unsigned int));
|
|
|
|
n->wdog_pid = c_n->wdog_pid;
|
|
|
|
n->mounted_uid = c_n->mounted_uid;
|
|
|
|
} else if (version == 4) {
|
|
|
|
struct compat_ncp_mount_data_v4 *c_n = raw_data;
|
|
|
|
struct ncp_mount_data_v4 *n = raw_data;
|
|
|
|
|
|
|
|
n->dir_mode = c_n->dir_mode;
|
|
|
|
n->file_mode = c_n->file_mode;
|
|
|
|
n->gid = c_n->gid;
|
|
|
|
n->uid = c_n->uid;
|
|
|
|
n->retry_count = c_n->retry_count;
|
|
|
|
n->time_out = c_n->time_out;
|
|
|
|
n->ncp_fd = c_n->ncp_fd;
|
|
|
|
n->wdog_pid = c_n->wdog_pid;
|
|
|
|
n->mounted_uid = c_n->mounted_uid;
|
|
|
|
n->flags = c_n->flags;
|
|
|
|
} else if (version != 5) {
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
return raw_data;
|
|
|
|
}
|
|
|
|
|
|
|
|
|
2005-04-18 10:54:51 -07:00
|
|
|
struct compat_nfs_string {
|
|
|
|
compat_uint_t len;
|
2005-04-27 15:39:03 -07:00
|
|
|
compat_uptr_t data;
|
2005-04-18 10:54:51 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
static inline void compat_nfs_string(struct nfs_string *dst,
|
|
|
|
struct compat_nfs_string *src)
|
|
|
|
{
|
|
|
|
dst->data = compat_ptr(src->data);
|
|
|
|
dst->len = src->len;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct compat_nfs4_mount_data_v1 {
|
|
|
|
compat_int_t version;
|
|
|
|
compat_int_t flags;
|
|
|
|
compat_int_t rsize;
|
|
|
|
compat_int_t wsize;
|
|
|
|
compat_int_t timeo;
|
|
|
|
compat_int_t retrans;
|
|
|
|
compat_int_t acregmin;
|
|
|
|
compat_int_t acregmax;
|
|
|
|
compat_int_t acdirmin;
|
|
|
|
compat_int_t acdirmax;
|
|
|
|
struct compat_nfs_string client_addr;
|
|
|
|
struct compat_nfs_string mnt_path;
|
|
|
|
struct compat_nfs_string hostname;
|
|
|
|
compat_uint_t host_addrlen;
|
2005-04-27 15:39:03 -07:00
|
|
|
compat_uptr_t host_addr;
|
2005-04-18 10:54:51 -07:00
|
|
|
compat_int_t proto;
|
|
|
|
compat_int_t auth_flavourlen;
|
2005-04-27 15:39:03 -07:00
|
|
|
compat_uptr_t auth_flavours;
|
2005-04-18 10:54:51 -07:00
|
|
|
};
|
|
|
|
|
|
|
|
static int do_nfs4_super_data_conv(void *raw_data)
|
|
|
|
{
|
|
|
|
int version = *(compat_uint_t *) raw_data;
|
|
|
|
|
|
|
|
if (version == 1) {
|
|
|
|
struct compat_nfs4_mount_data_v1 *raw = raw_data;
|
|
|
|
struct nfs4_mount_data *real = raw_data;
|
|
|
|
|
|
|
|
/* copy the fields backwards */
|
|
|
|
real->auth_flavours = compat_ptr(raw->auth_flavours);
|
|
|
|
real->auth_flavourlen = raw->auth_flavourlen;
|
|
|
|
real->proto = raw->proto;
|
|
|
|
real->host_addr = compat_ptr(raw->host_addr);
|
|
|
|
real->host_addrlen = raw->host_addrlen;
|
|
|
|
compat_nfs_string(&real->hostname, &raw->hostname);
|
|
|
|
compat_nfs_string(&real->mnt_path, &raw->mnt_path);
|
|
|
|
compat_nfs_string(&real->client_addr, &raw->client_addr);
|
|
|
|
real->acdirmax = raw->acdirmax;
|
|
|
|
real->acdirmin = raw->acdirmin;
|
|
|
|
real->acregmax = raw->acregmax;
|
|
|
|
real->acregmin = raw->acregmin;
|
|
|
|
real->retrans = raw->retrans;
|
|
|
|
real->timeo = raw->timeo;
|
|
|
|
real->wsize = raw->wsize;
|
|
|
|
real->rsize = raw->rsize;
|
|
|
|
real->flags = raw->flags;
|
|
|
|
real->version = raw->version;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
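/*
 * Illustration only, not part of the kernel source: a minimal user-space
 * sketch of why the conversion above copies the fields backwards. The
 * compat and native layouts share one buffer and the native fields are at
 * the same or a higher offset, so converting from the last field to the
 * first reads every narrow value before a wide store can overwrite it.
 * widen_in_place() is a hypothetical name and models the idea with a
 * plain array rather than the real mount-data structures.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Widen n packed 32-bit values into 64-bit values inside the same buffer.
 * The buffer must be at least n * sizeof(uint64_t) bytes long.
 */
static void widen_in_place(void *buf, size_t n)
{
	unsigned char *p = buf;
	size_t i;

	/* Walk from the last element down to the first. */
	for (i = n; i-- > 0; ) {
		uint32_t narrow;
		uint64_t wide;

		/* Read the narrow value before its slot can be clobbered. */
		memcpy(&narrow, p + i * sizeof(narrow), sizeof(narrow));
		wide = narrow;
		memcpy(p + i * sizeof(wide), &wide, sizeof(wide));
	}
}
```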
|
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
#define NCPFS_NAME "ncpfs"
|
2005-04-18 10:54:51 -07:00
|
|
|
#define NFS4_NAME "nfs4"
|
2005-04-16 15:20:36 -07:00
|
|
|
|
2010-08-11 11:26:22 +01:00
|
|
|
asmlinkage long compat_sys_mount(const char __user * dev_name,
|
|
|
|
const char __user * dir_name,
|
|
|
|
const char __user * type, unsigned long flags,
|
|
|
|
const void __user * data)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
fs: fix overflow in sys_mount() for in-kernel calls
sys_mount() reads/copies a whole page for its "type" parameter. When
do_mount_root() passes a kernel address that points to an object which is
smaller than a whole page, copy_mount_options() will happily go past this
memory object, possibly dereferencing "wild" pointers that could be in any
state (hence the kmemcheck warning, which shows that parts of the next
page are not even allocated).
(The likelihood of something going wrong here is pretty low -- first of
all this only applies to kernel calls to sys_mount(), which are mostly
found in the boot code. Secondly, I guess if the page was not mapped,
exact_copy_from_user() _would_ in fact handle it correctly because of its
access_ok(), etc. checks.)
But it is much nicer to avoid the dubious reads altogether, by stopping as
soon as we find a NUL byte. Is there a good reason why we can't do
something like this, using the already existing strndup_from_user()?
[akpm@linux-foundation.org: make copy_mount_string() static]
[AV: fix compat mount breakage, which involves undoing akpm's change above]
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: al <al@dizzy.pdmi.ras.ru>
2009-09-18 13:05:45 -07:00
|
|
|
char *kernel_type;
|
2005-04-16 15:20:36 -07:00
|
|
|
unsigned long data_page;
|
2009-09-18 13:05:45 -07:00
|
|
|
char *kernel_dev;
|
2005-04-16 15:20:36 -07:00
|
|
|
char *dir_page;
|
|
|
|
int retval;
|
|
|
|
|
2009-09-18 13:05:45 -07:00
|
|
|
retval = copy_mount_string(type, &kernel_type);
|
2005-04-16 15:20:36 -07:00
|
|
|
if (retval < 0)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
dir_page = getname(dir_name);
|
|
|
|
retval = PTR_ERR(dir_page);
|
|
|
|
if (IS_ERR(dir_page))
|
|
|
|
goto out1;
|
|
|
|
|
2009-09-18 13:05:45 -07:00
|
|
|
retval = copy_mount_string(dev_name, &kernel_dev);
|
2005-04-16 15:20:36 -07:00
|
|
|
if (retval < 0)
|
|
|
|
goto out2;
|
|
|
|
|
2009-09-18 13:05:45 -07:00
|
|
|
retval = copy_mount_options(data, &data_page);
|
2005-04-16 15:20:36 -07:00
|
|
|
if (retval < 0)
|
|
|
|
goto out3;
|
|
|
|
|
|
|
|
retval = -EINVAL;
|
|
|
|
|
2009-09-18 13:05:45 -07:00
|
|
|
if (kernel_type && data_page) {
|
2010-10-04 22:55:57 +02:00
|
|
|
if (!strcmp(kernel_type, NCPFS_NAME)) {
|
2005-04-16 15:20:36 -07:00
|
|
|
do_ncp_super_data_conv((void *)data_page);
|
2009-09-18 13:05:45 -07:00
|
|
|
} else if (!strcmp(kernel_type, NFS4_NAME)) {
|
2005-04-18 10:54:51 -07:00
|
|
|
if (do_nfs4_super_data_conv((void *) data_page))
|
|
|
|
goto out4;
|
2005-04-16 15:20:36 -07:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2009-09-18 13:05:45 -07:00
|
|
|
retval = do_mount(kernel_dev, dir_page, kernel_type,
|
2005-04-16 15:20:36 -07:00
|
|
|
flags, (void *)data_page);
|
|
|
|
|
2005-04-18 10:54:51 -07:00
|
|
|
out4:
|
2005-04-16 15:20:36 -07:00
|
|
|
free_page(data_page);
|
|
|
|
out3:
|
2009-09-18 13:05:45 -07:00
|
|
|
kfree(kernel_dev);
|
2005-04-16 15:20:36 -07:00
|
|
|
out2:
|
|
|
|
putname(dir_page);
|
|
|
|
out1:
|
2009-09-18 13:05:45 -07:00
|
|
|
kfree(kernel_type);
|
2005-04-16 15:20:36 -07:00
|
|
|
out:
|
|
|
|
return retval;
|
|
|
|
}
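The sys_mount() overflow fix in this function's history replaces whole-page copies of string arguments with a copy that stops at the first NUL byte. The following is a minimal userspace sketch of that idea, not the kernel implementation: dup_bounded_string() is a hypothetical name, and malloc()/memchr() stand in for the kernel's allocation and user-copy helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of a bounded string copy.
 * Duplicates src only if a NUL terminator is found within the first
 * max_len bytes. Returns NULL when the string is unterminated within
 * that limit or when allocation fails.
 */
static char *dup_bounded_string(const char *src, size_t max_len)
{
	const char *nul;
	size_t len;
	char *copy;

	assert(src != NULL);

	/* Search at most max_len bytes; stop at the first NUL. */
	nul = memchr(src, '\0', max_len);
	if (!nul)
		return NULL;

	len = (size_t)(nul - src);
	copy = malloc(len + 1);
	if (!copy)
		return NULL;

	memcpy(copy, src, len + 1);	/* copy the name and its NUL */
	return copy;
}
```

With this shape, a short object such as a filesystem type name is never read past its terminator, whatever upper bound the caller passes.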
|
|
|
|
|
|
|
|
struct compat_old_linux_dirent {
|
|
|
|
compat_ulong_t d_ino;
|
|
|
|
compat_ulong_t d_offset;
|
|
|
|
unsigned short d_namlen;
|
|
|
|
char d_name[1];
|
|
|
|
};
|
|
|
|
|
|
|
|
struct compat_readdir_callback {
|
|
|
|
struct compat_old_linux_dirent __user *dirent;
|
|
|
|
int result;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int compat_fillonedir(void *__buf, const char *name, int namlen,
|
2006-10-03 01:13:46 -07:00
|
|
|
loff_t offset, u64 ino, unsigned int d_type)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
|
|
|
struct compat_readdir_callback *buf = __buf;
|
|
|
|
struct compat_old_linux_dirent __user *dirent;
|
2006-10-03 01:13:46 -07:00
|
|
|
compat_ulong_t d_ino;
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
if (buf->result)
|
|
|
|
return -EINVAL;
|
2006-10-03 01:13:46 -07:00
|
|
|
d_ino = ino;
|
2008-08-12 00:28:24 -04:00
|
|
|
if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
|
|
|
|
buf->result = -EOVERFLOW;
|
2006-10-03 01:13:46 -07:00
|
|
|
return -EOVERFLOW;
|
2008-08-12 00:28:24 -04:00
|
|
|
}
|
2005-04-16 15:20:36 -07:00
|
|
|
buf->result++;
|
|
|
|
dirent = buf->dirent;
|
|
|
|
if (!access_ok(VERIFY_WRITE, dirent,
|
|
|
|
(unsigned long)(dirent->d_name + namlen + 1) -
|
|
|
|
(unsigned long)dirent))
|
|
|
|
goto efault;
|
2006-10-03 01:13:46 -07:00
|
|
|
if ( __put_user(d_ino, &dirent->d_ino) ||
|
2005-04-16 15:20:36 -07:00
|
|
|
__put_user(offset, &dirent->d_offset) ||
|
|
|
|
__put_user(namlen, &dirent->d_namlen) ||
|
|
|
|
__copy_to_user(dirent->d_name, name, namlen) ||
|
|
|
|
__put_user(0, dirent->d_name + namlen))
|
|
|
|
goto efault;
|
|
|
|
return 0;
|
|
|
|
efault:
|
|
|
|
buf->result = -EFAULT;
|
|
|
|
return -EFAULT;
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_old_readdir(unsigned int fd,
|
|
|
|
struct compat_old_linux_dirent __user *dirent, unsigned int count)
|
|
|
|
{
|
|
|
|
int error;
|
|
|
|
struct file *file;
|
|
|
|
struct compat_readdir_callback buf;
|
|
|
|
|
|
|
|
error = -EBADF;
|
|
|
|
file = fget(fd);
|
|
|
|
if (!file)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
buf.result = 0;
|
|
|
|
buf.dirent = dirent;
|
|
|
|
|
|
|
|
error = vfs_readdir(file, compat_fillonedir, &buf);
|
2008-08-24 07:29:52 -04:00
|
|
|
if (buf.result)
|
2005-04-16 15:20:36 -07:00
|
|
|
error = buf.result;
|
|
|
|
|
|
|
|
fput(file);
|
|
|
|
out:
|
|
|
|
return error;
|
|
|
|
}
|
|
|
|
|
|
|
|
struct compat_linux_dirent {
|
|
|
|
compat_ulong_t d_ino;
|
|
|
|
compat_ulong_t d_off;
|
|
|
|
unsigned short d_reclen;
|
|
|
|
char d_name[1];
|
|
|
|
};
|
|
|
|
|
|
|
|
struct compat_getdents_callback {
|
|
|
|
struct compat_linux_dirent __user *current_dir;
|
|
|
|
struct compat_linux_dirent __user *previous;
|
|
|
|
int count;
|
|
|
|
int error;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int compat_filldir(void *__buf, const char *name, int namlen,
|
2006-10-03 01:13:46 -07:00
|
|
|
loff_t offset, u64 ino, unsigned int d_type)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
|
|
|
struct compat_linux_dirent __user * dirent;
|
|
|
|
struct compat_getdents_callback *buf = __buf;
|
2006-10-03 01:13:46 -07:00
|
|
|
compat_ulong_t d_ino;
|
2010-08-09 17:20:22 -07:00
|
|
|
int reclen = ALIGN(offsetof(struct compat_linux_dirent, d_name) +
|
|
|
|
namlen + 2, sizeof(compat_long_t));
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
buf->error = -EINVAL; /* only used if we fail.. */
|
|
|
|
if (reclen > buf->count)
|
|
|
|
return -EINVAL;
|
2006-10-03 01:13:46 -07:00
|
|
|
d_ino = ino;
|
2008-08-12 00:28:24 -04:00
|
|
|
if (sizeof(d_ino) < sizeof(ino) && d_ino != ino) {
|
|
|
|
buf->error = -EOVERFLOW;
|
2006-10-03 01:13:46 -07:00
|
|
|
return -EOVERFLOW;
|
2008-08-12 00:28:24 -04:00
|
|
|
}
|
2005-04-16 15:20:36 -07:00
|
|
|
dirent = buf->previous;
|
|
|
|
if (dirent) {
|
|
|
|
if (__put_user(offset, &dirent->d_off))
|
|
|
|
goto efault;
|
|
|
|
}
|
|
|
|
dirent = buf->current_dir;
|
2006-10-03 01:13:46 -07:00
|
|
|
if (__put_user(d_ino, &dirent->d_ino))
|
2005-04-16 15:20:36 -07:00
|
|
|
goto efault;
|
|
|
|
if (__put_user(reclen, &dirent->d_reclen))
|
|
|
|
goto efault;
|
|
|
|
if (copy_to_user(dirent->d_name, name, namlen))
|
|
|
|
goto efault;
|
|
|
|
if (__put_user(0, dirent->d_name + namlen))
|
|
|
|
goto efault;
|
|
|
|
if (__put_user(d_type, (char __user *) dirent + reclen - 1))
|
|
|
|
goto efault;
|
|
|
|
buf->previous = dirent;
|
|
|
|
dirent = (void __user *)dirent + reclen;
|
|
|
|
buf->current_dir = dirent;
|
|
|
|
buf->count -= reclen;
|
|
|
|
return 0;
|
|
|
|
efault:
|
|
|
|
buf->error = -EFAULT;
|
|
|
|
return -EFAULT;
|
|
|
|
}
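The reclen computation in compat_filldir() reserves room for the header, the name, its terminating NUL and one trailing d_type byte, then rounds up to the compat long size. The sketch below reproduces that arithmetic in userspace. The struct is a hypothetical mirror of struct compat_linux_dirent with 32-bit fields assumed, and the DEMO_ and demo_ names are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of struct compat_linux_dirent (32-bit fields assumed). */
struct demo_compat_dirent {
	uint32_t d_ino;
	uint32_t d_off;
	unsigned short d_reclen;
	char d_name[1];
};

/* Round x up to a multiple of a; a must be a power of two. */
#define DEMO_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Record length for a name of namlen bytes: header + name + NUL + d_type. */
static size_t demo_reclen(size_t namlen)
{
	return DEMO_ALIGN(offsetof(struct demo_compat_dirent, d_name) + namlen + 2,
			  sizeof(int32_t));
}

/* Offset of the d_type byte, which is stored in the record's last byte. */
static size_t demo_dtype_offset(size_t namlen)
{
	size_t off = demo_reclen(namlen) - 1;

	/* The type byte must land after the name's terminating NUL. */
	assert(off > offsetof(struct demo_compat_dirent, d_name) + namlen);
	return off;
}
```

On a typical ABI d_name sits at offset 10, so a one-character name gives a 16-byte record with d_type at byte 15.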
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_getdents(unsigned int fd,
|
|
|
|
struct compat_linux_dirent __user *dirent, unsigned int count)
|
|
|
|
{
|
|
|
|
struct file * file;
|
|
|
|
struct compat_linux_dirent __user * lastdirent;
|
|
|
|
struct compat_getdents_callback buf;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
error = -EFAULT;
|
|
|
|
if (!access_ok(VERIFY_WRITE, dirent, count))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
error = -EBADF;
|
|
|
|
file = fget(fd);
|
|
|
|
if (!file)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
buf.current_dir = dirent;
|
|
|
|
buf.previous = NULL;
|
|
|
|
buf.count = count;
|
|
|
|
buf.error = 0;
|
|
|
|
|
|
|
|
error = vfs_readdir(file, compat_filldir, &buf);
|
2008-08-24 07:29:52 -04:00
|
|
|
if (error >= 0)
|
|
|
|
error = buf.error;
|
2005-04-16 15:20:36 -07:00
|
|
|
lastdirent = buf.previous;
|
|
|
|
if (lastdirent) {
|
|
|
|
if (put_user(file->f_pos, &lastdirent->d_off))
|
|
|
|
error = -EFAULT;
|
|
|
|
else
|
|
|
|
error = count - buf.count;
|
|
|
|
}
|
|
|
|
fput(file);
|
|
|
|
out:
|
|
|
|
return error;
|
|
|
|
}
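compat_filldir() and compat_sys_getdents() fill in d_off one step late: each callback writes the current entry's offset into the previous record, and the syscall patches the last record with the final file position. A simplified userspace sketch of that back-patching follows. It uses an array of hypothetical demo_rec structures in place of the packed user buffer, and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified record: only the field being back-patched. */
struct demo_rec {
	uint64_t d_off;
};

struct demo_ctx {
	struct demo_rec *current_rec;	/* next free slot */
	struct demo_rec *previous;	/* last emitted record, or NULL */
	int count;			/* free slots remaining */
};

/* Emit one record whose directory offset is `offset`. */
static int demo_fill(struct demo_ctx *ctx, uint64_t offset)
{
	assert(ctx != NULL);
	if (ctx->count == 0)
		return -1;			/* no room left */

	if (ctx->previous)
		ctx->previous->d_off = offset;	/* previous now points here */

	ctx->current_rec->d_off = 0;		/* patched by the next call */
	ctx->previous = ctx->current_rec;
	ctx->current_rec++;
	ctx->count--;
	return 0;
}

/* After iteration: the last record points at the final file position. */
static void demo_finish(struct demo_ctx *ctx, uint64_t final_pos)
{
	assert(ctx != NULL);
	if (ctx->previous)
		ctx->previous->d_off = final_pos;
}
```

Each record therefore ends up carrying the offset at which the next entry starts, which is what a reader needs in order to resume the directory scan from that point.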
|
|
|
|
|
|
|
|
#ifndef __ARCH_OMIT_COMPAT_SYS_GETDENTS64
|
|
|
|
|
|
|
|
struct compat_getdents_callback64 {
|
|
|
|
struct linux_dirent64 __user *current_dir;
|
|
|
|
struct linux_dirent64 __user *previous;
|
|
|
|
int count;
|
|
|
|
int error;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int compat_filldir64(void * __buf, const char * name, int namlen, loff_t offset,
|
2006-10-03 01:13:46 -07:00
|
|
|
u64 ino, unsigned int d_type)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
|
|
|
struct linux_dirent64 __user *dirent;
|
|
|
|
struct compat_getdents_callback64 *buf = __buf;
|
2010-08-09 17:20:22 -07:00
|
|
|
int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
|
|
|
|
sizeof(u64));
|
2005-04-16 15:20:36 -07:00
|
|
|
u64 off;
|
|
|
|
|
|
|
|
buf->error = -EINVAL; /* only used if we fail.. */
|
|
|
|
if (reclen > buf->count)
|
|
|
|
return -EINVAL;
|
|
|
|
dirent = buf->previous;
|
|
|
|
|
|
|
|
if (dirent) {
|
|
|
|
if (__put_user_unaligned(offset, &dirent->d_off))
|
|
|
|
goto efault;
|
|
|
|
}
|
|
|
|
dirent = buf->current_dir;
|
|
|
|
if (__put_user_unaligned(ino, &dirent->d_ino))
|
|
|
|
goto efault;
|
|
|
|
off = 0;
|
|
|
|
if (__put_user_unaligned(off, &dirent->d_off))
|
|
|
|
goto efault;
|
|
|
|
if (__put_user(reclen, &dirent->d_reclen))
|
|
|
|
goto efault;
|
|
|
|
if (__put_user(d_type, &dirent->d_type))
|
|
|
|
goto efault;
|
|
|
|
if (copy_to_user(dirent->d_name, name, namlen))
|
|
|
|
goto efault;
|
|
|
|
if (__put_user(0, dirent->d_name + namlen))
|
|
|
|
goto efault;
|
|
|
|
buf->previous = dirent;
|
|
|
|
dirent = (void __user *)dirent + reclen;
|
|
|
|
buf->current_dir = dirent;
|
|
|
|
buf->count -= reclen;
|
|
|
|
return 0;
|
|
|
|
efault:
|
|
|
|
buf->error = -EFAULT;
|
|
|
|
return -EFAULT;
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_getdents64(unsigned int fd,
|
|
|
|
struct linux_dirent64 __user * dirent, unsigned int count)
|
|
|
|
{
|
|
|
|
struct file * file;
|
|
|
|
struct linux_dirent64 __user * lastdirent;
|
|
|
|
struct compat_getdents_callback64 buf;
|
|
|
|
int error;
|
|
|
|
|
|
|
|
error = -EFAULT;
|
|
|
|
if (!access_ok(VERIFY_WRITE, dirent, count))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
error = -EBADF;
|
|
|
|
file = fget(fd);
|
|
|
|
if (!file)
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
buf.current_dir = dirent;
|
|
|
|
buf.previous = NULL;
|
|
|
|
buf.count = count;
|
|
|
|
buf.error = 0;
|
|
|
|
|
|
|
|
error = vfs_readdir(file, compat_filldir64, &buf);
|
2008-08-24 07:29:52 -04:00
|
|
|
if (error >= 0)
|
|
|
|
error = buf.error;
|
2005-04-16 15:20:36 -07:00
|
|
|
lastdirent = buf.previous;
|
|
|
|
if (lastdirent) {
|
|
|
|
typeof(lastdirent->d_off) d_off = file->f_pos;
|
2006-12-06 20:36:36 -08:00
|
|
|
if (__put_user_unaligned(d_off, &lastdirent->d_off))
|
2008-08-24 07:29:52 -04:00
|
|
|
error = -EFAULT;
|
|
|
|
else
|
|
|
|
error = count - buf.count;
|
2005-04-16 15:20:36 -07:00
|
|
|
}
|
|
|
|
fput(file);
|
|
|
|
out:
|
|
|
|
return error;
|
|
|
|
}
|
|
|
|
#endif /* ! __ARCH_OMIT_COMPAT_SYS_GETDENTS64 */
|
|
|
|
|
|
|
|
static ssize_t compat_do_readv_writev(int type, struct file *file,
|
|
|
|
const struct compat_iovec __user *uvector,
|
|
|
|
unsigned long nr_segs, loff_t *pos)
|
|
|
|
{
|
|
|
|
compat_ssize_t tot_len;
|
|
|
|
struct iovec iovstack[UIO_FASTIOV];
|
2010-09-22 14:32:56 -04:00
|
|
|
struct iovec *iov = iovstack;
|
2005-04-16 15:20:36 -07:00
|
|
|
ssize_t ret;
|
|
|
|
io_fn_t fn;
|
|
|
|
iov_fn_t fnv;
|
|
|
|
|
|
|
|
ret = -EINVAL;
|
|
|
|
if (!file->f_op)
|
|
|
|
goto out;
|
2010-05-26 14:44:25 -07:00
|
|
|
|
2005-04-16 15:20:36 -07:00
|
|
|
ret = -EFAULT;
|
|
|
|
if (!access_ok(VERIFY_READ, uvector, nr_segs*sizeof(*uvector)))
|
|
|
|
goto out;
|
|
|
|
|
2010-05-26 14:44:25 -07:00
|
|
|
tot_len = compat_rw_copy_check_uvector(type, uvector, nr_segs,
|
2011-10-31 17:06:39 -07:00
|
|
|
UIO_FASTIOV, iovstack, &iov, 1);
|
2005-04-16 15:20:36 -07:00
|
|
|
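if (tot_len < 0) {	/* propagate the copy/check error instead of letting rw_verify_area() report -EINVAL */
ret = tot_len;
goto out;
}
if (tot_len == 0) {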
|
|
|
|
ret = 0;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = rw_verify_area(type, file, pos, tot_len);
|
2006-01-04 16:20:40 -08:00
|
|
|
if (ret < 0)
|
2005-04-16 15:20:36 -07:00
|
|
|
goto out;
|
|
|
|
|
|
|
|
fnv = NULL;
|
|
|
|
if (type == READ) {
|
|
|
|
fn = file->f_op->read;
|
2006-09-30 23:28:47 -07:00
|
|
|
fnv = file->f_op->aio_read;
|
2005-04-16 15:20:36 -07:00
|
|
|
} else {
|
|
|
|
fn = (io_fn_t)file->f_op->write;
|
2006-09-30 23:28:47 -07:00
|
|
|
fnv = file->f_op->aio_write;
|
2005-04-16 15:20:36 -07:00
|
|
|
}
|
|
|
|
|
2006-09-30 23:28:47 -07:00
|
|
|
if (fnv)
|
|
|
|
ret = do_sync_readv_writev(file, iov, nr_segs, tot_len,
|
|
|
|
pos, fnv);
|
|
|
|
else
|
|
|
|
ret = do_loop_readv_writev(file, iov, nr_segs, pos, fn);
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
out:
|
|
|
|
if (iov != iovstack)
|
|
|
|
kfree(iov);
|
2005-07-12 17:06:03 -04:00
|
|
|
/* Reads notify even on a zero-byte result; writes only when bytes were written. */
if ((ret + (type == READ)) > 0) {
|
|
|
|
if (type == READ)
|
2009-12-17 21:24:21 -05:00
|
|
|
fsnotify_access(file);
|
2005-07-12 17:06:03 -04:00
|
|
|
else
|
2009-12-17 21:24:21 -05:00
|
|
|
fsnotify_modify(file);
|
2005-07-12 17:06:03 -04:00
|
|
|
}
|
2005-04-16 15:20:36 -07:00
|
|
|
return ret;
|
|
|
|
}
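The notification test at the end of compat_do_readv_writev() folds two conditions into one expression: a read notifies for any non-negative result, a write only for a positive one. The sketch below isolates that predicate. demo_should_notify() and the DEMO_ constants are hypothetical names, with the values 0 and 1 assumed to match the kernel's READ and WRITE.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the kernel's READ (0) and WRITE (1) constants. */
enum demo_dir { DEMO_READ = 0, DEMO_WRITE = 1 };

/*
 * True when an fsnotify event would be raised for result `ret`:
 *   read:  ret + 1 > 0, that is ret >= 0
 *   write: ret + 0 > 0, that is ret >  0
 */
static bool demo_should_notify(long ret, enum demo_dir type)
{
	assert(type == DEMO_READ || type == DEMO_WRITE);
	return (ret + (type == DEMO_READ)) > 0;
}
```

A zero-byte read, for example at end of file, still counts as an access, while a zero-byte write modifies nothing and raises no event.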
|
|
|
|
|
preadv/pwritev: create compat_readv()
This patch series:
Implement the preadv() and pwritev() syscalls. *BSD has had these syscalls
for quite some time.
Test code:
#if 0
set -x
gcc -Wall -O2 -o preadv $0
exit 0
#endif
/*
* preadv demo / test
*
* (c) 2008 Gerd Hoffmann <kraxel@redhat.com>
*
* build with "sh $thisfile"
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <inttypes.h>
#include <sys/uio.h>
/* ----------------------------------------------------------------- */
/* syscall windup */
#include <sys/syscall.h>
#if 0
/* WARNING: Be sure you know what you are doing if you enable this.
* linux syscall code isn't upstream yet, syscall numbers are subject
* to change */
# ifndef __NR_preadv
# ifdef __i386__
# define __NR_preadv 333
# define __NR_pwritev 334
# endif
# ifdef __x86_64__
# define __NR_preadv 295
# define __NR_pwritev 296
# endif
# endif
#endif
#ifndef __NR_preadv
# error preadv/pwritev syscall numbers are unknown
#endif
static ssize_t preadv(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
uint32_t pos_high = (offset >> 32) & 0xffffffff;
uint32_t pos_low = offset & 0xffffffff;
return syscall(__NR_preadv, fd, iov, iovcnt, pos_high, pos_low);
}
static ssize_t pwritev(int fd, const struct iovec *iov, int iovcnt, off_t offset)
{
uint32_t pos_high = (offset >> 32) & 0xffffffff;
uint32_t pos_low = offset & 0xffffffff;
return syscall(__NR_pwritev, fd, iov, iovcnt, pos_high, pos_low);
}
/* ----------------------------------------------------------------- */
/* demo/test app */
static char filename[] = "/tmp/preadv-XXXXXX";
static char outbuf[11] = "0123456789";
static char inbuf[11] = "----------";
static struct iovec ovec[2] = {{
.iov_base = outbuf + 5,
.iov_len = 5,
},{
.iov_base = outbuf + 0,
.iov_len = 5,
}};
static struct iovec ivec[3] = {{
.iov_base = inbuf + 6,
.iov_len = 2,
},{
.iov_base = inbuf + 4,
.iov_len = 2,
},{
.iov_base = inbuf + 2,
.iov_len = 2,
}};
void cleanup(void)
{
unlink(filename);
}
int main(int argc, char **argv)
{
int fd, rc;
fd = mkstemp(filename);
if (-1 == fd) {
perror("mkstemp");
exit(1);
}
atexit(cleanup);
/* write to file: "56789-01234" */
rc = pwritev(fd, ovec, 2, 0);
if (rc < 0) {
perror("pwritev");
exit(1);
}
/* read from file: "78-90-12" */
rc = preadv(fd, ivec, 3, 2);
if (rc < 0) {
perror("preadv");
exit(1);
}
printf("result : %s\n", inbuf);
printf("expected: %s\n", "--129078--");
exit(0);
}
This patch:
Factor out some code from compat_sys_readv() which can be shared with the
upcoming compat_sys_preadv().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <linux-api@vger.kernel.org>
Cc: <linux-arch@vger.kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-02 16:59:20 -07:00
|
|
|
static ssize_t compat_readv(struct file *file,
|
|
|
|
const struct compat_iovec __user *vec,
|
|
|
|
unsigned long vlen, loff_t *pos)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
|
|
|
ssize_t ret = -EBADF;
|
|
|
|
|
|
|
|
if (!(file->f_mode & FMODE_READ))
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
ret = -EINVAL;
|
2006-09-30 23:28:47 -07:00
|
|
|
if (!file->f_op || (!file->f_op->aio_read && !file->f_op->read))
|
2005-04-16 15:20:36 -07:00
|
|
|
goto out;
|
|
|
|
|
2009-04-02 16:59:20 -07:00
|
|
|
ret = compat_do_readv_writev(READ, file, vec, vlen, pos);
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
out:
|
2009-01-06 14:41:09 -08:00
|
|
|
if (ret > 0)
|
|
|
|
add_rchar(current, ret);
|
|
|
|
inc_syscr(current);
|
2009-04-02 16:59:20 -07:00
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
asmlinkage ssize_t
|
|
|
|
compat_sys_readv(unsigned long fd, const struct compat_iovec __user *vec,
|
|
|
|
unsigned long vlen)
|
|
|
|
{
|
|
|
|
struct file *file;
|
2009-04-02 16:59:25 -07:00
|
|
|
int fput_needed;
|
2009-04-02 16:59:20 -07:00
|
|
|
ssize_t ret;
|
|
|
|
|
2009-04-02 16:59:25 -07:00
|
|
|
file = fget_light(fd, &fput_needed);
|
2009-04-02 16:59:20 -07:00
	if (!file)
		return -EBADF;
	ret = compat_readv(file, vec, vlen, &file->f_pos);
2009-04-02 16:59:25 -07:00
	fput_light(file, fput_needed);
2005-04-16 15:20:36 -07:00
	return ret;
}

2009-04-02 16:59:23 -07:00
asmlinkage ssize_t
compat_sys_preadv(unsigned long fd, const struct compat_iovec __user *vec,
Make non-compat preadv/pwritev use native register size

Instead of always splitting the file offset into 32-bit 'high' and 'low'
parts, just split them into the largest natural word-size - which in C
terms is 'unsigned long'.

This allows 64-bit architectures to avoid the unnecessary 32-bit
shifting and masking for native format (while the compat interfaces will
obviously always have to do it).

This also changes the order of 'high' and 'low' to be "low first". Why?
Because when we have it like this, the 64-bit system calls now don't use
the "pos_high" argument at all, and it makes more sense for the native
system call to simply match the user-mode prototype.

This results in a much more natural calling convention, and allows the
compiler to generate much more straightforward code. On x86-64, we now
generate

	testq %rcx, %rcx # pos_l
	js .L122 #,
	movq %rcx, -48(%rbp) # pos_l, pos

from the C source

	loff_t pos = pos_from_hilo(pos_h, pos_l);
	...
	if (pos < 0)
		return -EINVAL;

and the 'pos_h' register isn't even touched. It used to generate code
like

	mov %r8d, %r8d # pos_low, pos_low
	salq $32, %rcx #, tmp71
	movq %r8, %rax # pos_low, pos.386
	orq %rcx, %rax # tmp71, pos.386
	js .L122 #,
	movq %rax, -48(%rbp) # pos.386, pos

which isn't _that_ horrible, but it does show how the natural word size
is just a more sensible interface (same arguments will hold in the user
level glibc wrapper function, of course, so the kernel side is just half
of the equation!)

Note: in all cases the user code wrapper can again be the same. You can
just do

	#define HALF_BITS (sizeof(unsigned long)*4)
	__syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS);

or something like that. That way the user mode wrapper will also be
nicely passing in a zero (it won't actually have to do the shifts, the
compiler will understand what is going on) for the last argument.

And that is a good idea, even if nobody will necessarily ever care: if
we ever do move to a 128-bit lloff_t, this particular system call might
be left alone. Of course, that will be the least of our worries if we
really ever need to care, so this may not be worth really caring about.

[ Fixed for lost 'loff_t' cast noticed by Andrew Morton ]

Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
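
The double half-width shift described in the commit message above can be tried in userspace. What follows is only a sketch under stated assumptions: pos_split(), pos_join(), compat_pos_join() and pos_roundtrip() are hypothetical names made up for illustration (pos_join() stands in for whatever pos_from_hilo() does in the kernel), and HALF_BITS is written with CHAR_BIT / 2, which equals the "*4" of the message when bytes are 8 bits wide.

```c
#include <limits.h>
#include <stdint.h>

/* Half the width of unsigned long in bits: 16 on ILP32, 32 on LP64. */
#define HALF_BITS (sizeof(unsigned long) * CHAR_BIT / 2)

/* User side: split a 64-bit offset into (low, high) unsigned long parts. */
void pos_split(int64_t offset, unsigned long *lo, unsigned long *hi)
{
	uint64_t u = (uint64_t)offset;

	*lo = (unsigned long)u;
	/* Two half-width shifts never shift by the full width of the type. */
	*hi = (unsigned long)((u >> HALF_BITS) >> HALF_BITS);
}

/* Native side: hypothetical stand-in for the kernel's pos_from_hilo(). */
int64_t pos_join(unsigned long hi, unsigned long lo)
{
	return (int64_t)((((uint64_t)hi << HALF_BITS) << HALF_BITS) | lo);
}

/* Compat side: both halves are always exactly 32 bits wide. */
int64_t compat_pos_join(uint32_t hi, uint32_t lo)
{
	return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* Convenience helper: split an offset and join it again. */
int64_t pos_roundtrip(int64_t offset)
{
	unsigned long lo, hi;

	pos_split(offset, &lo, &hi);
	return pos_join(hi, lo);
}
```

On a 64-bit build the high part comes out as zero and the low part carries the whole offset, which is the case the commit message says lets the compiler ignore the 'pos_h' register.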
2009-04-03 08:03:22 -07:00
		  unsigned long vlen, u32 pos_low, u32 pos_high)
2009-04-02 16:59:23 -07:00
{
	loff_t pos = ((loff_t)pos_high << 32) | pos_low;
	struct file *file;
2009-04-02 16:59:25 -07:00
	int fput_needed;
2009-04-02 16:59:23 -07:00
	ssize_t ret;

	if (pos < 0)
		return -EINVAL;
2009-04-02 16:59:25 -07:00
	file = fget_light(fd, &fput_needed);
2009-04-02 16:59:23 -07:00
	if (!file)
		return -EBADF;
2011-03-13 01:50:58 -05:00
	ret = -ESPIPE;
	if (file->f_mode & FMODE_PREAD)
		ret = compat_readv(file, vec, vlen, &pos);
2009-04-02 16:59:25 -07:00
	fput_light(file, fput_needed);
2009-04-02 16:59:23 -07:00
	return ret;
}

2009-04-02 16:59:21 -07:00
static size_t compat_writev(struct file *file,
			    const struct compat_iovec __user *vec,
			    unsigned long vlen, loff_t *pos)
2005-04-16 15:20:36 -07:00
{
	ssize_t ret = -EBADF;

	if (!(file->f_mode & FMODE_WRITE))
		goto out;

	ret = -EINVAL;
2006-09-30 23:28:47 -07:00
	if (!file->f_op || (!file->f_op->aio_write && !file->f_op->write))
2005-04-16 15:20:36 -07:00
		goto out;

2009-04-02 16:59:21 -07:00
	ret = compat_do_readv_writev(WRITE, file, vec, vlen, pos);
2005-04-16 15:20:36 -07:00

out:
2009-01-06 14:41:09 -08:00
	if (ret > 0)
		add_wchar(current, ret);
	inc_syscw(current);
2009-04-02 16:59:21 -07:00
	return ret;
}

asmlinkage ssize_t
compat_sys_writev(unsigned long fd, const struct compat_iovec __user *vec,
		  unsigned long vlen)
{
	struct file *file;
2009-04-02 16:59:25 -07:00
	int fput_needed;
2009-04-02 16:59:21 -07:00
	ssize_t ret;

2009-04-02 16:59:25 -07:00
	file = fget_light(fd, &fput_needed);
2009-04-02 16:59:21 -07:00
	if (!file)
		return -EBADF;
	ret = compat_writev(file, vec, vlen, &file->f_pos);
2009-04-02 16:59:25 -07:00
	fput_light(file, fput_needed);
2005-04-16 15:20:36 -07:00
	return ret;
}

2009-04-02 16:59:23 -07:00
asmlinkage ssize_t
compat_sys_pwritev(unsigned long fd, const struct compat_iovec __user *vec,
2009-04-03 08:03:22 -07:00
		   unsigned long vlen, u32 pos_low, u32 pos_high)
2009-04-02 16:59:23 -07:00
{
	loff_t pos = ((loff_t)pos_high << 32) | pos_low;
	struct file *file;
2009-04-02 16:59:25 -07:00
	int fput_needed;
2009-04-02 16:59:23 -07:00
	ssize_t ret;

	if (pos < 0)
		return -EINVAL;
2009-04-02 16:59:25 -07:00
	file = fget_light(fd, &fput_needed);
2009-04-02 16:59:23 -07:00
	if (!file)
		return -EBADF;
2011-03-13 01:50:58 -05:00
	ret = -ESPIPE;
	if (file->f_mode & FMODE_PWRITE)
		ret = compat_writev(file, vec, vlen, &pos);
2009-04-02 16:59:25 -07:00
	fput_light(file, fput_needed);
2009-04-02 16:59:23 -07:00
	return ret;
}

2006-05-01 12:15:48 -07:00
asmlinkage long
compat_sys_vmsplice(int fd, const struct compat_iovec __user *iov32,
		    unsigned int nr_segs, unsigned int flags)
{
	unsigned i;
2006-10-10 22:44:17 +01:00
	struct iovec __user *iov;
2006-05-04 09:13:49 +02:00
	if (nr_segs > UIO_MAXIOV)
2006-05-01 12:15:48 -07:00
		return -EINVAL;
	iov = compat_alloc_user_space(nr_segs * sizeof(struct iovec));
	for (i = 0; i < nr_segs; i++) {
		struct compat_iovec v;
		if (get_user(v.iov_base, &iov32[i].iov_base) ||
		    get_user(v.iov_len, &iov32[i].iov_len) ||
		    put_user(compat_ptr(v.iov_base), &iov[i].iov_base) ||
		    put_user(v.iov_len, &iov[i].iov_len))
			return -EFAULT;
	}
	return sys_vmsplice(fd, iov, nr_segs, flags);
}

2005-09-06 15:18:25 -07:00
/*
 * Exactly like fs/open.c:sys_open(), except that it doesn't set the
 * O_LARGEFILE flag.
 */
asmlinkage long
compat_sys_open(const char __user *filename, int flags, int mode)
{
2006-01-18 17:43:53 -08:00
	return do_sys_open(AT_FDCWD, filename, flags, mode);
}

/*
 * Exactly like fs/open.c:sys_openat(), except that it doesn't set the
 * O_LARGEFILE flag.
 */
asmlinkage long
2006-02-02 16:11:51 +11:00
compat_sys_openat(unsigned int dfd, const char __user *filename, int flags, int mode)
2006-01-18 17:43:53 -08:00
{
	return do_sys_open(dfd, filename, flags, mode);
2005-09-06 15:18:25 -07:00
}

2005-04-16 15:20:36 -07:00
#define __COMPAT_NFDBITS (8 * sizeof(compat_ulong_t))

2008-08-31 08:16:57 -07:00
static int poll_select_copy_remaining(struct timespec *end_time, void __user *p,
				      int timeval, int ret)
{
	struct timespec ts;

	if (!p)
		return ret;

	if (current->personality & STICKY_TIMEOUTS)
		goto sticky;

	/* No update for zero timeout */
	if (!end_time->tv_sec && !end_time->tv_nsec)
		return ret;

	ktime_get_ts(&ts);
	ts = timespec_sub(*end_time, ts);
	if (ts.tv_sec < 0)
		ts.tv_sec = ts.tv_nsec = 0;

	if (timeval) {
		struct compat_timeval rtv;

		rtv.tv_sec = ts.tv_sec;
		rtv.tv_usec = ts.tv_nsec / NSEC_PER_USEC;

		if (!copy_to_user(p, &rtv, sizeof(rtv)))
			return ret;
	} else {
		struct compat_timespec rts;

		rts.tv_sec = ts.tv_sec;
		rts.tv_nsec = ts.tv_nsec;

		if (!copy_to_user(p, &rts, sizeof(rts)))
			return ret;
	}
	/*
	 * If an application puts its timeval in read-only memory, we
	 * don't want the Linux-specific update to the timeval to
	 * cause a fault after the select has completed
	 * successfully. However, because we're not updating the
	 * timeval, we can't restart the system call.
	 */

sticky:
	if (ret == -ERESTARTNOHAND)
		ret = -EINTR;
	return ret;
}

2005-04-16 15:20:36 -07:00
/*
 * Ooo, nasty. We need here to frob 32-bit unsigned longs to
 * 64-bit unsigned longs.
 */
2006-01-14 13:20:43 -08:00
static
2005-04-16 15:20:36 -07:00
int compat_get_fd_set(unsigned long nr, compat_ulong_t __user *ufdset,
		      unsigned long *fdset)
{
2007-05-08 00:29:02 -07:00
	nr = DIV_ROUND_UP(nr, __COMPAT_NFDBITS);
2005-04-16 15:20:36 -07:00
	if (ufdset) {
		unsigned long odd;

		if (!access_ok(VERIFY_WRITE, ufdset, nr*sizeof(compat_ulong_t)))
			return -EFAULT;

		odd = nr & 1UL;
		nr &= ~1UL;
		while (nr) {
			unsigned long h, l;
2006-12-06 20:36:36 -08:00
			if (__get_user(l, ufdset) || __get_user(h, ufdset+1))
				return -EFAULT;
2005-04-16 15:20:36 -07:00
			ufdset += 2;
			*fdset++ = h << 32 | l;
			nr -= 2;
		}
2006-12-06 20:36:36 -08:00
		if (odd && __get_user(*fdset, ufdset))
			return -EFAULT;
2005-04-16 15:20:36 -07:00
	} else {
		/* Tricky, must clear full unsigned long in the
		 * kernel fdset at the end, this makes sure that
		 * actually happens.
		 */
		memset(fdset, 0, ((nr + 1) & ~1)*sizeof(compat_ulong_t));
	}
	return 0;
}

2006-01-14 13:20:43 -08:00
static
2006-12-06 20:36:36 -08:00
int compat_set_fd_set(unsigned long nr, compat_ulong_t __user *ufdset,
		      unsigned long *fdset)
2005-04-16 15:20:36 -07:00
{
	unsigned long odd;
2007-05-08 00:29:02 -07:00
	nr = DIV_ROUND_UP(nr, __COMPAT_NFDBITS);
2005-04-16 15:20:36 -07:00

	if (!ufdset)
2006-12-06 20:36:36 -08:00
		return 0;
2005-04-16 15:20:36 -07:00

	odd = nr & 1UL;
	nr &= ~1UL;
	while (nr) {
		unsigned long h, l;
		l = *fdset++;
		h = l >> 32;
2006-12-06 20:36:36 -08:00
		if (__put_user(l, ufdset) || __put_user(h, ufdset+1))
			return -EFAULT;
2005-04-16 15:20:36 -07:00
		ufdset += 2;
		nr -= 2;
	}
2006-12-06 20:36:36 -08:00
	if (odd && __put_user(*fdset, ufdset))
		return -EFAULT;
	return 0;
2005-04-16 15:20:36 -07:00
}
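
The word pairing done by compat_get_fd_set() and compat_set_fd_set() above can be illustrated outside the kernel. The following is a rough userspace sketch, not the kernel code: pack_words(), unpack_half() and pack_fdset32() are hypothetical names, plain arrays replace the __user pointers, and fixed-width types replace compat_ulong_t and unsigned long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two 32-bit bitmap words (low word first) into one 64-bit word. */
uint64_t pack_words(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Extract half 0 (low) or half 1 (high) of a 64-bit bitmap word. */
uint32_t unpack_half(uint64_t word, unsigned half)
{
	assert(half < 2);
	return (uint32_t)(word >> (32 * half));
}

/*
 * Pack nwords32 32-bit words into 64-bit words. When nwords32 is odd,
 * the last destination word keeps a zero high half, which mirrors the
 * "odd" handling in the functions above.
 */
void pack_fdset32(const uint32_t *src, size_t nwords32, uint64_t *dst)
{
	size_t i;

	assert(src != NULL && dst != NULL);
	for (i = 0; i + 1 < nwords32; i += 2)
		*dst++ = pack_words(src[i], src[i + 1]);
	if (nwords32 & 1)
		*dst = src[nwords32 - 1];
}
```

For example, file descriptor 33 is bit 1 of 32-bit word 1, and after packing it is bit 33 of 64-bit word 0, so the bit index of a descriptor is the same in both layouts.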


/*
 * This is a virtual copy of sys_select from fs/select.c and probably
 * should be compared to it from time to time
 */

/*
 * We can actually return ERESTARTSYS instead of EINTR, but I'd
 * like to be certain this leads to no problems. So I return
 * EINTR just for safety.
 *
 * Update: ERESTARTSYS breaks at least the xview clock binary, so
 * I'm trying ERESTARTNOHAND which restarts only when you want it to.
 */
[PATCH] Add pselect/ppoll system call implementation

The following implementation of ppoll() and pselect() system calls
depends on the architecture providing a TIF_RESTORE_SIGMASK flag in the
thread_info.

These system calls have to change the signal mask during their
operation, and signal handlers must be invoked using the new, temporary
signal mask. The old signal mask must be restored either upon successful
exit from the system call, or upon returning from the invoked signal
handler if the system call is interrupted. We can't simply restore the
original signal mask and return to userspace, since the restored signal
mask may actually block the signal which interrupted the system call.

The TIF_RESTORE_SIGMASK flag deals with this by causing the syscall exit
path to trap into do_signal() just as TIF_SIGPENDING does, and by
causing do_signal() to use the saved signal mask instead of the current
signal mask when setting up the stack frame for the signal handler -- or
by causing do_signal() to simply restore the saved signal mask in the
case where there is no handler to be invoked.

The first patch implements the sys_pselect() and sys_ppoll() system
calls, which are present only if TIF_RESTORE_SIGMASK is defined. That
#ifdef should go away in time when all architectures have implemented
it. The second patch implements TIF_RESTORE_SIGMASK for the PowerPC
kernel (in the -mm tree), and the third patch then removes the
arch-specific implementations of sys_rt_sigsuspend() and replaces them
with generic versions using the same trick.

The fourth and fifth patches, provided by David Howells, implement
TIF_RESTORE_SIGMASK for FR-V and i386 respectively, and the sixth patch
adds the syscalls to the i386 syscall table.

This patch:

Add the pselect() and ppoll() system calls, providing core routines usable by
the original select() and poll() system calls and also the new calls (with
their semantics w.r.t timeouts).

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
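
The signal mask behaviour described in the commit message above can be observed from userspace. This is a minimal sketch assuming a POSIX system with pselect(); pselect_mask_demo() and on_usr1() are hypothetical names, and the helper leaves SIGUSR1 blocked with its handler installed when it returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_usr1;

static void on_usr1(int sig)
{
	(void)sig;
	got_usr1 = 1;
}

/*
 * Returns 1 when the handler ran under the temporary mask, pselect()
 * failed with EINTR, and SIGUSR1 is blocked again afterwards.
 * Returns 0 when one of those expectations fails, -1 on setup errors.
 */
int pselect_mask_demo(void)
{
	struct sigaction sa;
	sigset_t block, during, after;
	struct timespec timeout = { 1, 0 };
	int rc, saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) < 0)
		return -1;

	/* Block SIGUSR1; 'during' becomes the old mask without SIGUSR1. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &during) < 0)
		return -1;
	sigdelset(&during, SIGUSR1);

	got_usr1 = 0;
	if (raise(SIGUSR1) != 0)	/* stays pending while blocked */
		return -1;
	if (got_usr1)
		return 0;

	/* The mask is swapped only for the duration of the call. */
	rc = pselect(0, NULL, NULL, NULL, &timeout, &during);
	saved_errno = errno;

	if (sigprocmask(SIG_SETMASK, NULL, &after) < 0)
		return -1;

	return rc == -1 && saved_errno == EINTR && got_usr1 &&
	       sigismember(&after, SIGUSR1) == 1;
}
```

The handler runs with the temporary mask in place, and the caller still sees its original mask once pselect() has returned, which is the restore-after-handler ordering the message attributes to TIF_RESTORE_SIGMASK.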
2006-01-18 17:44:05 -08:00
int compat_core_sys_select(int n, compat_ulong_t __user *inp,
2008-08-31 08:26:40 -07:00
	compat_ulong_t __user *outp, compat_ulong_t __user *exp,
	struct timespec *end_time)
2005-04-16 15:20:36 -07:00
{
	fd_set_bits fds;
2007-05-23 13:57:45 -07:00
	void *bits;
2006-12-10 02:21:12 -08:00
	int size, max_fds, ret = -EINVAL;
2005-09-09 15:10:52 -07:00
	struct fdtable *fdt;
2007-05-23 13:57:45 -07:00
	long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];
2005-04-16 15:20:36 -07:00

	if (n < 0)
		goto out_nofds;

2006-12-10 02:21:12 -08:00
	/* max_fds can increase, so grab it once to avoid race */
2005-09-09 15:42:34 -07:00
	rcu_read_lock();
2005-09-09 15:10:52 -07:00
	fdt = files_fdtable(current->files);
2006-12-10 02:21:12 -08:00
	max_fds = fdt->max_fds;
2005-09-09 15:42:34 -07:00
	rcu_read_unlock();
2006-12-10 02:21:12 -08:00
	if (n > max_fds)
		n = max_fds;
2005-04-16 15:20:36 -07:00

	/*
	 * We need 6 bitmaps (in/out/ex for both incoming and outgoing),
	 * since we used fdset we need to allocate memory in units of
	 * long-words.
	 */
	size = FDS_BYTES(n);
2007-05-23 13:57:45 -07:00
	bits = stack_fds;
	if (size > sizeof(stack_fds) / 6) {
		bits = kmalloc(6 * size, GFP_KERNEL);
		ret = -ENOMEM;
		if (!bits)
			goto out_nofds;
	}
2005-04-16 15:20:36 -07:00
	fds.in = (unsigned long *) bits;
	fds.out = (unsigned long *) (bits + size);
	fds.ex = (unsigned long *) (bits + 2*size);
	fds.res_in = (unsigned long *) (bits + 3*size);
	fds.res_out = (unsigned long *) (bits + 4*size);
	fds.res_ex = (unsigned long *) (bits + 5*size);

	if ((ret = compat_get_fd_set(n, inp, fds.in)) ||
	    (ret = compat_get_fd_set(n, outp, fds.out)) ||
	    (ret = compat_get_fd_set(n, exp, fds.ex)))
		goto out;
	zero_fd_set(n, fds.res_in);
	zero_fd_set(n, fds.res_out);
	zero_fd_set(n, fds.res_ex);

2008-08-31 08:26:40 -07:00
	ret = do_select(n, &fds, end_time);
2005-04-16 15:20:36 -07:00

	if (ret < 0)
		goto out;
	if (!ret) {
		ret = -ERESTARTNOHAND;
		if (signal_pending(current))
			goto out;
		ret = 0;
	}

2006-12-06 20:36:36 -08:00
	if (compat_set_fd_set(n, inp, fds.res_in) ||
	    compat_set_fd_set(n, outp, fds.res_out) ||
	    compat_set_fd_set(n, exp, fds.res_ex))
		ret = -EFAULT;
2005-04-16 15:20:36 -07:00
out:
2007-05-23 13:57:45 -07:00
	if (bits != stack_fds)
		kfree(bits);
2005-04-16 15:20:36 -07:00
out_nofds:
	return ret;
}

2006-01-18 17:44:05 -08:00
asmlinkage long compat_sys_select(int n, compat_ulong_t __user *inp,
	compat_ulong_t __user *outp, compat_ulong_t __user *exp,
	struct compat_timeval __user *tvp)
{
2008-08-31 08:26:40 -07:00
	struct timespec end_time, *to = NULL;
2006-01-18 17:44:05 -08:00
	struct compat_timeval tv;
	int ret;

	if (tvp) {
		if (copy_from_user(&tv, tvp, sizeof(tv)))
			return -EFAULT;

2008-08-31 08:26:40 -07:00
		to = &end_time;
2008-10-25 12:41:41 -07:00
		if (poll_select_set_timeout(to,
				tv.tv_sec + (tv.tv_usec / USEC_PER_SEC),
				(tv.tv_usec % USEC_PER_SEC) * NSEC_PER_USEC))
2006-01-18 17:44:05 -08:00
			return -EINVAL;
	}

2008-08-31 08:26:40 -07:00
	ret = compat_core_sys_select(n, inp, outp, exp, to);
	ret = poll_select_copy_remaining(&end_time, tvp, 1, ret);
2006-01-18 17:44:05 -08:00

	return ret;
}
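
The timeout arithmetic passed to poll_select_set_timeout() in compat_sys_select() above folds an oversized tv_usec into whole seconds before converting the remainder to nanoseconds. A small userspace sketch of that arithmetic follows, assuming non-negative inputs; struct norm_timeout and normalize_timeval() are hypothetical names, and the two constants are redefined locally with the same values the kernel macros have.

```c
#include <stdint.h>

#define USEC_PER_SEC	1000000L
#define NSEC_PER_USEC	1000L

/* Hypothetical result type: whole seconds plus a sub-second remainder. */
struct norm_timeout {
	int64_t sec;
	long nsec;
};

/* Fold whole seconds out of tv_usec, then scale the rest to nanoseconds. */
struct norm_timeout normalize_timeval(int32_t tv_sec, int32_t tv_usec)
{
	struct norm_timeout t;

	t.sec = (int64_t)tv_sec + tv_usec / USEC_PER_SEC;
	t.nsec = (tv_usec % USEC_PER_SEC) * NSEC_PER_USEC;
	return t;
}
```

With tv_sec = 1 and tv_usec = 2500000, two whole seconds move out of the microsecond field and the remaining 500000 microseconds are scaled by 1000.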

2010-03-10 15:21:13 -08:00
struct compat_sel_arg_struct {
	compat_ulong_t n;
	compat_uptr_t inp;
	compat_uptr_t outp;
	compat_uptr_t exp;
	compat_uptr_t tvp;
};

asmlinkage long compat_sys_old_select(struct compat_sel_arg_struct __user *arg)
{
	struct compat_sel_arg_struct a;

	if (copy_from_user(&a, arg, sizeof(a)))
		return -EFAULT;
	return compat_sys_select(a.n, compat_ptr(a.inp), compat_ptr(a.outp),
				 compat_ptr(a.exp), compat_ptr(a.tvp));
}

2008-04-30 00:53:09 -07:00
#ifdef HAVE_SET_RESTORE_SIGMASK
2009-01-14 14:13:57 +01:00
static long do_compat_pselect(int n, compat_ulong_t __user *inp,
2006-01-18 17:44:05 -08:00
	compat_ulong_t __user *outp, compat_ulong_t __user *exp,
	struct compat_timespec __user *tsp, compat_sigset_t __user *sigmask,
	compat_size_t sigsetsize)
{
	compat_sigset_t ss32;
	sigset_t ksigmask, sigsaved;
	struct compat_timespec ts;
2008-08-31 08:26:40 -07:00
	struct timespec end_time, *to = NULL;
2006-01-18 17:44:05 -08:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (tsp) {
|
|
|
|
if (copy_from_user(&ts, tsp, sizeof(ts)))
|
|
|
|
return -EFAULT;
|
|
|
|
|
2008-08-31 08:26:40 -07:00
|
|
|
to = &end_time;
|
|
|
|
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
|
2006-01-18 17:44:05 -08:00
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (sigmask) {
|
|
|
|
if (sigsetsize != sizeof(compat_sigset_t))
|
|
|
|
return -EINVAL;
|
|
|
|
if (copy_from_user(&ss32, sigmask, sizeof(ss32)))
|
|
|
|
return -EFAULT;
|
|
|
|
sigset_from_compat(&ksigmask, &ss32);
|
|
|
|
|
|
|
|
sigdelsetmask(&ksigmask, sigmask(SIGKILL)|sigmask(SIGSTOP));
|
|
|
|
sigprocmask(SIG_SETMASK, &ksigmask, &sigsaved);
|
|
|
|
}
|
|
|
|
|
2008-08-31 08:26:40 -07:00
|
|
|
ret = compat_core_sys_select(n, inp, outp, exp, to);
|
|
|
|
ret = poll_select_copy_remaining(&end_time, tsp, 0, ret);
|
2006-01-18 17:44:05 -08:00
|
|
|
|
|
|
|
if (ret == -ERESTARTNOHAND) {
|
|
|
|
/*
|
|
|
|
* Don't restore the signal mask yet. Let do_signal() deliver
|
|
|
|
* the signal on the way back to userspace, before the signal
|
|
|
|
* mask is restored.
|
|
|
|
*/
|
|
|
|
if (sigmask) {
|
|
|
|
memcpy(&current->saved_sigmask, &sigsaved,
|
|
|
|
sizeof(sigsaved));
|
2008-04-30 00:53:06 -07:00
|
|
|
set_restore_sigmask();
|
2006-01-18 17:44:05 -08:00
|
|
|
}
|
|
|
|
} else if (sigmask)
|
|
|
|
sigprocmask(SIG_SETMASK, &sigsaved, NULL);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
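The mask handling in the function above can be observed from userspace. The following is a minimal sketch, assuming a POSIX system with `pselect()`; the helper `pselect_mask_demo()` and the handler `on_usr1()` are hypothetical names introduced only for this illustration and are not part of the kernel source. It blocks SIGUSR1, makes it pending, and then calls `pselect()` with an empty temporary mask: the handler should run, the call should fail with EINTR, and the original mask should be back in place afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <time.h>

/* Set by the handler so the caller can tell that it ran. */
static volatile sig_atomic_t handler_ran;

static void on_usr1(int signo)
{
	(void)signo;
	handler_ran = 1;
}

/*
 * Hypothetical demo helper (illustration only).
 * Returns 1 if the handler ran during pselect(), pselect() failed with
 * EINTR, and SIGUSR1 is blocked again after the call; returns 0 otherwise.
 */
int pselect_mask_demo(void)
{
	struct sigaction sa;
	sigset_t block, empty, now;
	struct timespec timeout = { .tv_sec = 1, .tv_nsec = 0 };
	int ret, saved_errno;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) != 0)
		return 0;

	/* Block SIGUSR1, then make it pending for this thread. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, NULL) != 0)
		return 0;
	handler_ran = 0;
	raise(SIGUSR1);
	if (handler_ran)
		return 0;	/* must not be delivered while blocked */

	/* The empty mask applies only for the duration of the call. */
	sigemptyset(&empty);
	ret = pselect(0, NULL, NULL, NULL, &timeout, &empty);
	saved_errno = errno;

	/* Query the current mask without changing it. */
	if (sigprocmask(SIG_BLOCK, NULL, &now) != 0)
		return 0;

	return ret == -1 && saved_errno == EINTR && handler_ran == 1 &&
	       sigismember(&now, SIGUSR1) == 1;
}
```

A caller would treat a return value of 1 as confirmation that the signal was delivered under the temporary mask and that the saved mask was restored only afterwards.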
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_pselect6(int n, compat_ulong_t __user *inp,
|
|
|
|
compat_ulong_t __user *outp, compat_ulong_t __user *exp,
|
|
|
|
struct compat_timespec __user *tsp, void __user *sig)
|
|
|
|
{
|
|
|
|
compat_size_t sigsetsize = 0;
|
|
|
|
compat_uptr_t up = 0;
|
|
|
|
|
|
|
|
if (sig) {
|
|
|
|
if (!access_ok(VERIFY_READ, sig,
|
|
|
|
sizeof(compat_uptr_t)+sizeof(compat_size_t)) ||
|
|
|
|
__get_user(up, (compat_uptr_t __user *)sig) ||
|
|
|
|
__get_user(sigsetsize,
|
|
|
|
(compat_size_t __user *)(sig+sizeof(up))))
|
|
|
|
return -EFAULT;
|
|
|
|
}
|
2009-01-14 14:13:57 +01:00
|
|
|
return do_compat_pselect(n, inp, outp, exp, tsp, compat_ptr(up),
|
|
|
|
sigsetsize);
|
2006-01-18 17:44:05 -08:00
|
|
|
}
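The two `__get_user()` reads above imply that the sixth argument points at a pointer-sized value followed directly by a size. A minimal userspace sketch of that layout follows, assuming 32-bit compat types; `demo_uptr_t`, `demo_size_t`, `struct demo_sig_arg` and `demo_unpack_sig_arg()` are hypothetical stand-ins, not kernel definitions.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for compat_uptr_t and compat_size_t (assumed to be 32-bit). */
typedef uint32_t demo_uptr_t;
typedef uint32_t demo_size_t;

/* Assumed layout of the block that the sixth argument points to. */
struct demo_sig_arg {
	demo_uptr_t sigmask_ptr;	/* 32-bit user pointer to the signal set */
	demo_size_t sigsetsize;		/* size of that signal set in bytes */
};

/*
 * Mirror the two reads: first the pointer at offset 0, then the size at
 * offset sizeof(pointer). memcpy() avoids any alignment assumptions.
 */
void demo_unpack_sig_arg(const void *sig, demo_uptr_t *up,
			 demo_size_t *sigsetsize)
{
	const unsigned char *p = sig;

	memcpy(up, p, sizeof(*up));
	memcpy(sigsetsize, p + sizeof(*up), sizeof(*sigsetsize));
}
```

Note that the kernel code computes the second address as `sig+sizeof(up)` on a `void __user *`, which relies on the GNU C extension that treats `void *` arithmetic as byte arithmetic; the sketch uses an `unsigned char *` instead.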
|
|
|
|
|
|
|
|
asmlinkage long compat_sys_ppoll(struct pollfd __user *ufds,
|
|
|
|
unsigned int nfds, struct compat_timespec __user *tsp,
|
|
|
|
const compat_sigset_t __user *sigmask, compat_size_t sigsetsize)
|
|
|
|
{
|
|
|
|
compat_sigset_t ss32;
|
|
|
|
sigset_t ksigmask, sigsaved;
|
|
|
|
struct compat_timespec ts;
|
2008-08-31 08:26:40 -07:00
|
|
|
struct timespec end_time, *to = NULL;
|
2006-01-18 17:44:05 -08:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
if (tsp) {
|
|
|
|
if (copy_from_user(&ts, tsp, sizeof(ts)))
|
|
|
|
return -EFAULT;
|
|
|
|
|
2008-08-31 08:26:40 -07:00
|
|
|
to = &end_time;
|
|
|
|
if (poll_select_set_timeout(to, ts.tv_sec, ts.tv_nsec))
|
|
|
|
return -EINVAL;
|
2006-01-18 17:44:05 -08:00
|
|
|
}
|
|
|
|
|
|
|
|
if (sigmask) {
|
2006-05-15 09:44:27 -07:00
|
|
|
if (sigsetsize != sizeof(compat_sigset_t))
|
2006-01-18 17:44:05 -08:00
|
|
|
return -EINVAL;
|
|
|
|
if (copy_from_user(&ss32, sigmask, sizeof(ss32)))
|
|
|
|
return -EFAULT;
|
|
|
|
sigset_from_compat(&ksigmask, &ss32);
|
|
|
|
|
|
|
|
sigdelsetmask(&ksigmask, sigmask(SIGKILL)|sigmask(SIGSTOP));
|
|
|
|
sigprocmask(SIG_SETMASK, &ksigmask, &sigsaved);
|
|
|
|
}
|
|
|
|
|
2008-08-31 08:26:40 -07:00
|
|
|
ret = do_sys_poll(ufds, nfds, to);
|
2006-01-18 17:44:05 -08:00
|
|
|
|
|
|
|
/* We can restart this syscall, usually */
|
|
|
|
if (ret == -EINTR) {
|
|
|
|
/*
|
|
|
|
* Don't restore the signal mask yet. Let do_signal() deliver
|
|
|
|
* the signal on the way back to userspace, before the signal
|
|
|
|
* mask is restored.
|
|
|
|
*/
|
|
|
|
if (sigmask) {
|
|
|
|
memcpy(&current->saved_sigmask, &sigsaved,
|
|
|
|
sizeof(sigsaved));
|
2008-04-30 00:53:06 -07:00
|
|
|
set_restore_sigmask();
|
2006-01-18 17:44:05 -08:00
|
|
|
}
|
|
|
|
ret = -ERESTARTNOHAND;
|
|
|
|
} else if (sigmask)
|
|
|
|
sigprocmask(SIG_SETMASK, &sigsaved, NULL);
|
|
|
|
|
2008-08-31 08:26:40 -07:00
|
|
|
ret = poll_select_copy_remaining(&end_time, tsp, 0, ret);
|
2006-01-18 17:44:05 -08:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
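Both callers above pass the user-supplied seconds and nanoseconds to `poll_select_set_timeout()` and map a non-zero result to -EINVAL, after which `to` is used as an end time. The body of that helper is not shown in this file, so the following is only a sketch of the behaviour the call sites suggest: validate a relative timeout and add it to a "now" value. All names (`demo_timespec`, `demo_set_end_time()`, `DEMO_NSEC_PER_SEC`) are hypothetical, and the saturation on overflow is an assumption.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000L

struct demo_timespec {
	int64_t tv_sec;
	long tv_nsec;
};

/*
 * Hypothetical helper: turn a relative timeout (sec, nsec) into an absolute
 * end time based on *now. Returns 0 on success and -1 if the relative
 * timeout is not a valid timespec. Assumes now->tv_sec >= 0 and
 * 0 <= now->tv_nsec < DEMO_NSEC_PER_SEC.
 */
int demo_set_end_time(struct demo_timespec *end,
		      const struct demo_timespec *now,
		      int64_t sec, long nsec)
{
	long sum_nsec;
	int carry = 0;

	if (sec < 0 || nsec < 0 || nsec >= DEMO_NSEC_PER_SEC)
		return -1;

	/* Both operands are below one second, so the sum fits in a long. */
	sum_nsec = now->tv_nsec + nsec;
	if (sum_nsec >= DEMO_NSEC_PER_SEC) {
		sum_nsec -= DEMO_NSEC_PER_SEC;
		carry = 1;
	}

	/* Saturate instead of overflowing the seconds field. */
	if (sec > INT64_MAX - now->tv_sec - carry) {
		end->tv_sec = INT64_MAX;
		end->tv_nsec = 0;
		return 0;
	}

	end->tv_sec = now->tv_sec + sec + carry;
	end->tv_nsec = sum_nsec;
	return 0;
}
```

Computing the end time once, before the wait begins, means that a restarted wait can compare against the same deadline rather than starting a fresh relative timeout.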
|
2008-04-30 00:53:09 -07:00
|
|
|
#endif /* HAVE_SET_RESTORE_SIGMASK */
|
2006-01-18 17:44:05 -08:00
|
|
|
|
2007-03-07 20:41:21 -08:00
|
|
|
#ifdef CONFIG_EPOLL
|
|
|
|
|
2008-04-30 00:53:09 -07:00
|
|
|
#ifdef HAVE_SET_RESTORE_SIGMASK
|
2007-03-07 20:41:21 -08:00
|
|
|
asmlinkage long compat_sys_epoll_pwait(int epfd,
|
|
|
|
struct compat_epoll_event __user *events,
|
|
|
|
int maxevents, int timeout,
|
|
|
|
const compat_sigset_t __user *sigmask,
|
|
|
|
compat_size_t sigsetsize)
|
|
|
|
{
|
|
|
|
long err;
|
|
|
|
compat_sigset_t csigmask;
|
|
|
|
sigset_t ksigmask, sigsaved;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If the caller wants a certain signal mask to be set during the wait,
|
|
|
|
* we apply it here.
|
|
|
|
*/
|
|
|
|
if (sigmask) {
|
|
|
|
if (sigsetsize != sizeof(compat_sigset_t))
|
|
|
|
return -EINVAL;
|
|
|
|
if (copy_from_user(&csigmask, sigmask, sizeof(csigmask)))
|
|
|
|
return -EFAULT;
|
|
|
|
sigset_from_compat(&ksigmask, &csigmask);
|
|
|
|
sigdelsetmask(&ksigmask, sigmask(SIGKILL) | sigmask(SIGSTOP));
|
|
|
|
sigprocmask(SIG_SETMASK, &ksigmask, &sigsaved);
|
|
|
|
}
|
|
|
|
|
|
|
|
err = sys_epoll_wait(epfd, events, maxevents, timeout);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If we changed the signal mask, we need to restore the original one.
|
|
|
|
* In case we've got a signal while waiting, we do not restore the
|
|
|
|
* signal mask yet, and we allow do_signal() to deliver the signal on
|
|
|
|
* the way back to userspace, before the signal mask is restored.
|
|
|
|
*/
|
|
|
|
if (sigmask) {
|
|
|
|
if (err == -EINTR) {
|
|
|
|
memcpy(&current->saved_sigmask, &sigsaved,
|
|
|
|
sizeof(sigsaved));
|
2008-04-30 00:53:06 -07:00
|
|
|
set_restore_sigmask();
|
2007-03-07 20:41:21 -08:00
|
|
|
} else
|
|
|
|
sigprocmask(SIG_SETMASK, &sigsaved, NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
return err;
|
|
|
|
}
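Each of the wrappers above strips SIGKILL and SIGSTOP from the caller's mask with `sigdelsetmask()` before installing it, since those two signals cannot be blocked. A userspace analogue of that step might look like the sketch below; `demo_sanitize_mask()` is a hypothetical name, and the sketch uses the POSIX `sigdelset()` call rather than the kernel-internal helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/*
 * Hypothetical helper: remove the two signals that can never be blocked
 * from a mask before it is installed as a temporary signal mask.
 */
void demo_sanitize_mask(sigset_t *mask)
{
	sigdelset(mask, SIGKILL);
	sigdelset(mask, SIGSTOP);
}
```

All other signals in the mask are left untouched, so a caller that passes a full mask still ends up blocking everything that can be blocked.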
|
2008-04-30 00:53:09 -07:00
|
|
|
#endif /* HAVE_SET_RESTORE_SIGMASK */
|
2007-03-07 20:41:21 -08:00
|
|
|
|
|
|
|
#endif /* CONFIG_EPOLL */
|
2007-05-10 22:23:15 -07:00
|
|
|
|
|
|
|
#ifdef CONFIG_SIGNALFD
|
|
|
|
|
flag parameters: signalfd
This patch adds the new signalfd4 syscall. It extends the old signalfd
syscall by one parameter, which is meant to hold a flag value. In this
patch the only flag supported is SFD_CLOEXEC, which causes the close-on-exec
flag for the returned file descriptor to be set.
A new name SFD_CLOEXEC is introduced, which in this implementation must
have the same value as O_CLOEXEC.
The following test must be adjusted for architectures other than x86 and
x86-64, and if the syscall numbers have changed.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

#ifndef __NR_signalfd4
# ifdef __x86_64__
#  define __NR_signalfd4 289
# elif defined __i386__
#  define __NR_signalfd4 327
# else
#  error "need __NR_signalfd4"
# endif
#endif

#define SFD_CLOEXEC O_CLOEXEC

int
main (void)
{
  sigset_t ss;
  sigemptyset (&ss);
  sigaddset (&ss, SIGUSR1);
  int fd = syscall (__NR_signalfd4, -1, &ss, 8, 0);
  if (fd == -1)
    {
      puts ("signalfd4(0) failed");
      return 1;
    }
  int coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if (coe & FD_CLOEXEC)
    {
      puts ("signalfd4(0) set close-on-exec flag");
      return 1;
    }
  close (fd);

  fd = syscall (__NR_signalfd4, -1, &ss, 8, SFD_CLOEXEC);
  if (fd == -1)
    {
      puts ("signalfd4(SFD_CLOEXEC) failed");
      return 1;
    }
  coe = fcntl (fd, F_GETFD);
  if (coe == -1)
    {
      puts ("fcntl failed");
      return 1;
    }
  if ((coe & FD_CLOEXEC) == 0)
    {
      puts ("signalfd4(SFD_CLOEXEC) does not set close-on-exec flag");
      return 1;
    }
  close (fd);

  puts ("OK");
  return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[akpm@linux-foundation.org: add sys_ni stub]
Signed-off-by: Ulrich Drepper <drepper@redhat.com>
Acked-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-23 21:29:24 -07:00
asmlinkage long compat_sys_signalfd4(int ufd,
				     const compat_sigset_t __user *sigmask,
				     compat_size_t sigsetsize, int flags)
{
	compat_sigset_t ss32;
	sigset_t tmp;
	sigset_t __user *ksigmask;

	if (sigsetsize != sizeof(compat_sigset_t))
		return -EINVAL;
	if (copy_from_user(&ss32, sigmask, sizeof(ss32)))
		return -EFAULT;
	sigset_from_compat(&tmp, &ss32);
	ksigmask = compat_alloc_user_space(sizeof(sigset_t));
	if (copy_to_user(ksigmask, &tmp, sizeof(sigset_t)))
		return -EFAULT;

	return sys_signalfd4(ufd, ksigmask, sizeof(sigset_t), flags);
}

asmlinkage long compat_sys_signalfd(int ufd,
				    const compat_sigset_t __user *sigmask,
				    compat_size_t sigsetsize)
{
	return compat_sys_signalfd4(ufd, sigmask, sigsetsize, 0);
}
#endif /* CONFIG_SIGNALFD */

#ifdef CONFIG_TIMERFD

timerfd: new timerfd API
This is the new timerfd API as it is implemented by the following patch:
int timerfd_create(int clockid, int flags);
int timerfd_settime(int ufd, int flags,
const struct itimerspec *utmr,
struct itimerspec *otmr);
int timerfd_gettime(int ufd, struct itimerspec *otmr);
The timerfd_create() API creates an un-programmed timerfd fd. The "clockid"
parameter can be either CLOCK_MONOTONIC or CLOCK_REALTIME.
The timerfd_settime() API applies new settings to the timerfd fd, optionally
retrieving the previous expiration time (if the "otmr" parameter is not
NULL).
The time value specified in "utmr" is absolute, if the TFD_TIMER_ABSTIME bit
is set in the "flags" parameter. Otherwise it's a relative time.
The timerfd_gettime() API returns the next expiration time of the timer, or
{0, 0} if the timerfd has not been set yet.
Like the previous timerfd API implementation, read(2) and poll(2) are
supported (with the same interface). Here's a simple test program I used to
exercise the new timerfd APIs:
http://www.xmailserver.org/timerfd-test2.c
[akpm@linux-foundation.org: coding-style cleanups]
[akpm@linux-foundation.org: fix ia64 build]
[akpm@linux-foundation.org: fix m68k build]
[akpm@linux-foundation.org: fix mips build]
[akpm@linux-foundation.org: fix alpha, arm, blackfin, cris, m68k, s390, sparc and sparc64 builds]
[heiko.carstens@de.ibm.com: fix s390]
[akpm@linux-foundation.org: fix powerpc build]
[akpm@linux-foundation.org: fix sparc64 more]
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-04 22:27:26 -08:00
asmlinkage long compat_sys_timerfd_settime(int ufd, int flags,
				   const struct compat_itimerspec __user *utmr,
				   struct compat_itimerspec __user *otmr)
{
	int error;
	struct itimerspec t;
	struct itimerspec __user *ut;

	if (get_compat_itimerspec(&t, utmr))
		return -EFAULT;
	ut = compat_alloc_user_space(2 * sizeof(struct itimerspec));
	if (copy_to_user(&ut[0], &t, sizeof(t)))
		return -EFAULT;
	error = sys_timerfd_settime(ufd, flags, &ut[0], &ut[1]);
	if (!error && otmr)
		error = (copy_from_user(&t, &ut[1], sizeof(struct itimerspec)) ||
			 put_compat_itimerspec(otmr, &t)) ? -EFAULT : 0;

	return error;
}

asmlinkage long compat_sys_timerfd_gettime(int ufd,
				   struct compat_itimerspec __user *otmr)
{
	int error;
	struct itimerspec t;
	struct itimerspec __user *ut;

	ut = compat_alloc_user_space(sizeof(struct itimerspec));
	error = sys_timerfd_gettime(ufd, ut);
	if (!error)
		error = (copy_from_user(&t, ut, sizeof(struct itimerspec)) ||
			 put_compat_itimerspec(otmr, &t)) ? -EFAULT : 0;

	return error;
}

#endif /* CONFIG_TIMERFD */

#ifdef CONFIG_FHANDLE
/*
 * Exactly like fs/open.c:sys_open_by_handle_at(), except that it
 * doesn't set the O_LARGEFILE flag.
 */
asmlinkage long
compat_sys_open_by_handle_at(int mountdirfd,
			     struct file_handle __user *handle, int flags)
{
	return do_handle_open(mountdirfd, handle, flags);
}
#endif