Andrew Morton ebcf28e1c7 [PATCH] fadvise(): write commands
Add two new linux-specific fadvise extensions():

LINUX_FADV_ASYNC_WRITE: start async writeout of any dirty pages between file
offsets `offset' and `offset+len'.  Any pages which are currently under
writeout are skipped, whether or not they are dirty.

LINUX_FADV_WRITE_WAIT: wait upon writeout of any dirty pages between file
offsets `offset' and `offset+len'.

By combining these two operations the application may do several things:

LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently dirty
pages at the disk.

LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push all
of the currently dirty pages at the disk, wait until they have been written.

It should be noted that none of these operations write out the file's
metadata.  So unless the application is strictly performing overwrites of
already-instantiated disk blocks, there are no guarantees here that the data
will be available after a crash.

To complete this suite of operations I guess we should have a "sync file
metadata only" operation.  This gives applications access to all the building
blocks needed for all sorts of sync operations.  But sync-metadata doesn't fit
well with the fadvise() interface.  Probably it should be a new syscall:
sys_fmetadatasync().

The patch also diddles with the meaning of `endbyte' in sys_fadvise64_64().
It is made to represent that last affected byte in the file (ie: it is
inclusive).  Generally, all these byterange and pagerange functions are
inclusive so we can easily represent EOF with -1.

As Ulrich notes, these two functions are somewhat abusive of the fadvise()
concept, which appears to be "set the future policy for this fd".

But these commands are a perfect fit with the fadvise() impementation, and
several of the existing fadvise() commands are synchronous and don't affect
future policy either.   I think we can live with the slight incongruity.

Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-24 07:33:25 -08:00

28 lines
958 B
C

#ifndef FADVISE_H_INCLUDED
#define FADVISE_H_INCLUDED
#define POSIX_FADV_NORMAL 0 /* No further special treatment. */
#define POSIX_FADV_RANDOM 1 /* Expect random page references. */
#define POSIX_FADV_SEQUENTIAL 2 /* Expect sequential page references. */
#define POSIX_FADV_WILLNEED 3 /* Will need these pages. */
/*
* The advise values for POSIX_FADV_DONTNEED and POSIX_ADV_NOREUSE
* for s390-64 differ from the values for the rest of the world.
*/
#if defined(__s390x__)
#define POSIX_FADV_DONTNEED 6 /* Don't need these pages. */
#define POSIX_FADV_NOREUSE 7 /* Data will be accessed once. */
#else
#define POSIX_FADV_DONTNEED 4 /* Don't need these pages. */
#define POSIX_FADV_NOREUSE 5 /* Data will be accessed once. */
#endif
/*
* Linux-specific fadvise() extensions:
*/
#define LINUX_FADV_ASYNC_WRITE 32 /* Start writeout on range */
#define LINUX_FADV_WRITE_WAIT 33 /* Wait upon writeout to range */
#endif /* FADVISE_H_INCLUDED */