                          ==========================
                          General Filesystem Caching
                          ==========================

========
OVERVIEW
========

This facility is a general purpose cache for network filesystems, though it
could be used for caching other things such as ISO9660 filesystems too.

FS-Cache mediates between cache backends (such as CacheFS) and network
filesystems:

        +---------+
        |         |                        +--------------+
        |   NFS   |--+                     |              |
        |         |  |                 +-->|   CacheFS    |
        +---------+  |   +----------+  |   |  /dev/hda5   |
                     |   |          |  |   +--------------+
        +---------+  +-->|          |  |
        |         |      |          |--+
        |   AFS   |----->| FS-Cache |
        |         |      |          |--+
        +---------+  +-->|          |  |
                     |   |          |  |   +--------------+
        +---------+  |   +----------+  |   |              |
        |         |  |                 +-->|  CacheFiles  |
        |  ISOFS  |--+                     |  /var/cache  |
        |         |                        +--------------+
        +---------+

Or to look at it another way, FS-Cache is a module that provides a caching
facility to a network filesystem such that the cache is transparent to the
user:

        +---------+
        |         |
        | Server  |
        |         |
        +---------+
             |                  NETWORK
        ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
             |
             |           +----------+
             V           |          |
        +---------+      |          |
        |         |      |          |
        |   NFS   |----->| FS-Cache |
        |         |      |          |--+
        +---------+      |          |  |   +--------------+   +--------------+
             |           |          |  |   |              |   |              |
             V           +----------+  +-->|  CacheFiles  |-->|  Ext3        |
        +---------+                        |  /var/cache  |   |  /dev/sda6   |
        |         |                        +--------------+   +--------------+
        |   VFS   |                                ^                     ^
        |         |                                |                     |
        +---------+                                +--------------+      |
             |                  KERNEL SPACE                      |      |
        ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
             |                  USER SPACE                        |      |
             V                                                    |      |
        +---------+                                           +--------------+
        |         |                                           |              |
        | Process |                                           | cachefilesd  |
        |         |                                           |              |
        +---------+                                           +--------------+


FS-Cache does not follow the idea of completely loading every netfs file
opened in its entirety into a cache before permitting it to be accessed and
then serving the pages out of that cache rather than the netfs inode because:

 (1) It must be practical to operate without a cache.

 (2) The size of any accessible file must not be limited to the size of the
     cache.

 (3) The combined size of all opened files (this includes mapped libraries)
     must not be limited to the size of the cache.

 (4) The user should not be forced to download an entire file just to do a
     one-off access of a small portion of it (such as might be done with the
     "file" program).

It instead serves the cache out in PAGE_SIZE chunks as and when requested by
the netfs('s) using it.
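
As a rough illustration of what page-granular caching means for a small read,
the following Python sketch (a userspace model, not kernel code) works out
which pages a byte range touches. The PAGE_SIZE value of 4096 is an assumption
(the real value depends on the architecture), and pages_for_range is a
hypothetical helper, not an FS-Cache function.

```python
PAGE_SIZE = 4096  # assumption: a common page size; the real value varies


def pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Return the page indices touched by reading `length` bytes at `offset`."""
    if length <= 0:
        return []
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return list(range(first, last + 1))
```

A short read at the start of a large file, such as the "file" program might
do, touches only page 0 under this model, so only that page would need to be
fetched and cached.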


FS-Cache provides the following facilities:

 (1) More than one cache can be used at once. Caches can be selected
     explicitly by use of tags.

 (2) Caches can be added / removed at any time.

 (3) The netfs is provided with an interface that allows either party to
     withdraw caching facilities from a file (required for (2)).

 (4) The interface to the netfs returns as few errors as possible, preferring
     rather to let the netfs remain oblivious.

 (5) Cookies are used to represent indices, files and other objects to the
     netfs. The simplest cookie is just a NULL pointer - indicating nothing
     cached there.

 (6) The netfs is allowed to propose - dynamically - any index hierarchy it
     desires, though it must be aware that the index search function is
     recursive, stack space is limited, and indices can only be children of
     indices.

 (7) Data I/O is done direct to and from the netfs's pages. The netfs
     indicates that page A is at index B of the data-file represented by cookie
     C, and that it should be read or written. The cache backend may or may
     not start I/O on that page, but if it does, a netfs callback will be
     invoked to indicate completion. The I/O may be either synchronous or
     asynchronous.

 (8) Cookies can be "retired" upon release. At this point FS-Cache will mark
     them as obsolete and the index hierarchy rooted at that point will get
     recycled.

 (9) The netfs provides a "match" function for index searches. In addition to
     saying whether a match was made or not, this can also specify that an
     entry should be updated or deleted.

(10) As much as possible is done asynchronously.
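
To illustrate points (4) and (5), the following Python sketch models how a
netfs might stay oblivious to whether caching is available: a cookie of None
stands in for the NULL cookie, and every cache operation quietly does nothing
in that case. This is a simplified userspace model only; Cookie, store_page
and read_page are hypothetical names, and the real interface works on the
netfs's pages with completion callbacks rather than returning data.

```python
class Cookie:
    """Hypothetical stand-in for a cache cookie: cached pages keyed by index."""

    def __init__(self):
        self.pages = {}


def store_page(cookie, index, data):
    """Store a page if caching is available; silently do nothing otherwise."""
    if cookie is None:  # NULL cookie: nothing is cached here
        return
    cookie.pages[index] = data


def read_page(cookie, index, fetch_from_server):
    """Return page data, preferring the cache and falling back to the server."""
    if cookie is not None and index in cookie.pages:
        return cookie.pages[index]
    data = fetch_from_server(index)
    store_page(cookie, index, data)
    return data
```

The caller uses the same code path whether or not a cache is present; with a
None cookie every read simply goes to the server.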


FS-Cache maintains a virtual indexing tree in which all indices, files, objects
and pages are kept. Bits of this tree may actually reside in one or more
caches.

                                           FSDEF
                                             |
                        +------------------------------------+
                        |                                    |
                       NFS                                  AFS
                        |                                    |
           +--------------------------+                +-----------+
           |                          |                |           |
        homedir                     mirror          afs.org   redhat.com
           |                          |                            |
     +------------+           +---------------+              +----------+
     |            |           |               |              |          |
   00001        00002       00007           00125        vol00001   vol00002
     |            |           |               |                         |
 +---+---+     +-----+      +---+      +------+------+            +-----+----+
 |   |   |     |     |      |   |      |      |      |            |     |    |
PG0 PG1 PG2   PG0  XATTR   PG0 PG1   DIRENT DIRENT DIRENT        R/W   R/O  Bak
                     |                                            |
                    PG0                                       +-------+
                                                              |       |
                                                            00001   00003
                                                              |
                                                          +---+---+
                                                          |   |   |
                                                         PG0 PG1 PG2

In the example above, you can see two netfs's being backed: NFS and AFS. These
have different index hierarchies:

 (*) The NFS primary index contains per-server indices. Each server index is
     indexed by NFS file handles to get data file objects. Each data file
     object can have an array of pages, but may also have further child
     objects, such as extended attributes and directory entries. Extended
     attribute objects themselves have page-array contents.

 (*) The AFS primary index contains per-cell indices. Each cell index contains
     per-logical-volume indices. Each volume index contains up to three
     indices for the read-write, read-only and backup mirrors of that volume.
     Each of these contains vnode data file objects, each of which contains an
     array of pages.

The very top index is the FS-Cache master index in which individual netfs's
have entries.

Any index object may reside in more than one cache, provided it only has index
children. Any index with non-index object children will be assumed to only
reside in one cache.
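
The rule about which indices may span caches can be expressed as a small check
over the tree. The following Python sketch is a userspace model only; Node and
may_span_caches are hypothetical names introduced for illustration and do not
correspond to FS-Cache data structures.

```python
class Node:
    """Hypothetical model of one object in the virtual index tree."""

    def __init__(self, name, is_index, children=()):
        self.name = name
        self.is_index = is_index
        self.children = list(children)


def may_span_caches(node):
    """An index may reside in several caches only if all children are indices."""
    return node.is_index and all(child.is_index for child in node.children)


# Part of the NFS branch of the example tree: data files hang off server indices.
file_1 = Node("00001", is_index=False)
file_2 = Node("00002", is_index=False)
homedir = Node("homedir", is_index=True, children=[file_1, file_2])
mirror = Node("mirror", is_index=True, children=[Node("00007", is_index=False)])
nfs = Node("NFS", is_index=True, children=[homedir, mirror])
```

Under this model the NFS primary index may span caches because its children
are all indices, whereas the homedir server index may not, since its children
are data file objects.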


The netfs API to FS-Cache can be found in:

        Documentation/filesystems/caching/netfs-api.txt

The cache backend API to FS-Cache can be found in:

        Documentation/filesystems/caching/backend-api.txt


=======================
STATISTICAL INFORMATION
=======================

If FS-Cache is compiled with the following options enabled:

        CONFIG_FSCACHE_PROC=y (implied by the following two)
        CONFIG_FSCACHE_STATS=y
        CONFIG_FSCACHE_HISTOGRAM=y

then it will gather certain statistics and display them through a number of
proc files.

 (*) /proc/fs/fscache/stats

     This shows counts of a number of events that can happen in FS-Cache:

        CLASS   EVENT   MEANING
        ======= ======= =======================================================
        Cookies idx=N   Number of index cookies allocated
                dat=N   Number of data storage cookies allocated
                spc=N   Number of special cookies allocated
        Objects alc=N   Number of objects allocated
                nal=N   Number of object allocation failures
                avl=N   Number of objects that reached the available state
                ded=N   Number of objects that reached the dead state
        ChkAux  non=N   Number of objects that didn't have a coherency check
                ok=N    Number of objects that passed a coherency check
                upd=N   Number of objects that needed a coherency data update
                obs=N   Number of objects that were declared obsolete
        Pages   mrk=N   Number of pages marked as being cached
                unc=N   Number of uncache page requests seen
        Acquire n=N     Number of acquire cookie requests seen
                nul=N   Number of acq reqs given a NULL parent
                noc=N   Number of acq reqs rejected due to no cache available
                ok=N    Number of acq reqs succeeded
                nbf=N   Number of acq reqs rejected due to error
                oom=N   Number of acq reqs failed on ENOMEM
        Lookups n=N     Number of lookup calls made on cache backends
                neg=N   Number of negative lookups made
                pos=N   Number of positive lookups made
                crt=N   Number of objects created by lookup
        Updates n=N     Number of update cookie requests seen
                nul=N   Number of upd reqs given a NULL parent
                run=N   Number of upd reqs granted CPU time
        Relinqs n=N     Number of relinquish cookie requests seen
                nul=N   Number of rlq reqs given a NULL parent
                wcr=N   Number of rlq reqs waited on completion of creation
        AttrChg n=N     Number of attribute changed requests seen
                ok=N    Number of attr changed requests queued
                nbf=N   Number of attr changed rejected -ENOBUFS
                oom=N   Number of attr changed failed -ENOMEM
                run=N   Number of attr changed ops given CPU time
        Allocs  n=N     Number of allocation requests seen
                ok=N    Number of successful alloc reqs
                wt=N    Number of alloc reqs that waited on lookup completion
                nbf=N   Number of alloc reqs rejected -ENOBUFS
                ops=N   Number of alloc reqs submitted
                owt=N   Number of alloc reqs waited for CPU time
        Retrvls n=N     Number of retrieval (read) requests seen
                ok=N    Number of successful retr reqs
                wt=N    Number of retr reqs that waited on lookup completion
                nod=N   Number of retr reqs returned -ENODATA
                nbf=N   Number of retr reqs rejected -ENOBUFS
                int=N   Number of retr reqs aborted -ERESTARTSYS
                oom=N   Number of retr reqs failed -ENOMEM
                ops=N   Number of retr reqs submitted
                owt=N   Number of retr reqs waited for CPU time
        Stores  n=N     Number of storage (write) requests seen
                ok=N    Number of successful store reqs
                agn=N   Number of store reqs on a page already pending storage
                nbf=N   Number of store reqs rejected -ENOBUFS
                oom=N   Number of store reqs failed -ENOMEM
                ops=N   Number of store reqs submitted
                run=N   Number of store reqs granted CPU time
        Ops     pend=N  Number of times async ops added to pending queues
                run=N   Number of times async ops given CPU time
                enq=N   Number of times async ops queued for processing
                dfr=N   Number of async ops queued for deferred release
                rel=N   Number of async ops released
                gc=N    Number of deferred-release async ops garbage collected
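
A minimal sketch of collecting these counters from userspace follows. It
assumes that each line of the file has the form "Class: key=N key=N ..." and
that a class may continue over several lines; the exact layout is an assumption
and may differ between kernel versions. parse_stats is a hypothetical helper,
not part of FS-Cache.

```python
def parse_stats(text):
    """Parse 'Class: key=N key=N' lines into {class: {key: int}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip headings and blank lines
        cls, _, rest = line.partition(":")
        counters = stats.setdefault(cls.strip(), {})
        for field in rest.split():
            key, sep, value = field.partition("=")
            if sep and value.isdigit():
                counters[key] = int(value)
    return stats
```

On a system with the statistics enabled, the text could be obtained with
open("/proc/fs/fscache/stats").read() and the result indexed by class and
event, for example stats["Retrvls"]["nod"].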


 (*) /proc/fs/fscache/histogram

        cat /proc/fs/fscache/histogram
        +HZ   +TIME OBJ INST  OP RUNS   OBJ RUNS  RETRV DLY RETRIEVLS
        ===== ===== ========= ========= ========= ========= =========

     For a variety of tasks, this shows how many times each task took a given
     amount of time to run, for durations between 0 jiffies and HZ-1 jiffies.
     The columns are as follows:

        COLUMN          TIME MEASUREMENT
        =======         =======================================================
        OBJ INST        Length of time to instantiate an object
        OP RUNS         Length of time a call to process an operation took
        OBJ RUNS        Length of time a call to process an object event took
        RETRV DLY       Time between requesting a read and lookup completing
        RETRIEVLS       Time between beginning and end of a retrieval

     Each row shows the number of events that took a particular range of times.
     Each step is 1 jiffy in size. The +HZ column indicates the particular
     jiffy range covered, and the +TIME column the equivalent number of seconds.
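
Since each row covers one jiffy, a column of counts can be turned into an
approximate mean duration in seconds by dividing by HZ. The following Python
sketch shows the arithmetic; mean_seconds is a hypothetical helper, and the HZ
value must be supplied by the caller because it depends on the kernel
configuration.

```python
def mean_seconds(counts, hz):
    """Approximate mean duration from per-jiffy counts (row i = i jiffies)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    total_jiffies = sum(jiffy * n for jiffy, n in enumerate(counts))
    return total_jiffies / (total * hz)
```

For example, two events in the 1-jiffy row and two in the 2-jiffy row give a
mean of 1.5 jiffies, which is 0.015 seconds if HZ is assumed to be 100.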


=========
DEBUGGING
=========

The FS-Cache facility can have runtime debugging enabled by adjusting the value
in:

        /sys/module/fscache/parameters/debug

This is a bitmask of debugging streams to enable:

        BIT     VALUE   STREAM                          POINT
        ======= ======= =============================== =======================
        0       1       Cache management                Function entry trace
        1       2                                       Function exit trace
        2       4                                       General
        3       8       Cookie management               Function entry trace
        4       16                                      Function exit trace
        5       32                                      General
        6       64      Page handling                   Function entry trace
        7       128                                     Function exit trace
        8       256                                     General
        9       512     Operation management            Function entry trace
        10      1024                                    Function exit trace
        11      2048                                    General

The appropriate set of values should be OR'd together and the result written to
the control file. For example:

        echo $((1|8|64|512)) >/sys/module/fscache/parameters/debug

will turn on all function entry debugging.
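
The bit layout in the table above is regular (three points for each of four
streams), so a mask can be computed rather than added up by hand. The
following Python sketch is a userspace illustration only; STREAMS, POINTS and
debug_mask are names made up for this example and are not part of FS-Cache.

```python
# Streams and points in the bit order given by the table: 3 bits per stream.
STREAMS = ("cache", "cookie", "page", "operation")
POINTS = ("entry", "exit", "general")


def debug_mask(*selected):
    """OR together the bit values for the given (stream, point) pairs."""
    mask = 0
    for stream, point in selected:
        bit = STREAMS.index(stream) * 3 + POINTS.index(point)
        mask |= 1 << bit
    return mask
```

For instance, debug_mask(("cookie", "entry"), ("cookie", "exit"),
("cookie", "general")) gives 56, which is the value to write to the control
file to enable all cookie management debugging.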