mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
synced 2025-01-10 07:00:48 +00:00
Merge branch 'master' into next
This commit is contained in:
commit
d905163c5b
10
.gitignore
vendored
10
.gitignore
vendored
@ -3,7 +3,7 @@
|
||||
# subdirectories here. Add them in the ".gitignore" file
|
||||
# in that subdirectory instead.
|
||||
#
|
||||
# NOTE! Please use 'git-ls-files -i --exclude-standard'
|
||||
# NOTE! Please use 'git ls-files -i --exclude-standard'
|
||||
# command after changing this file, to see if there are
|
||||
# any tracked files which get ignored after the change.
|
||||
#
|
||||
@ -25,6 +25,8 @@
|
||||
*.elf
|
||||
*.bin
|
||||
*.gz
|
||||
*.lzma
|
||||
*.patch
|
||||
|
||||
#
|
||||
# Top-level generic files
|
||||
@ -62,6 +64,12 @@ series
|
||||
cscope.*
|
||||
ncscope.*
|
||||
|
||||
# gnu global files
|
||||
GPATH
|
||||
GRTAGS
|
||||
GSYMS
|
||||
GTAGS
|
||||
|
||||
*.orig
|
||||
*~
|
||||
\#*#
|
||||
|
4
CREDITS
4
CREDITS
@ -1253,6 +1253,10 @@ S: 8124 Constitution Apt. 7
|
||||
S: Sterling Heights, Michigan 48313
|
||||
S: USA
|
||||
|
||||
N: Wolfgang Grandegger
|
||||
E: wg@grandegger.com
|
||||
D: Controller Area Network (device drivers)
|
||||
|
||||
N: William Greathouse
|
||||
E: wgreathouse@smva.com
|
||||
E: wgreathouse@myfavoritei.com
|
||||
|
@ -60,3 +60,62 @@ Description:
|
||||
Indicates whether the block layer should automatically
|
||||
generate checksums for write requests bound for
|
||||
devices that support receiving integrity metadata.
|
||||
|
||||
What: /sys/block/<disk>/alignment_offset
|
||||
Date: April 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
Storage devices may report a physical block size that is
|
||||
bigger than the logical block size (for instance a drive
|
||||
with 4KB physical sectors exposing 512-byte logical
|
||||
blocks to the operating system). This parameter
|
||||
indicates how many bytes the beginning of the device is
|
||||
offset from the disk's natural alignment.
|
||||
|
||||
What: /sys/block/<disk>/<partition>/alignment_offset
|
||||
Date: April 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
Storage devices may report a physical block size that is
|
||||
bigger than the logical block size (for instance a drive
|
||||
with 4KB physical sectors exposing 512-byte logical
|
||||
blocks to the operating system). This parameter
|
||||
indicates how many bytes the beginning of the partition
|
||||
is offset from the disk's natural alignment.
|
||||
|
||||
What: /sys/block/<disk>/queue/logical_block_size
|
||||
Date: May 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
This is the smallest unit the storage device can
|
||||
address. It is typically 512 bytes.
|
||||
|
||||
What: /sys/block/<disk>/queue/physical_block_size
|
||||
Date: May 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
This is the smallest unit the storage device can write
|
||||
without resorting to read-modify-write operation. It is
|
||||
usually the same as the logical block size but may be
|
||||
bigger. One example is SATA drives with 4KB sectors
|
||||
that expose a 512-byte logical block size to the
|
||||
operating system.
|
||||
|
||||
What: /sys/block/<disk>/queue/minimum_io_size
|
||||
Date: April 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
Storage devices may report a preferred minimum I/O size,
|
||||
which is the smallest request the device can perform
|
||||
without incurring a read-modify-write penalty. For disk
|
||||
drives this is often the physical block size. For RAID
|
||||
arrays it is often the stripe chunk size.
|
||||
|
||||
What: /sys/block/<disk>/queue/optimal_io_size
|
||||
Date: April 2009
|
||||
Contact: Martin K. Petersen <martin.petersen@oracle.com>
|
||||
Description:
|
||||
Storage devices may report an optimal I/O size, which is
|
||||
the device's preferred unit of receiving I/O. This is
|
||||
rarely reported for disk drives. For RAID devices it is
|
||||
usually the stripe width or the internal block size.
|
||||
|
33
Documentation/ABI/testing/sysfs-bus-pci-devices-cciss
Normal file
33
Documentation/ABI/testing/sysfs-bus-pci-devices-cciss
Normal file
@ -0,0 +1,33 @@
|
||||
Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/model
|
||||
Date: March 2009
|
||||
Kernel Version: 2.6.30
|
||||
Contact: iss_storagedev@hp.com
|
||||
Description: Displays the SCSI INQUIRY page 0 model for logical drive
|
||||
Y of controller X.
|
||||
|
||||
Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/rev
|
||||
Date: March 2009
|
||||
Kernel Version: 2.6.30
|
||||
Contact: iss_storagedev@hp.com
|
||||
Description: Displays the SCSI INQUIRY page 0 revision for logical
|
||||
drive Y of controller X.
|
||||
|
||||
Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/unique_id
|
||||
Date: March 2009
|
||||
Kernel Version: 2.6.30
|
||||
Contact: iss_storagedev@hp.com
|
||||
Description: Displays the SCSI INQUIRY page 83 serial number for logical
|
||||
drive Y of controller X.
|
||||
|
||||
Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/vendor
|
||||
Date: March 2009
|
||||
Kernel Version: 2.6.30
|
||||
Contact: iss_storagedev@hp.com
|
||||
Description: Displays the SCSI INQUIRY page 0 vendor for logical drive
|
||||
Y of controller X.
|
||||
|
||||
Where: /sys/bus/pci/devices/<dev>/ccissX/cXdY/block:cciss!cXdY
|
||||
Date: March 2009
|
||||
Kernel Version: 2.6.30
|
||||
Contact: iss_storagedev@hp.com
|
||||
Description: A symbolic link to /sys/block/cciss!cXdY
|
@ -79,3 +79,13 @@ Description:
|
||||
This file is read-only and shows the number of
|
||||
kilobytes of data that have been written to this
|
||||
filesystem since it was mounted.
|
||||
|
||||
What: /sys/fs/ext4/<disk>/inode_goal
|
||||
Date: June 2008
|
||||
Contact: "Theodore Ts'o" <tytso@mit.edu>
|
||||
Description:
|
||||
Tuning parameter which (if non-zero) controls the goal
|
||||
inode used by the inode allocator in p0reference to
|
||||
all other allocation hueristics. This is intended for
|
||||
debugging use only, and should be 0 on production
|
||||
systems.
|
||||
|
73
Documentation/ABI/testing/sysfs-pps
Normal file
73
Documentation/ABI/testing/sysfs-pps
Normal file
@ -0,0 +1,73 @@
|
||||
What: /sys/class/pps/
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ directory will contain files and
|
||||
directories that will provide a unified interface to
|
||||
the PPS sources.
|
||||
|
||||
What: /sys/class/pps/ppsX/
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/ directory is related to X-th
|
||||
PPS source into the system. Each directory will
|
||||
contain files to manage and control its PPS source.
|
||||
|
||||
What: /sys/class/pps/ppsX/assert
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/assert file reports the assert events
|
||||
and the assert sequence number of the X-th source in the form:
|
||||
|
||||
<secs>.<nsec>#<sequence>
|
||||
|
||||
If the source has no assert events the content of this file
|
||||
is empty.
|
||||
|
||||
What: /sys/class/pps/ppsX/clear
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/clear file reports the clear events
|
||||
and the clear sequence number of the X-th source in the form:
|
||||
|
||||
<secs>.<nsec>#<sequence>
|
||||
|
||||
If the source has no clear events the content of this file
|
||||
is empty.
|
||||
|
||||
What: /sys/class/pps/ppsX/mode
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/mode file reports the functioning
|
||||
mode of the X-th source in hexadecimal encoding.
|
||||
|
||||
Please, refer to linux/include/linux/pps.h for further
|
||||
info.
|
||||
|
||||
What: /sys/class/pps/ppsX/echo
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/echo file reports if the X-th does
|
||||
or does not support an "echo" function.
|
||||
|
||||
What: /sys/class/pps/ppsX/name
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/name file reports the name of the
|
||||
X-th source.
|
||||
|
||||
What: /sys/class/pps/ppsX/path
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/path file reports the path name of
|
||||
the device connected with the X-th source.
|
||||
|
||||
If the source is not connected with any device the content
|
||||
of this file is empty.
|
@ -29,7 +29,7 @@ hardware, for example, you probably needn't concern yourself with
|
||||
isdn4k-utils.
|
||||
|
||||
o Gnu C 3.2 # gcc --version
|
||||
o Gnu make 3.79.1 # make --version
|
||||
o Gnu make 3.80 # make --version
|
||||
o binutils 2.12 # ld -v
|
||||
o util-linux 2.10o # fdformat --version
|
||||
o module-init-tools 0.9.10 # depmod -V
|
||||
@ -48,6 +48,7 @@ o procps 3.2.0 # ps --version
|
||||
o oprofile 0.9 # oprofiled --version
|
||||
o udev 081 # udevinfo -V
|
||||
o grub 0.93 # grub --version
|
||||
o mcelog 0.6
|
||||
|
||||
Kernel compilation
|
||||
==================
|
||||
@ -61,7 +62,7 @@ computer.
|
||||
Make
|
||||
----
|
||||
|
||||
You will need Gnu make 3.79.1 or later to build the kernel.
|
||||
You will need Gnu make 3.80 or later to build the kernel.
|
||||
|
||||
Binutils
|
||||
--------
|
||||
@ -71,6 +72,13 @@ assembling the 16-bit boot code, removing the need for as86 to compile
|
||||
your kernel. This change does, however, mean that you need a recent
|
||||
release of binutils.
|
||||
|
||||
Perl
|
||||
----
|
||||
|
||||
You will need perl 5 and the following modules: Getopt::Long, Getopt::Std,
|
||||
File::Basename, and File::Find to build the kernel.
|
||||
|
||||
|
||||
System utilities
|
||||
================
|
||||
|
||||
@ -276,6 +284,16 @@ before running exportfs or mountd. It is recommended that all NFS
|
||||
services be protected from the internet-at-large by a firewall where
|
||||
that is possible.
|
||||
|
||||
mcelog
|
||||
------
|
||||
|
||||
In Linux 2.6.31+ the i386 kernel needs to run the mcelog utility
|
||||
as a regular cronjob similar to the x86-64 kernel to process and log
|
||||
machine check events when CONFIG_X86_NEW_MCE is enabled. Machine check
|
||||
events are errors reported by the CPU. Processing them is strongly encouraged.
|
||||
All x86-64 kernels since 2.6.4 require the mcelog utility to
|
||||
process machine checks.
|
||||
|
||||
Getting updated software
|
||||
========================
|
||||
|
||||
@ -365,6 +383,10 @@ FUSE
|
||||
----
|
||||
o <http://sourceforge.net/projects/fuse>
|
||||
|
||||
mcelog
|
||||
------
|
||||
o <ftp://ftp.kernel.org/pub/linux/utils/cpu/mce/mcelog/>
|
||||
|
||||
Networking
|
||||
**********
|
||||
|
||||
|
@ -698,8 +698,8 @@ very often is not. Abundant use of the inline keyword leads to a much bigger
|
||||
kernel, which in turn slows the system as a whole down, due to a bigger
|
||||
icache footprint for the CPU and simply because there is less memory
|
||||
available for the pagecache. Just think about it; a pagecache miss causes a
|
||||
disk seek, which easily takes 5 miliseconds. There are a LOT of cpu cycles
|
||||
that can go into these 5 miliseconds.
|
||||
disk seek, which easily takes 5 milliseconds. There are a LOT of cpu cycles
|
||||
that can go into these 5 milliseconds.
|
||||
|
||||
A reasonable rule of thumb is to not put inline at functions that have more
|
||||
than 3 lines of code in them. An exception to this rule are the cases where
|
||||
|
@ -676,8 +676,8 @@ this directory the following files can currently be found:
|
||||
dma-api/all_errors This file contains a numeric value. If this
|
||||
value is not equal to zero the debugging code
|
||||
will print a warning for every error it finds
|
||||
into the kernel log. Be carefull with this
|
||||
option. It can easily flood your logs.
|
||||
into the kernel log. Be careful with this
|
||||
option, as it can easily flood your logs.
|
||||
|
||||
dma-api/disabled This read-only file contains the character 'Y'
|
||||
if the debugging code is disabled. This can
|
||||
|
@ -106,7 +106,7 @@
|
||||
number of errors are printk'ed including a full stack trace.
|
||||
</para>
|
||||
<para>
|
||||
The statistics are available via debugfs/debug_objects/stats.
|
||||
The statistics are available via /sys/kernel/debug/debug_objects/stats.
|
||||
They provide information about the number of warnings and the
|
||||
number of successful fixups along with information about the
|
||||
usage of the internal tracking objects and the state of the
|
||||
|
@ -145,7 +145,6 @@ usage should require reading the full document.
|
||||
interface in STA mode at first!
|
||||
</para>
|
||||
!Finclude/net/mac80211.h ieee80211_if_init_conf
|
||||
!Finclude/net/mac80211.h ieee80211_if_conf
|
||||
</chapter>
|
||||
|
||||
<chapter id="rx-tx">
|
||||
|
@ -118,7 +118,7 @@ to another chain) checking the final 'nulls' value if
|
||||
the lookup met the end of chain. If final 'nulls' value
|
||||
is not the slot number, then we must restart the lookup at
|
||||
the beginning. If the object was moved to the same chain,
|
||||
then the reader doesnt care : It might eventually
|
||||
then the reader doesn't care : It might eventually
|
||||
scan the list again without harm.
|
||||
|
||||
|
||||
|
@ -5,7 +5,7 @@ Copyright 2006, 2007 Simtec Electronics
|
||||
|
||||
The Silicon Motion SM501 multimedia companion chip is a multifunction device
|
||||
which may provide numerous interfaces including USB host controller USB gadget,
|
||||
Asyncronous Serial ports, Audio functions and a dual display video interface.
|
||||
asynchronous serial ports, audio functions, and a dual display video interface.
|
||||
The device may be connected by PCI or local bus with varying functions enabled.
|
||||
|
||||
Core
|
||||
|
@ -184,8 +184,9 @@ length. Single character labels using special characters, that being anything
|
||||
other than a letter or digit, are reserved for use by the Smack development
|
||||
team. Smack labels are unstructured, case sensitive, and the only operation
|
||||
ever performed on them is comparison for equality. Smack labels cannot
|
||||
contain unprintable characters or the "/" (slash) character. Smack labels
|
||||
cannot begin with a '-', which is reserved for special options.
|
||||
contain unprintable characters, the "/" (slash), the "\" (backslash), the "'"
|
||||
(quote) and '"' (double-quote) characters.
|
||||
Smack labels cannot begin with a '-', which is reserved for special options.
|
||||
|
||||
There are some predefined labels:
|
||||
|
||||
@ -523,3 +524,18 @@ Smack supports some mount options:
|
||||
|
||||
These mount options apply to all file system types.
|
||||
|
||||
Smack auditing
|
||||
|
||||
If you want Smack auditing of security events, you need to set CONFIG_AUDIT
|
||||
in your kernel configuration.
|
||||
By default, all denied events will be audited. You can change this behavior by
|
||||
writing a single character to the /smack/logging file :
|
||||
0 : no logging
|
||||
1 : log denied (default)
|
||||
2 : log accepted
|
||||
3 : log denied & accepted
|
||||
|
||||
Events are logged as 'key=value' pairs, for each event you at least will get
|
||||
the subjet, the object, the rights requested, the action, the kernel function
|
||||
that triggered the event, plus other pairs depending on the type of event
|
||||
audited.
|
||||
|
@ -91,6 +91,10 @@ Be as specific as possible. The WORST descriptions possible include
|
||||
things like "update driver X", "bug fix for driver X", or "this patch
|
||||
includes updates for subsystem X. Please apply."
|
||||
|
||||
The maintainer will thank you if you write your patch description in a
|
||||
form which can be easily pulled into Linux's source code management
|
||||
system, git, as a "commit log". See #15, below.
|
||||
|
||||
If your description starts to get long, that's a sign that you probably
|
||||
need to split up your patch. See #3, next.
|
||||
|
||||
@ -183,8 +187,9 @@ Even if the maintainer did not respond in step #4, make sure to ALWAYS
|
||||
copy the maintainer when you change their code.
|
||||
|
||||
For small patches you may want to CC the Trivial Patch Monkey
|
||||
trivial@kernel.org managed by Jesper Juhl; which collects "trivial"
|
||||
patches. Trivial patches must qualify for one of the following rules:
|
||||
trivial@kernel.org which collects "trivial" patches. Have a look
|
||||
into the MAINTAINERS file for its current manager.
|
||||
Trivial patches must qualify for one of the following rules:
|
||||
Spelling fixes in documentation
|
||||
Spelling fixes which could break grep(1)
|
||||
Warning fixes (cluttering with useless warnings is bad)
|
||||
@ -196,7 +201,6 @@ patches. Trivial patches must qualify for one of the following rules:
|
||||
since people copy, as long as it's trivial)
|
||||
Any fix by the author/maintainer of the file (ie. patch monkey
|
||||
in re-transmission mode)
|
||||
URL: <http://www.kernel.org/pub/linux/kernel/people/juhl/trivial/>
|
||||
|
||||
|
||||
|
||||
@ -405,7 +409,14 @@ person it names. This tag documents that potentially interested parties
|
||||
have been included in the discussion
|
||||
|
||||
|
||||
14) Using Tested-by: and Reviewed-by:
|
||||
14) Using Reported-by:, Tested-by: and Reviewed-by:
|
||||
|
||||
If this patch fixes a problem reported by somebody else, consider adding a
|
||||
Reported-by: tag to credit the reporter for their contribution. Please
|
||||
note that this tag should not be added without the reporter's permission,
|
||||
especially if the problem was not reported in a public forum. That said,
|
||||
if we diligently credit our bug reporters, they will, hopefully, be
|
||||
inspired to help us again in the future.
|
||||
|
||||
A Tested-by: tag indicates that the patch has been successfully tested (in
|
||||
some environment) by the person named. This tag informs maintainers that
|
||||
@ -444,7 +455,7 @@ offer a Reviewed-by tag for a patch. This tag serves to give credit to
|
||||
reviewers and to inform maintainers of the degree of review which has been
|
||||
done on the patch. Reviewed-by: tags, when supplied by reviewers known to
|
||||
understand the subject area and to perform thorough reviews, will normally
|
||||
increase the liklihood of your patch getting into the kernel.
|
||||
increase the likelihood of your patch getting into the kernel.
|
||||
|
||||
|
||||
15) The canonical patch format
|
||||
@ -485,12 +496,33 @@ phrase" should not be a filename. Do not use the same "summary
|
||||
phrase" for every patch in a whole patch series (where a "patch
|
||||
series" is an ordered sequence of multiple, related patches).
|
||||
|
||||
Bear in mind that the "summary phrase" of your email becomes
|
||||
a globally-unique identifier for that patch. It propagates
|
||||
all the way into the git changelog. The "summary phrase" may
|
||||
later be used in developer discussions which refer to the patch.
|
||||
People will want to google for the "summary phrase" to read
|
||||
discussion regarding that patch.
|
||||
Bear in mind that the "summary phrase" of your email becomes a
|
||||
globally-unique identifier for that patch. It propagates all the way
|
||||
into the git changelog. The "summary phrase" may later be used in
|
||||
developer discussions which refer to the patch. People will want to
|
||||
google for the "summary phrase" to read discussion regarding that
|
||||
patch. It will also be the only thing that people may quickly see
|
||||
when, two or three months later, they are going through perhaps
|
||||
thousands of patches using tools such as "gitk" or "git log
|
||||
--oneline".
|
||||
|
||||
For these reasons, the "summary" must be no more than 70-75
|
||||
characters, and it must describe both what the patch changes, as well
|
||||
as why the patch might be necessary. It is challenging to be both
|
||||
succinct and descriptive, but that is what a well-written summary
|
||||
should do.
|
||||
|
||||
The "summary phrase" may be prefixed by tags enclosed in square
|
||||
brackets: "Subject: [PATCH tag] <summary phrase>". The tags are not
|
||||
considered part of the summary phrase, but describe how the patch
|
||||
should be treated. Common tags might include a version descriptor if
|
||||
the multiple versions of the patch have been sent out in response to
|
||||
comments (i.e., "v1, v2, v3"), or "RFC" to indicate a request for
|
||||
comments. If there are four patches in a patch series the individual
|
||||
patches may be numbered like this: 1/4, 2/4, 3/4, 4/4. This assures
|
||||
that developers understand the order in which the patches should be
|
||||
applied and that they have reviewed or applied all of the patches in
|
||||
the patch series.
|
||||
|
||||
A couple of example Subjects:
|
||||
|
||||
@ -510,19 +542,31 @@ the patch author in the changelog.
|
||||
The explanation body will be committed to the permanent source
|
||||
changelog, so should make sense to a competent reader who has long
|
||||
since forgotten the immediate details of the discussion that might
|
||||
have led to this patch.
|
||||
have led to this patch. Including symptoms of the failure which the
|
||||
patch addresses (kernel log messages, oops messages, etc.) is
|
||||
especially useful for people who might be searching the commit logs
|
||||
looking for the applicable patch. If a patch fixes a compile failure,
|
||||
it may not be necessary to include _all_ of the compile failures; just
|
||||
enough that it is likely that someone searching for the patch can find
|
||||
it. As in the "summary phrase", it is important to be both succinct as
|
||||
well as descriptive.
|
||||
|
||||
The "---" marker line serves the essential purpose of marking for patch
|
||||
handling tools where the changelog message ends.
|
||||
|
||||
One good use for the additional comments after the "---" marker is for
|
||||
a diffstat, to show what files have changed, and the number of inserted
|
||||
and deleted lines per file. A diffstat is especially useful on bigger
|
||||
patches. Other comments relevant only to the moment or the maintainer,
|
||||
not suitable for the permanent changelog, should also go here.
|
||||
Use diffstat options "-p 1 -w 70" so that filenames are listed from the
|
||||
top of the kernel source tree and don't use too much horizontal space
|
||||
(easily fit in 80 columns, maybe with some indentation).
|
||||
a diffstat, to show what files have changed, and the number of
|
||||
inserted and deleted lines per file. A diffstat is especially useful
|
||||
on bigger patches. Other comments relevant only to the moment or the
|
||||
maintainer, not suitable for the permanent changelog, should also go
|
||||
here. A good example of such comments might be "patch changelogs"
|
||||
which describe what has changed between the v1 and v2 version of the
|
||||
patch.
|
||||
|
||||
If you are going to include a diffstat after the "---" marker, please
|
||||
use diffstat options "-p 1 -w 70" so that filenames are listed from
|
||||
the top of the kernel source tree and don't use too much horizontal
|
||||
space (easily fit in 80 columns, maybe with some indentation).
|
||||
|
||||
See more details on the proper patch format in the following
|
||||
references.
|
||||
|
@ -246,7 +246,8 @@ void print_ioacct(struct taskstats *t)
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
int c, rc, rep_len, aggr_len, len2, cmd_type;
|
||||
int c, rc, rep_len, aggr_len, len2;
|
||||
int cmd_type = TASKSTATS_CMD_ATTR_UNSPEC;
|
||||
__u16 id;
|
||||
__u32 mypid;
|
||||
|
||||
|
@ -51,7 +51,7 @@ PIN Numbers
|
||||
-----------
|
||||
|
||||
Each pin has an unique number associated with it in regs-gpio.h,
|
||||
eg S3C2410_GPA0 or S3C2410_GPF1. These defines are used to tell
|
||||
eg S3C2410_GPA(0) or S3C2410_GPF(1). These defines are used to tell
|
||||
the GPIO functions which pin is to be used.
|
||||
|
||||
|
||||
@ -65,11 +65,11 @@ Configuring a pin
|
||||
|
||||
Eg:
|
||||
|
||||
s3c2410_gpio_cfgpin(S3C2410_GPA0, S3C2410_GPA0_ADDR0);
|
||||
s3c2410_gpio_cfgpin(S3C2410_GPE8, S3C2410_GPE8_SDDAT1);
|
||||
s3c2410_gpio_cfgpin(S3C2410_GPA(0), S3C2410_GPA0_ADDR0);
|
||||
s3c2410_gpio_cfgpin(S3C2410_GPE(8), S3C2410_GPE8_SDDAT1);
|
||||
|
||||
which would turn GPA0 into the lowest Address line A0, and set
|
||||
GPE8 to be connected to the SDIO/MMC controller's SDDAT1 line.
|
||||
which would turn GPA(0) into the lowest Address line A0, and set
|
||||
GPE(8) to be connected to the SDIO/MMC controller's SDDAT1 line.
|
||||
|
||||
|
||||
Reading the current configuration
|
||||
|
@ -229,10 +229,10 @@ kernel. It is the use of atomic counters to implement reference
|
||||
counting, and it works such that once the counter falls to zero it can
|
||||
be guaranteed that no other entity can be accessing the object:
|
||||
|
||||
static void obj_list_add(struct obj *obj)
|
||||
static void obj_list_add(struct obj *obj, struct list_head *head)
|
||||
{
|
||||
obj->active = 1;
|
||||
list_add(&obj->list);
|
||||
list_add(&obj->list, head);
|
||||
}
|
||||
|
||||
static void obj_list_del(struct obj *obj)
|
||||
|
@ -186,7 +186,7 @@ a virtual address mapping (unlike the earlier scheme of virtual address
|
||||
do not have a corresponding kernel virtual address space mapping) and
|
||||
low-memory pages.
|
||||
|
||||
Note: Please refer to Documentation/PCI/PCI-DMA-mapping.txt for a discussion
|
||||
Note: Please refer to Documentation/DMA-mapping.txt for a discussion
|
||||
on PCI high mem DMA aspects and mapping of scatter gather lists, and support
|
||||
for 64 bit PCI.
|
||||
|
||||
|
@ -58,7 +58,7 @@ same criteria as reads.
|
||||
front_merges (bool)
|
||||
------------
|
||||
|
||||
Sometimes it happens that a request enters the io scheduler that is contigious
|
||||
Sometimes it happens that a request enters the io scheduler that is contiguous
|
||||
with a request that is already on the queue. Either it fits in the back of that
|
||||
request, or it fits at the front. That is called either a back merge candidate
|
||||
or a front merge candidate. Due to the way files are typically laid out,
|
||||
|
@ -27,7 +27,7 @@ parameter.
|
||||
|
||||
For simplicity, only one braille console can be enabled, other uses of
|
||||
console=brl,... will be discarded. Also note that it does not interfere with
|
||||
the console selection mecanism described in serial-console.txt
|
||||
the console selection mechanism described in serial-console.txt
|
||||
|
||||
For now, only the VisioBraille device is supported.
|
||||
|
||||
|
@ -117,7 +117,7 @@ Using the pktcdvd debugfs interface
|
||||
|
||||
To read pktcdvd device infos in human readable form, do:
|
||||
|
||||
# cat /debug/pktcdvd/pktcdvd[0-7]/info
|
||||
# cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info
|
||||
|
||||
For a description of the debugfs interface look into the file:
|
||||
|
||||
|
@ -152,14 +152,19 @@ When swap is accounted, following files are added.
|
||||
|
||||
usage of mem+swap is limited by memsw.limit_in_bytes.
|
||||
|
||||
Note: why 'mem+swap' rather than swap.
|
||||
* why 'mem+swap' rather than swap.
|
||||
The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
|
||||
to move account from memory to swap...there is no change in usage of
|
||||
mem+swap.
|
||||
mem+swap. In other words, when we want to limit the usage of swap without
|
||||
affecting global LRU, mem+swap limit is better than just limiting swap from
|
||||
OS point of view.
|
||||
|
||||
In other words, when we want to limit the usage of swap without affecting
|
||||
global LRU, mem+swap limit is better than just limiting swap from OS point
|
||||
of view.
|
||||
* What happens when a cgroup hits memory.memsw.limit_in_bytes
|
||||
When a cgroup his memory.memsw.limit_in_bytes, it's useless to do swap-out
|
||||
in this cgroup. Then, swap-out will not be done by cgroup routine and file
|
||||
caches are dropped. But as mentioned above, global LRU can do swapout memory
|
||||
from it for sanity of the system's memory management state. You can't forbid
|
||||
it by cgroup.
|
||||
|
||||
2.5 Reclaim
|
||||
|
||||
@ -204,6 +209,7 @@ We can alter the memory limit:
|
||||
|
||||
NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
|
||||
mega or gigabytes.
|
||||
NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited).
|
||||
|
||||
# cat /cgroups/0/memory.limit_in_bytes
|
||||
4194304
|
||||
|
@ -41,6 +41,12 @@ void cn_test_callback(void *data)
|
||||
msg->seq, msg->ack, msg->len, (char *)msg->data);
|
||||
}
|
||||
|
||||
/*
|
||||
* Do not remove this function even if no one is using it as
|
||||
* this is an example of how to get notifications about new
|
||||
* connector user registration
|
||||
*/
|
||||
#if 0
|
||||
static int cn_test_want_notify(void)
|
||||
{
|
||||
struct cn_ctl_msg *ctl;
|
||||
@ -117,6 +123,7 @@ nlmsg_failure:
|
||||
kfree_skb(skb);
|
||||
return -EINVAL;
|
||||
}
|
||||
#endif
|
||||
|
||||
static u32 cn_test_timer_counter;
|
||||
static void cn_test_timer_func(unsigned long __data)
|
||||
|
@ -155,7 +155,7 @@ actual frequency must be determined using the following rules:
|
||||
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
|
||||
target_freq. ("H for highest, but no higher than")
|
||||
|
||||
Here again the frequency table helper might assist you - see section 3
|
||||
Here again the frequency table helper might assist you - see section 2
|
||||
for details.
|
||||
|
||||
|
||||
|
@ -119,10 +119,6 @@ want the kernel to look at the CPU usage and to make decisions on
|
||||
what to do about the frequency. Typically this is set to values of
|
||||
around '10000' or more. It's default value is (cmp. with users-guide.txt):
|
||||
transition_latency * 1000
|
||||
The lowest value you can set is:
|
||||
transition_latency * 100 or it may get restricted to a value where it
|
||||
makes not sense for the kernel anymore to poll that often which depends
|
||||
on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000).
|
||||
Be aware that transition latency is in ns and sampling_rate is in us, so you
|
||||
get the same sysfs value by default.
|
||||
Sampling rate should always get adjusted considering the transition latency
|
||||
@ -131,14 +127,20 @@ in the bash (as said, 1000 is default), do:
|
||||
echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
|
||||
>ondemand/sampling_rate
|
||||
|
||||
show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
|
||||
You can use wider ranges now and the general
|
||||
cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
|
||||
used to obtain exactly the same info:
|
||||
show_sampling_rate_min = transtition_latency * 500 / 1000
|
||||
show_sampling_rate_max = transtition_latency * 500000 / 1000
|
||||
(divided by 1000 is to illustrate that sampling rate is in us and
|
||||
transition latency is exported ns).
|
||||
show_sampling_rate_min:
|
||||
The sampling rate is limited by the HW transition latency:
|
||||
transition_latency * 100
|
||||
Or by kernel restrictions:
|
||||
If CONFIG_NO_HZ is set, the limit is 10ms fixed.
|
||||
If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
|
||||
limits depend on the CONFIG_HZ option:
|
||||
HZ=1000: min=20000us (20ms)
|
||||
HZ=250: min=80000us (80ms)
|
||||
HZ=100: min=200000us (200ms)
|
||||
The highest value of kernel and HW latency restrictions is shown and
|
||||
used as the minimum sampling rate.
|
||||
|
||||
show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT.
|
||||
|
||||
up_threshold: defines what the average CPU usage between the samplings
|
||||
of 'sampling_rate' needs to be for the kernel to make a decision on
|
||||
|
@ -31,7 +31,6 @@ Contents:
|
||||
|
||||
3. How to change the CPU cpufreq policy and/or speed
|
||||
3.1 Preferred interface: sysfs
|
||||
3.2 Deprecated interfaces
|
||||
|
||||
|
||||
|
||||
|
@ -76,9 +76,9 @@ Do the steps below to download the BIOS image.
|
||||
|
||||
The /sys/class/firmware/dell_rbu/ entries will remain till the following is
|
||||
done.
|
||||
echo -1 > /sys/class/firmware/dell_rbu/loading.
|
||||
echo -1 > /sys/class/firmware/dell_rbu/loading
|
||||
Until this step is completed the driver cannot be unloaded.
|
||||
Also echoing either mono ,packet or init in to image_type will free up the
|
||||
Also echoing either mono, packet or init in to image_type will free up the
|
||||
memory allocated by the driver.
|
||||
|
||||
If a user by accident executes steps 1 and 3 above without executing step 2;
|
||||
|
@ -119,7 +119,7 @@ which takes quite a bit of time and thought after the "real work" has been
|
||||
done. When done properly, though, it is time well spent.
|
||||
|
||||
|
||||
5.4: PATCH FORMATTING
|
||||
5.4: PATCH FORMATTING AND CHANGELOGS
|
||||
|
||||
So now you have a perfect series of patches for posting, but the work is
|
||||
not done quite yet. Each patch needs to be formatted into a message which
|
||||
@ -146,8 +146,33 @@ that end, each patch will be composed of the following:
|
||||
- One or more tag lines, with, at a minimum, one Signed-off-by: line from
|
||||
the author of the patch. Tags will be described in more detail below.
|
||||
|
||||
The above three items should, normally, be the text used when committing
|
||||
the change to a revision control system. They are followed by:
|
||||
The items above, together, form the changelog for the patch. Writing good
|
||||
changelogs is a crucial but often-neglected art; it's worth spending
|
||||
another moment discussing this issue. When writing a changelog, you should
|
||||
bear in mind that a number of different people will be reading your words.
|
||||
These include subsystem maintainers and reviewers who need to decide
|
||||
whether the patch should be included, distributors and other maintainers
|
||||
trying to decide whether a patch should be backported to other kernels, bug
|
||||
hunters wondering whether the patch is responsible for a problem they are
|
||||
chasing, users who want to know how the kernel has changed, and more. A
|
||||
good changelog conveys the needed information to all of these people in the
|
||||
most direct and concise way possible.
|
||||
|
||||
To that end, the summary line should describe the effects of and motivation
|
||||
for the change as well as possible given the one-line constraint. The
|
||||
detailed description can then amplify on those topics and provide any
|
||||
needed additional information. If the patch fixes a bug, cite the commit
|
||||
which introduced the bug if possible. If a problem is associated with
|
||||
specific log or compiler output, include that output to help others
|
||||
searching for a solution to the same problem. If the change is meant to
|
||||
support other changes coming in later patch, say so. If internal APIs are
|
||||
changed, detail those changes and how other developers should respond. In
|
||||
general, the more you can put yourself into the shoes of everybody who will
|
||||
be reading your changelog, the better that changelog (and the kernel as a
|
||||
whole) will be.
|
||||
|
||||
Needless to say, the changelog should be the text used when committing the
|
||||
change to a revision control system. It will be followed by:
|
||||
|
||||
- The patch itself, in the unified ("-u") patch format. Using the "-p"
|
||||
option to diff will associate function names with changes, making the
|
||||
|
@ -162,3 +162,35 @@ device_remove_file(dev,&dev_attr_power);
|
||||
|
||||
The file name will be 'power' with a mode of 0644 (-rw-r--r--).
|
||||
|
||||
Word of warning: While the kernel allows device_create_file() and
|
||||
device_remove_file() to be called on a device at any time, userspace has
|
||||
strict expectations on when attributes get created. When a new device is
|
||||
registered in the kernel, a uevent is generated to notify userspace (like
|
||||
udev) that a new device is available. If attributes are added after the
|
||||
device is registered, then userspace won't get notified and userspace will
|
||||
not know about the new attributes.
|
||||
|
||||
This is important for device driver that need to publish additional
|
||||
attributes for a device at driver probe time. If the device driver simply
|
||||
calls device_create_file() on the device structure passed to it, then
|
||||
userspace will never be notified of the new attributes. Instead, it should
|
||||
probably use class_create() and class->dev_attrs to set up a list of
|
||||
desired attributes in the modules_init function, and then in the .probe()
|
||||
hook, and then use device_create() to create a new device as a child
|
||||
of the probed device. The new device will generate a new uevent and
|
||||
properly advertise the new attributes to userspace.
|
||||
|
||||
For example, if a driver wanted to add the following attributes:
|
||||
struct device_attribute mydriver_attribs[] = {
|
||||
__ATTR(port_count, 0444, port_count_show),
|
||||
__ATTR(serial_number, 0444, serial_number_show),
|
||||
NULL
|
||||
};
|
||||
|
||||
Then in the module init function is would do:
|
||||
mydriver_class = class_create(THIS_MODULE, "my_attrs");
|
||||
mydriver_class.dev_attr = mydriver_attribs;
|
||||
|
||||
And assuming 'dev' is the struct device passed into the probe hook, the driver
|
||||
probe function would do something like:
|
||||
create_device(&mydriver_class, dev, chrdev, &private_data, "my_name");
|
||||
|
@ -188,7 +188,7 @@ For example, you can do something like the following.
|
||||
|
||||
void my_midlayer_destroy_something()
|
||||
{
|
||||
devres_release_group(dev, my_midlayer_create_soemthing);
|
||||
devres_release_group(dev, my_midlayer_create_something);
|
||||
}
|
||||
|
||||
|
||||
|
@ -112,7 +112,7 @@ sub tda10045 {
|
||||
|
||||
sub tda10046 {
|
||||
my $sourcefile = "TT_PCI_2.19h_28_11_2006.zip";
|
||||
my $url = "http://technotrend-online.com/download/software/219/$sourcefile";
|
||||
my $url = "http://www.tt-download.com/download/updates/219/$sourcefile";
|
||||
my $hash = "6a7e1e2f2644b162ff0502367553c72d";
|
||||
my $outfile = "dvb-fe-tda10046.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
@ -129,8 +129,8 @@ sub tda10046 {
|
||||
}
|
||||
|
||||
sub tda10046lifeview {
|
||||
my $sourcefile = "Drv_2.11.02.zip";
|
||||
my $url = "http://www.lifeview.com.tw/drivers/pci_card/FlyDVB-T/$sourcefile";
|
||||
my $sourcefile = "7%5Cdrv_2.11.02.zip";
|
||||
my $url = "http://www.lifeview.hk/dbimages/document/$sourcefile";
|
||||
my $hash = "1ea24dee4eea8fe971686981f34fd2e0";
|
||||
my $outfile = "dvb-fe-tda10046.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
@ -317,7 +317,7 @@ sub nxt2002 {
|
||||
|
||||
sub nxt2004 {
|
||||
my $sourcefile = "AVerTVHD_MCE_A180_Drv_v1.2.2.16.zip";
|
||||
my $url = "http://www.aver.com/support/Drivers/$sourcefile";
|
||||
my $url = "http://www.avermedia-usa.com/support/Drivers/$sourcefile";
|
||||
my $hash = "111cb885b1e009188346d72acfed024c";
|
||||
my $outfile = "dvb-fe-nxt2004.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
|
@ -23,8 +23,8 @@ first time, it was renamed to 'EDAC'.
|
||||
The bluesmoke project at sourceforge.net is now utilized as a 'staging area'
|
||||
for EDAC development, before it is sent upstream to kernel.org
|
||||
|
||||
At the bluesmoke/EDAC project site, is a series of quilt patches against
|
||||
recent kernels, stored in a SVN respository. For easier downloading, there
|
||||
At the bluesmoke/EDAC project site is a series of quilt patches against
|
||||
recent kernels, stored in a SVN repository. For easier downloading, there
|
||||
is also a tarball snapshot available.
|
||||
|
||||
============================================================================
|
||||
@ -73,9 +73,9 @@ the vendor should tie the parity status bits to 0 if they do not intend
|
||||
to generate parity. Some vendors do not do this, and thus the parity bit
|
||||
can "float" giving false positives.
|
||||
|
||||
In the kernel there is a pci device attribute located in sysfs that is
|
||||
In the kernel there is a PCI device attribute located in sysfs that is
|
||||
checked by the EDAC PCI scanning code. If that attribute is set,
|
||||
PCI parity/error scannining is skipped for that device. The attribute
|
||||
PCI parity/error scanning is skipped for that device. The attribute
|
||||
is:
|
||||
|
||||
broken_parity_status
|
||||
|
@ -29,16 +29,16 @@ o debugfs entries
|
||||
fault-inject-debugfs kernel module provides some debugfs entries for runtime
|
||||
configuration of fault-injection capabilities.
|
||||
|
||||
- /debug/fail*/probability:
|
||||
- /sys/kernel/debug/fail*/probability:
|
||||
|
||||
likelihood of failure injection, in percent.
|
||||
Format: <percent>
|
||||
|
||||
Note that one-failure-per-hundred is a very high error rate
|
||||
for some testcases. Consider setting probability=100 and configure
|
||||
/debug/fail*/interval for such testcases.
|
||||
/sys/kernel/debug/fail*/interval for such testcases.
|
||||
|
||||
- /debug/fail*/interval:
|
||||
- /sys/kernel/debug/fail*/interval:
|
||||
|
||||
specifies the interval between failures, for calls to
|
||||
should_fail() that pass all the other tests.
|
||||
@ -46,18 +46,18 @@ configuration of fault-injection capabilities.
|
||||
Note that if you enable this, by setting interval>1, you will
|
||||
probably want to set probability=100.
|
||||
|
||||
- /debug/fail*/times:
|
||||
- /sys/kernel/debug/fail*/times:
|
||||
|
||||
specifies how many times failures may happen at most.
|
||||
A value of -1 means "no limit".
|
||||
|
||||
- /debug/fail*/space:
|
||||
- /sys/kernel/debug/fail*/space:
|
||||
|
||||
specifies an initial resource "budget", decremented by "size"
|
||||
on each call to should_fail(,size). Failure injection is
|
||||
suppressed until "space" reaches zero.
|
||||
|
||||
- /debug/fail*/verbose
|
||||
- /sys/kernel/debug/fail*/verbose
|
||||
|
||||
Format: { 0 | 1 | 2 }
|
||||
specifies the verbosity of the messages when failure is
|
||||
@ -65,17 +65,17 @@ configuration of fault-injection capabilities.
|
||||
log line per failure; '2' will print a call trace too -- useful
|
||||
to debug the problems revealed by fault injection.
|
||||
|
||||
- /debug/fail*/task-filter:
|
||||
- /sys/kernel/debug/fail*/task-filter:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
A value of 'N' disables filtering by process (default).
|
||||
Any positive value limits failures to only processes indicated by
|
||||
/proc/<pid>/make-it-fail==1.
|
||||
|
||||
- /debug/fail*/require-start:
|
||||
- /debug/fail*/require-end:
|
||||
- /debug/fail*/reject-start:
|
||||
- /debug/fail*/reject-end:
|
||||
- /sys/kernel/debug/fail*/require-start:
|
||||
- /sys/kernel/debug/fail*/require-end:
|
||||
- /sys/kernel/debug/fail*/reject-start:
|
||||
- /sys/kernel/debug/fail*/reject-end:
|
||||
|
||||
specifies the range of virtual addresses tested during
|
||||
stacktrace walking. Failure is injected only if some caller
|
||||
@ -84,26 +84,26 @@ configuration of fault-injection capabilities.
|
||||
Default required range is [0,ULONG_MAX) (whole of virtual address space).
|
||||
Default rejected range is [0,0).
|
||||
|
||||
- /debug/fail*/stacktrace-depth:
|
||||
- /sys/kernel/debug/fail*/stacktrace-depth:
|
||||
|
||||
specifies the maximum stacktrace depth walked during search
|
||||
for a caller within [require-start,require-end) OR
|
||||
[reject-start,reject-end).
|
||||
|
||||
- /debug/fail_page_alloc/ignore-gfp-highmem:
|
||||
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
default is 'N', setting it to 'Y' won't inject failures into
|
||||
highmem/user allocations.
|
||||
|
||||
- /debug/failslab/ignore-gfp-wait:
|
||||
- /debug/fail_page_alloc/ignore-gfp-wait:
|
||||
- /sys/kernel/debug/failslab/ignore-gfp-wait:
|
||||
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
default is 'N', setting it to 'Y' will inject failures
|
||||
only into non-sleep allocations (GFP_ATOMIC allocations).
|
||||
|
||||
- /debug/fail_page_alloc/min-order:
|
||||
- /sys/kernel/debug/fail_page_alloc/min-order:
|
||||
|
||||
specifies the minimum page allocation order to be injected
|
||||
failures.
|
||||
@ -166,13 +166,13 @@ o Inject slab allocation failures into module init/exit code
|
||||
#!/bin/bash
|
||||
|
||||
FAILTYPE=failslab
|
||||
echo Y > /debug/$FAILTYPE/task-filter
|
||||
echo 10 > /debug/$FAILTYPE/probability
|
||||
echo 100 > /debug/$FAILTYPE/interval
|
||||
echo -1 > /debug/$FAILTYPE/times
|
||||
echo 0 > /debug/$FAILTYPE/space
|
||||
echo 2 > /debug/$FAILTYPE/verbose
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
|
||||
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
|
||||
echo -1 > /sys/kernel/debug/$FAILTYPE/times
|
||||
echo 0 > /sys/kernel/debug/$FAILTYPE/space
|
||||
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
|
||||
|
||||
faulty_system()
|
||||
{
|
||||
@ -217,20 +217,20 @@ then
|
||||
exit 1
|
||||
fi
|
||||
|
||||
cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start
|
||||
cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end
|
||||
cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
|
||||
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
|
||||
|
||||
echo N > /debug/$FAILTYPE/task-filter
|
||||
echo 10 > /debug/$FAILTYPE/probability
|
||||
echo 100 > /debug/$FAILTYPE/interval
|
||||
echo -1 > /debug/$FAILTYPE/times
|
||||
echo 0 > /debug/$FAILTYPE/space
|
||||
echo 2 > /debug/$FAILTYPE/verbose
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem
|
||||
echo 10 > /debug/$FAILTYPE/stacktrace-depth
|
||||
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
|
||||
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
|
||||
echo -1 > /sys/kernel/debug/$FAILTYPE/times
|
||||
echo 0 > /sys/kernel/debug/$FAILTYPE/space
|
||||
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
|
||||
|
||||
trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
|
||||
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
|
||||
|
||||
echo "Injecting errors into the module $module... (interrupt to stop)"
|
||||
sleep 1000000
|
||||
|
@ -1,7 +1,7 @@
|
||||
SH7760/SH7763 integrated LCDC Framebuffer driver
|
||||
================================================
|
||||
|
||||
0. Overwiew
|
||||
0. Overview
|
||||
-----------
|
||||
The SH7760/SH7763 have an integrated LCD Display controller (LCDC) which
|
||||
supports (in theory) resolutions ranging from 1x1 to 1024x1024,
|
||||
|
@ -95,7 +95,7 @@ There is no way to change the vesafb video mode and/or timings after
|
||||
booting linux. If you are not happy with the 60 Hz refresh rate, you
|
||||
have these options:
|
||||
|
||||
* configure and load the DOS-Tools for your the graphics board (if
|
||||
* configure and load the DOS-Tools for the graphics board (if
|
||||
available) and boot linux with loadlin.
|
||||
* use a native driver (matroxfb/atyfb) instead if vesafb. If none
|
||||
is available, write a new one!
|
||||
|
@ -6,6 +6,20 @@ be removed from this file.
|
||||
|
||||
---------------------------
|
||||
|
||||
What: IRQF_SAMPLE_RANDOM
|
||||
Check: IRQF_SAMPLE_RANDOM
|
||||
When: July 2009
|
||||
|
||||
Why: Many of IRQF_SAMPLE_RANDOM users are technically bogus as entropy
|
||||
sources in the kernel's current entropy model. To resolve this, every
|
||||
input point to the kernel's entropy pool needs to better document the
|
||||
type of entropy source it actually is. This will be replaced with
|
||||
additional add_*_randomness functions in drivers/char/random.c
|
||||
|
||||
Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: The ieee80211_regdom module parameter
|
||||
When: March 2010 / desktop catchup
|
||||
|
||||
@ -437,3 +451,20 @@ Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate
|
||||
driver but this caused driver conflicts.
|
||||
Who: Jean Delvare <khali@linux-fr.org>
|
||||
Krzysztof Helt <krzysztof.h1@wp.pl>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: CONFIG_RFKILL_INPUT
|
||||
When: 2.6.33
|
||||
Why: Should be implemented in userspace, policy daemon.
|
||||
Who: Johannes Berg <johannes@sipsolutions.net>
|
||||
|
||||
----------------------------
|
||||
|
||||
What: CONFIG_X86_OLD_MCE
|
||||
When: 2.6.32
|
||||
Why: Remove the old legacy 32bit machine check code. This has been
|
||||
superseded by the newer machine check code from the 64bit port,
|
||||
but the old version has been kept around for easier testing. Note this
|
||||
doesn't impact the old P5 and WinChip machine check handlers.
|
||||
Who: Andi Kleen <andi@firstfloor.org>
|
||||
|
@ -187,7 +187,7 @@ readpages: no
|
||||
write_begin: no locks the page yes
|
||||
write_end: no yes, unlocks yes
|
||||
perform_write: no n/a yes
|
||||
bmap: yes
|
||||
bmap: no
|
||||
invalidatepage: no yes
|
||||
releasepage: no yes
|
||||
direct_IO: no
|
||||
|
@ -369,7 +369,7 @@ The call requires an initialized struct autofs_dev_ioctl. There are two
|
||||
possible variations. Both use the path field set to the path of the mount
|
||||
point to check and the size field adjusted appropriately. One uses the
|
||||
ioctlfd field to identify a specific mount point to check while the other
|
||||
variation uses the path and optionaly arg1 set to an autofs mount type.
|
||||
variation uses the path and optionally arg1 set to an autofs mount type.
|
||||
The call returns 1 if this is a mount point and sets arg1 to the device
|
||||
number of the mount and field arg2 to the relevant super block magic
|
||||
number (described below) or 0 if it isn't a mountpoint. In both cases
|
||||
|
@ -184,7 +184,7 @@ This has the following fields:
|
||||
have index children.
|
||||
|
||||
If this function is not supplied or if it returns NULL then the first
|
||||
cache in the parent's list will be chosed, or failing that, the first
|
||||
cache in the parent's list will be chosen, or failing that, the first
|
||||
cache in the master list.
|
||||
|
||||
(4) A function to retrieve an object's key from the netfs [mandatory].
|
||||
|
158
Documentation/filesystems/debugfs.txt
Normal file
158
Documentation/filesystems/debugfs.txt
Normal file
@ -0,0 +1,158 @@
|
||||
Copyright 2009 Jonathan Corbet <corbet@lwn.net>
|
||||
|
||||
Debugfs exists as a simple way for kernel developers to make information
|
||||
available to user space. Unlike /proc, which is only meant for information
|
||||
about a process, or sysfs, which has strict one-value-per-file rules,
|
||||
debugfs has no rules at all. Developers can put any information they want
|
||||
there. The debugfs filesystem is also intended to not serve as a stable
|
||||
ABI to user space; in theory, there are no stability constraints placed on
|
||||
files exported there. The real world is not always so simple, though [1];
|
||||
even debugfs interfaces are best designed with the idea that they will need
|
||||
to be maintained forever.
|
||||
|
||||
Debugfs is typically mounted with a command like:
|
||||
|
||||
mount -t debugfs none /sys/kernel/debug
|
||||
|
||||
(Or an equivalent /etc/fstab line).
|
||||
|
||||
Note that the debugfs API is exported GPL-only to modules.
|
||||
|
||||
Code using debugfs should include <linux/debugfs.h>. Then, the first order
|
||||
of business will be to create at least one directory to hold a set of
|
||||
debugfs files:
|
||||
|
||||
struct dentry *debugfs_create_dir(const char *name, struct dentry *parent);
|
||||
|
||||
This call, if successful, will make a directory called name underneath the
|
||||
indicated parent directory. If parent is NULL, the directory will be
|
||||
created in the debugfs root. On success, the return value is a struct
|
||||
dentry pointer which can be used to create files in the directory (and to
|
||||
clean it up at the end). A NULL return value indicates that something went
|
||||
wrong. If ERR_PTR(-ENODEV) is returned, that is an indication that the
|
||||
kernel has been built without debugfs support and none of the functions
|
||||
described below will work.
|
||||
|
||||
The most general way to create a file within a debugfs directory is with:
|
||||
|
||||
struct dentry *debugfs_create_file(const char *name, mode_t mode,
|
||||
struct dentry *parent, void *data,
|
||||
const struct file_operations *fops);
|
||||
|
||||
Here, name is the name of the file to create, mode describes the access
|
||||
permissions the file should have, parent indicates the directory which
|
||||
should hold the file, data will be stored in the i_private field of the
|
||||
resulting inode structure, and fops is a set of file operations which
|
||||
implement the file's behavior. At a minimum, the read() and/or write()
|
||||
operations should be provided; others can be included as needed. Again,
|
||||
the return value will be a dentry pointer to the created file, NULL for
|
||||
error, or ERR_PTR(-ENODEV) if debugfs support is missing.
|
||||
|
||||
In a number of cases, the creation of a set of file operations is not
|
||||
actually necessary; the debugfs code provides a number of helper functions
|
||||
for simple situations. Files containing a single integer value can be
|
||||
created with any of:
|
||||
|
||||
struct dentry *debugfs_create_u8(const char *name, mode_t mode,
|
||||
struct dentry *parent, u8 *value);
|
||||
struct dentry *debugfs_create_u16(const char *name, mode_t mode,
|
||||
struct dentry *parent, u16 *value);
|
||||
struct dentry *debugfs_create_u32(const char *name, mode_t mode,
|
||||
struct dentry *parent, u32 *value);
|
||||
struct dentry *debugfs_create_u64(const char *name, mode_t mode,
|
||||
struct dentry *parent, u64 *value);
|
||||
|
||||
These files support both reading and writing the given value; if a specific
|
||||
file should not be written to, simply set the mode bits accordingly. The
|
||||
values in these files are in decimal; if hexadecimal is more appropriate,
|
||||
the following functions can be used instead:
|
||||
|
||||
struct dentry *debugfs_create_x8(const char *name, mode_t mode,
|
||||
struct dentry *parent, u8 *value);
|
||||
struct dentry *debugfs_create_x16(const char *name, mode_t mode,
|
||||
struct dentry *parent, u16 *value);
|
||||
struct dentry *debugfs_create_x32(const char *name, mode_t mode,
|
||||
struct dentry *parent, u32 *value);
|
||||
|
||||
Note that there is no debugfs_create_x64().
|
||||
|
||||
These functions are useful as long as the developer knows the size of the
|
||||
value to be exported. Some types can have different widths on different
|
||||
architectures, though, complicating the situation somewhat. There is a
|
||||
function meant to help out in one special case:
|
||||
|
||||
struct dentry *debugfs_create_size_t(const char *name, mode_t mode,
|
||||
struct dentry *parent,
|
||||
size_t *value);
|
||||
|
||||
As might be expected, this function will create a debugfs file to represent
|
||||
a variable of type size_t.
|
||||
|
||||
Boolean values can be placed in debugfs with:
|
||||
|
||||
struct dentry *debugfs_create_bool(const char *name, mode_t mode,
|
||||
struct dentry *parent, u32 *value);
|
||||
|
||||
A read on the resulting file will yield either Y (for non-zero values) or
|
||||
N, followed by a newline. If written to, it will accept either upper- or
|
||||
lower-case values, or 1 or 0. Any other input will be silently ignored.
|
||||
|
||||
Finally, a block of arbitrary binary data can be exported with:
|
||||
|
||||
struct debugfs_blob_wrapper {
|
||||
void *data;
|
||||
unsigned long size;
|
||||
};
|
||||
|
||||
struct dentry *debugfs_create_blob(const char *name, mode_t mode,
|
||||
struct dentry *parent,
|
||||
struct debugfs_blob_wrapper *blob);
|
||||
|
||||
A read of this file will return the data pointed to by the
|
||||
debugfs_blob_wrapper structure. Some drivers use "blobs" as a simple way
|
||||
to return several lines of (static) formatted text output. This function
|
||||
can be used to export binary information, but there does not appear to be
|
||||
any code which does so in the mainline. Note that all files created with
|
||||
debugfs_create_blob() are read-only.
|
||||
|
||||
There are a couple of other directory-oriented helper functions:
|
||||
|
||||
struct dentry *debugfs_rename(struct dentry *old_dir,
|
||||
struct dentry *old_dentry,
|
||||
struct dentry *new_dir,
|
||||
const char *new_name);
|
||||
|
||||
struct dentry *debugfs_create_symlink(const char *name,
|
||||
struct dentry *parent,
|
||||
const char *target);
|
||||
|
||||
A call to debugfs_rename() will give a new name to an existing debugfs
|
||||
file, possibly in a different directory. The new_name must not exist prior
|
||||
to the call; the return value is old_dentry with updated information.
|
||||
Symbolic links can be created with debugfs_create_symlink().
|
||||
|
||||
There is one important thing that all debugfs users must take into account:
|
||||
there is no automatic cleanup of any directories created in debugfs. If a
|
||||
module is unloaded without explicitly removing debugfs entries, the result
|
||||
will be a lot of stale pointers and no end of highly antisocial behavior.
|
||||
So all debugfs users - at least those which can be built as modules - must
|
||||
be prepared to remove all files and directories they create there. A file
|
||||
can be removed with:
|
||||
|
||||
void debugfs_remove(struct dentry *dentry);
|
||||
|
||||
The dentry value can be NULL, in which case nothing will be removed.
|
||||
|
||||
Once upon a time, debugfs users were required to remember the dentry
|
||||
pointer for every debugfs file they created so that all files could be
|
||||
cleaned up. We live in more civilized times now, though, and debugfs users
|
||||
can call:
|
||||
|
||||
void debugfs_remove_recursive(struct dentry *dentry);
|
||||
|
||||
If this function is passed a pointer for the dentry corresponding to the
|
||||
top-level directory, the entire hierarchy below that directory will be
|
||||
removed.
|
||||
|
||||
Notes:
|
||||
[1] http://lwn.net/Articles/309298/
|
@ -322,7 +322,7 @@ an upper limit on the block size imposed by the page size of the kernel,
|
||||
so 8kB blocks are only allowed on Alpha systems (and other architectures
|
||||
which support larger pages).
|
||||
|
||||
There is an upper limit of 32768 subdirectories in a single directory.
|
||||
There is an upper limit of 32000 subdirectories in a single directory.
|
||||
|
||||
There is a "soft" upper limit of about 10-15k files in a single directory
|
||||
with the current linear linked-list directory implementation. This limit
|
||||
|
@ -235,6 +235,10 @@ minixdf Make 'df' act like Minix.
|
||||
|
||||
debug Extra debugging information is sent to syslog.
|
||||
|
||||
abort Simulate the effects of calling ext4_abort() for
|
||||
debugging purposes. This is normally used while
|
||||
remounting a filesystem which is already mounted.
|
||||
|
||||
errors=remount-ro Remount the filesystem read-only on an error.
|
||||
errors=continue Keep going on a filesystem error.
|
||||
errors=panic Panic and halt the machine if an error occurs.
|
||||
@ -294,7 +298,7 @@ max_batch_time=usec Maximum amount of time ext4 should wait for
|
||||
amount of time (on average) that it takes to
|
||||
finish committing a transaction. Call this time
|
||||
the "commit time". If the time that the
|
||||
transactoin has been running is less than the
|
||||
transaction has been running is less than the
|
||||
commit time, ext4 will try sleeping for the
|
||||
commit time to see if other operations will join
|
||||
the transaction. The commit time is capped by
|
||||
@ -328,7 +332,7 @@ noauto_da_alloc replacing existing files via patterns such as
|
||||
journal commit, in the default data=ordered
|
||||
mode, the data blocks of the new file are forced
|
||||
to disk before the rename() operation is
|
||||
commited. This provides roughly the same level
|
||||
committed. This provides roughly the same level
|
||||
of guarantees as ext3, and avoids the
|
||||
"zero-length" problem that can happen when a
|
||||
system crashes before the delayed allocation
|
||||
@ -358,7 +362,7 @@ written to the journal first, and then to its final location.
|
||||
In the event of a crash, the journal can be replayed, bringing both data and
|
||||
metadata into a consistent state. This mode is the slowest except when data
|
||||
needs to be read from and written to disk at the same time where it
|
||||
outperforms all others modes. Curently ext4 does not have delayed
|
||||
outperforms all others modes. Currently ext4 does not have delayed
|
||||
allocation support if this data journalling mode is selected.
|
||||
|
||||
References
|
||||
|
@ -204,7 +204,7 @@ fiemap_check_flags() helper:
|
||||
|
||||
int fiemap_check_flags(struct fiemap_extent_info *fieinfo, u32 fs_flags);
|
||||
|
||||
The struct fieinfo should be passed in as recieved from ioctl_fiemap(). The
|
||||
The struct fieinfo should be passed in as received from ioctl_fiemap(). The
|
||||
set of fiemap flags which the fs understands should be passed via fs_flags. If
|
||||
fiemap_check_flags finds invalid user flags, it will place the bad values in
|
||||
fieinfo->fi_flags and return -EBADR. If the file system gets -EBADR, from
|
||||
|
@ -60,7 +60,7 @@ go_lock | Called for the first local holder of a lock
|
||||
go_unlock | Called on the final local unlock of a lock
|
||||
go_dump | Called to print content of object for debugfs file, or on
|
||||
| error to dump glock to the log.
|
||||
go_type; | The type of the glock, LM_TYPE_.....
|
||||
go_type | The type of the glock, LM_TYPE_.....
|
||||
go_min_hold_time | The minimum hold time
|
||||
|
||||
The minimum hold time for each lock is the time after a remote lock
|
||||
|
@ -11,18 +11,15 @@ their I/O so file system consistency is maintained. One of the nifty
|
||||
features of GFS is perfect consistency -- changes made to the file system
|
||||
on one machine show up immediately on all other machines in the cluster.
|
||||
|
||||
GFS uses interchangable inter-node locking mechanisms. Different lock
|
||||
modules can plug into GFS and each file system selects the appropriate
|
||||
lock module at mount time. Lock modules include:
|
||||
GFS uses interchangable inter-node locking mechanisms, the currently
|
||||
supported mechanisms are:
|
||||
|
||||
lock_nolock -- allows gfs to be used as a local file system
|
||||
|
||||
lock_dlm -- uses a distributed lock manager (dlm) for inter-node locking
|
||||
The dlm is found at linux/fs/dlm/
|
||||
|
||||
In addition to interfacing with an external locking manager, a gfs lock
|
||||
module is responsible for interacting with external cluster management
|
||||
systems. Lock_dlm depends on user space cluster management systems found
|
||||
Lock_dlm depends on user space cluster management systems found
|
||||
at the URL above.
|
||||
|
||||
To use gfs as a local file system, no external clustering systems are
|
||||
@ -31,13 +28,19 @@ needed, simply:
|
||||
$ mkfs -t gfs2 -p lock_nolock -j 1 /dev/block_device
|
||||
$ mount -t gfs2 /dev/block_device /dir
|
||||
|
||||
GFS2 is not on-disk compatible with previous versions of GFS.
|
||||
If you are using Fedora, you need to install the gfs2-utils package
|
||||
and, for lock_dlm, you will also need to install the cman package
|
||||
and write a cluster.conf as per the documentation.
|
||||
|
||||
GFS2 is not on-disk compatible with previous versions of GFS, but it
|
||||
is pretty close.
|
||||
|
||||
The following man pages can be found at the URL above:
|
||||
gfs2_fsck to repair a filesystem
|
||||
fsck.gfs2 to repair a filesystem
|
||||
gfs2_grow to expand a filesystem online
|
||||
gfs2_jadd to add journals to a filesystem online
|
||||
gfs2_tool to manipulate, examine and tune a filesystem
|
||||
gfs2_quota to examine and change quota values in a filesystem
|
||||
gfs2_convert to convert a gfs filesystem to gfs2 in-place
|
||||
mount.gfs2 to help mount(8) mount a filesystem
|
||||
mkfs.gfs2 to make a filesystem
|
||||
|
@ -23,8 +23,13 @@ Mount options unique to the isofs filesystem.
|
||||
map=off Do not map non-Rock Ridge filenames to lower case
|
||||
map=normal Map non-Rock Ridge filenames to lower case
|
||||
map=acorn As map=normal but also apply Acorn extensions if present
|
||||
mode=xxx Sets the permissions on files to xxx
|
||||
dmode=xxx Sets the permissions on directories to xxx
|
||||
mode=xxx Sets the permissions on files to xxx unless Rock Ridge
|
||||
extensions set the permissions otherwise
|
||||
dmode=xxx Sets the permissions on directories to xxx unless Rock Ridge
|
||||
extensions set the permissions otherwise
|
||||
overriderockperm Set permissions on files and directories according to
|
||||
'mode' and 'dmode' even though Rock Ridge extensions are
|
||||
present.
|
||||
nojoliet Ignore Joliet extensions if they are present.
|
||||
norock Ignore Rock Ridge extensions if they are present.
|
||||
hide Completely strip hidden files from the file system.
|
||||
|
@ -100,7 +100,7 @@ Installation
|
||||
$ sudo cp utils/mount/mount.nfs /sbin/mount.nfs
|
||||
|
||||
In this location, mount.nfs will be invoked automatically for NFS mounts
|
||||
by the system mount commmand.
|
||||
by the system mount command.
|
||||
|
||||
NOTE: mount.nfs and therefore nfs-utils-1.1.2 or greater is only needed
|
||||
on the NFS client machine. You do not need this specific version of
|
||||
|
@ -39,9 +39,8 @@ Features which NILFS2 does not support yet:
|
||||
- extended attributes
|
||||
- POSIX ACLs
|
||||
- quotas
|
||||
- writable snapshots
|
||||
- remote backup (CDP)
|
||||
- data integrity
|
||||
- fsck
|
||||
- resize
|
||||
- defragmentation
|
||||
|
||||
Mount options
|
||||
|
@ -5,11 +5,12 @@
|
||||
Bodo Bauer <bb@ricochet.net>
|
||||
|
||||
2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000
|
||||
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
|
||||
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
|
||||
------------------------------------------------------------------------------
|
||||
Version 1.3 Kernel version 2.2.12
|
||||
Kernel version 2.4.0-test11-pre4
|
||||
------------------------------------------------------------------------------
|
||||
fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009
|
||||
|
||||
Table of Contents
|
||||
-----------------
|
||||
@ -116,7 +117,7 @@ The link self points to the process reading the file system. Each process
|
||||
subdirectory has the entries listed in Table 1-1.
|
||||
|
||||
|
||||
Table 1-1: Process specific entries in /proc
|
||||
Table 1-1: Process specific entries in /proc
|
||||
..............................................................................
|
||||
File Content
|
||||
clear_refs Clears page referenced bits shown in smaps output
|
||||
@ -134,46 +135,103 @@ Table 1-1: Process specific entries in /proc
|
||||
status Process status in human readable form
|
||||
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
|
||||
stack Report full stack trace, enable via CONFIG_STACKTRACE
|
||||
smaps Extension based on maps, the rss size for each mapped file
|
||||
smaps a extension based on maps, showing the memory consumption of
|
||||
each mapping
|
||||
..............................................................................
|
||||
|
||||
For example, to get the status information of a process, all you have to do is
|
||||
read the file /proc/PID/status:
|
||||
|
||||
>cat /proc/self/status
|
||||
Name: cat
|
||||
State: R (running)
|
||||
Pid: 5452
|
||||
PPid: 743
|
||||
>cat /proc/self/status
|
||||
Name: cat
|
||||
State: R (running)
|
||||
Tgid: 5452
|
||||
Pid: 5452
|
||||
PPid: 743
|
||||
TracerPid: 0 (2.4)
|
||||
Uid: 501 501 501 501
|
||||
Gid: 100 100 100 100
|
||||
Groups: 100 14 16
|
||||
VmSize: 1112 kB
|
||||
VmLck: 0 kB
|
||||
VmRSS: 348 kB
|
||||
VmData: 24 kB
|
||||
VmStk: 12 kB
|
||||
VmExe: 8 kB
|
||||
VmLib: 1044 kB
|
||||
SigPnd: 0000000000000000
|
||||
SigBlk: 0000000000000000
|
||||
SigIgn: 0000000000000000
|
||||
SigCgt: 0000000000000000
|
||||
CapInh: 00000000fffffeff
|
||||
CapPrm: 0000000000000000
|
||||
CapEff: 0000000000000000
|
||||
|
||||
Uid: 501 501 501 501
|
||||
Gid: 100 100 100 100
|
||||
FDSize: 256
|
||||
Groups: 100 14 16
|
||||
VmPeak: 5004 kB
|
||||
VmSize: 5004 kB
|
||||
VmLck: 0 kB
|
||||
VmHWM: 476 kB
|
||||
VmRSS: 476 kB
|
||||
VmData: 156 kB
|
||||
VmStk: 88 kB
|
||||
VmExe: 68 kB
|
||||
VmLib: 1412 kB
|
||||
VmPTE: 20 kb
|
||||
Threads: 1
|
||||
SigQ: 0/28578
|
||||
SigPnd: 0000000000000000
|
||||
ShdPnd: 0000000000000000
|
||||
SigBlk: 0000000000000000
|
||||
SigIgn: 0000000000000000
|
||||
SigCgt: 0000000000000000
|
||||
CapInh: 00000000fffffeff
|
||||
CapPrm: 0000000000000000
|
||||
CapEff: 0000000000000000
|
||||
CapBnd: ffffffffffffffff
|
||||
voluntary_ctxt_switches: 0
|
||||
nonvoluntary_ctxt_switches: 1
|
||||
|
||||
This shows you nearly the same information you would get if you viewed it with
|
||||
the ps command. In fact, ps uses the proc file system to obtain its
|
||||
information. The statm file contains more detailed information about the
|
||||
process memory usage. Its seven fields are explained in Table 1-2. The stat
|
||||
file contains details information about the process itself. Its fields are
|
||||
explained in Table 1-3.
|
||||
information. But you get a more detailed view of the process by reading the
|
||||
file /proc/PID/status. It fields are described in table 1-2.
|
||||
|
||||
The statm file contains more detailed information about the process
|
||||
memory usage. Its seven fields are explained in Table 1-3. The stat file
|
||||
contains details information about the process itself. Its fields are
|
||||
explained in Table 1-4.
|
||||
|
||||
Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
|
||||
Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
|
||||
..............................................................................
|
||||
Field Content
|
||||
Name filename of the executable
|
||||
State state (R is running, S is sleeping, D is sleeping
|
||||
in an uninterruptible wait, Z is zombie,
|
||||
T is traced or stopped)
|
||||
Tgid thread group ID
|
||||
Pid process id
|
||||
PPid process id of the parent process
|
||||
TracerPid PID of process tracing this process (0 if not)
|
||||
Uid Real, effective, saved set, and file system UIDs
|
||||
Gid Real, effective, saved set, and file system GIDs
|
||||
FDSize number of file descriptor slots currently allocated
|
||||
Groups supplementary group list
|
||||
VmPeak peak virtual memory size
|
||||
VmSize total program size
|
||||
VmLck locked memory size
|
||||
VmHWM peak resident set size ("high water mark")
|
||||
VmRSS size of memory portions
|
||||
VmData size of data, stack, and text segments
|
||||
VmStk size of data, stack, and text segments
|
||||
VmExe size of text segment
|
||||
VmLib size of shared library code
|
||||
VmPTE size of page table entries
|
||||
Threads number of threads
|
||||
SigQ number of signals queued/max. number for queue
|
||||
SigPnd bitmap of pending signals for the thread
|
||||
ShdPnd bitmap of shared pending signals for the process
|
||||
SigBlk bitmap of blocked signals
|
||||
SigIgn bitmap of ignored signals
|
||||
SigCgt bitmap of catched signals
|
||||
CapInh bitmap of inheritable capabilities
|
||||
CapPrm bitmap of permitted capabilities
|
||||
CapEff bitmap of effective capabilities
|
||||
CapBnd bitmap of capabilities bounding set
|
||||
Cpus_allowed mask of CPUs on which this process may run
|
||||
Cpus_allowed_list Same as previous, but in "list format"
|
||||
Mems_allowed mask of memory nodes allowed to this process
|
||||
Mems_allowed_list Same as previous, but in "list format"
|
||||
voluntary_ctxt_switches number of voluntary context switches
|
||||
nonvoluntary_ctxt_switches number of non voluntary context switches
|
||||
..............................................................................
|
||||
|
||||
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
|
||||
..............................................................................
|
||||
Field Content
|
||||
size total program size (pages) (same as VmSize in status)
|
||||
@ -188,7 +246,7 @@ Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
|
||||
..............................................................................
|
||||
|
||||
|
||||
Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
|
||||
..............................................................................
|
||||
Field Content
|
||||
pid process id
|
||||
@ -222,10 +280,10 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
start_stack address of the start of the stack
|
||||
esp current value of ESP
|
||||
eip current value of EIP
|
||||
pending bitmap of pending signals (obsolete)
|
||||
blocked bitmap of blocked signals (obsolete)
|
||||
sigign bitmap of ignored signals (obsolete)
|
||||
sigcatch bitmap of catched signals (obsolete)
|
||||
pending bitmap of pending signals
|
||||
blocked bitmap of blocked signals
|
||||
sigign bitmap of ignored signals
|
||||
sigcatch bitmap of catched signals
|
||||
wchan address where process went to sleep
|
||||
0 (place holder)
|
||||
0 (place holder)
|
||||
@ -234,19 +292,99 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
rt_priority realtime priority
|
||||
policy scheduling policy (man sched_setscheduler)
|
||||
blkio_ticks time spent waiting for block IO
|
||||
gtime guest time of the task in jiffies
|
||||
cgtime guest time of the task children in jiffies
|
||||
..............................................................................
|
||||
|
||||
The /proc/PID/map file containing the currently mapped memory regions and
|
||||
their access permissions.
|
||||
|
||||
The format is:
|
||||
|
||||
address perms offset dev inode pathname
|
||||
|
||||
08048000-08049000 r-xp 00000000 03:00 8312 /opt/test
|
||||
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
|
||||
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
|
||||
a7cb1000-a7cb2000 ---p 00000000 00:00 0
|
||||
a7cb2000-a7eb2000 rw-p 00000000 00:00 0
|
||||
a7eb2000-a7eb3000 ---p 00000000 00:00 0
|
||||
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
|
||||
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
|
||||
a8008000-a800a000 r--p 00133000 03:00 4222 /lib/libc.so.6
|
||||
a800a000-a800b000 rw-p 00135000 03:00 4222 /lib/libc.so.6
|
||||
a800b000-a800e000 rw-p 00000000 00:00 0
|
||||
a800e000-a8022000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
|
||||
a8022000-a8023000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
|
||||
a8023000-a8024000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
|
||||
a8024000-a8027000 rw-p 00000000 00:00 0
|
||||
a8027000-a8043000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
|
||||
a8043000-a8044000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
|
||||
a8044000-a8045000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
|
||||
aff35000-aff4a000 rw-p 00000000 00:00 0 [stack]
|
||||
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
|
||||
|
||||
where "address" is the address space in the process that it occupies, "perms"
|
||||
is a set of permissions:
|
||||
|
||||
r = read
|
||||
w = write
|
||||
x = execute
|
||||
s = shared
|
||||
p = private (copy on write)
|
||||
|
||||
"offset" is the offset into the mapping, "dev" is the device (major:minor), and
|
||||
"inode" is the inode on that device. 0 indicates that no inode is associated
|
||||
with the memory region, as the case would be with BSS (uninitialized data).
|
||||
The "pathname" shows the name associated file for this mapping. If the mapping
|
||||
is not associated with a file:
|
||||
|
||||
[heap] = the heap of the program
|
||||
[stack] = the stack of the main process
|
||||
[vdso] = the "virtual dynamic shared object",
|
||||
the kernel system call handler
|
||||
|
||||
or if empty, the mapping is anonymous.
|
||||
|
||||
|
||||
The /proc/PID/smaps is an extension based on maps, showing the memory
|
||||
consumption for each of the process's mappings. For each of mappings there
|
||||
is a series of lines such as the following:
|
||||
|
||||
08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
|
||||
Size: 1084 kB
|
||||
Rss: 892 kB
|
||||
Pss: 374 kB
|
||||
Shared_Clean: 892 kB
|
||||
Shared_Dirty: 0 kB
|
||||
Private_Clean: 0 kB
|
||||
Private_Dirty: 0 kB
|
||||
Referenced: 892 kB
|
||||
Swap: 0 kB
|
||||
KernelPageSize: 4 kB
|
||||
MMUPageSize: 4 kB
|
||||
|
||||
The first of these lines shows the same information as is displayed for the
|
||||
mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
|
||||
the amount of the mapping that is currently resident in RAM, the "proportional
|
||||
set size” (divide each shared page by the number of processes sharing it), the
|
||||
number of clean and dirty shared pages in the mapping, and the number of clean
|
||||
and dirty private pages in the mapping. The "Referenced" indicates the amount
|
||||
of memory currently marked as referenced or accessed.
|
||||
|
||||
This file is only present if the CONFIG_MMU kernel configuration option is
|
||||
enabled.
|
||||
|
||||
1.2 Kernel data
|
||||
---------------
|
||||
|
||||
Similar to the process entries, the kernel data files give information about
|
||||
the running kernel. The files used to obtain this information are contained in
|
||||
/proc and are listed in Table 1-4. Not all of these will be present in your
|
||||
/proc and are listed in Table 1-5. Not all of these will be present in your
|
||||
system. It depends on the kernel configuration and the loaded modules, which
|
||||
files are there, and which are missing.
|
||||
|
||||
Table 1-4: Kernel info in /proc
|
||||
Table 1-5: Kernel info in /proc
|
||||
..............................................................................
|
||||
File Content
|
||||
apm Advanced power management info
|
||||
@ -283,6 +421,7 @@ Table 1-4: Kernel info in /proc
|
||||
rtc Real time clock
|
||||
scsi SCSI info (see text)
|
||||
slabinfo Slab pool info
|
||||
softirqs softirq usage
|
||||
stat Overall statistics
|
||||
swaps Swap space utilization
|
||||
sys See chapter 2
|
||||
@ -366,7 +505,7 @@ just those considered 'most important'. The new vectors are:
|
||||
RES, CAL, TLB -- rescheduling, call and TLB flush interrupts are
|
||||
sent from one CPU to another per the needs of the OS. Typically,
|
||||
their statistics are used by kernel developers and interested users to
|
||||
determine the occurance of interrupt of the given type.
|
||||
determine the occurrence of interrupts of the given type.
|
||||
|
||||
The above IRQ vectors are displayed only when relevent. For example,
|
||||
the threshold vector does not exist on x86_64 platforms. Others are
|
||||
@ -551,7 +690,7 @@ Committed_AS: The amount of memory presently allocated on the system.
|
||||
memory once that memory has been successfully allocated.
|
||||
VmallocTotal: total size of vmalloc memory area
|
||||
VmallocUsed: amount of vmalloc area which is used
|
||||
VmallocChunk: largest contigious block of vmalloc area which is free
|
||||
VmallocChunk: largest contiguous block of vmalloc area which is free
|
||||
|
||||
..............................................................................
|
||||
|
||||
@ -597,6 +736,25 @@ on the kind of area :
|
||||
0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ...
|
||||
pages=10 vmalloc N0=10
|
||||
|
||||
..............................................................................
|
||||
|
||||
softirqs:
|
||||
|
||||
Provides counts of softirq handlers serviced since boot time, for each cpu.
|
||||
|
||||
> cat /proc/softirqs
|
||||
CPU0 CPU1 CPU2 CPU3
|
||||
HI: 0 0 0 0
|
||||
TIMER: 27166 27120 27097 27034
|
||||
NET_TX: 0 0 0 17
|
||||
NET_RX: 42 0 0 39
|
||||
BLOCK: 0 0 107 1121
|
||||
TASKLET: 0 0 0 290
|
||||
SCHED: 27035 26983 26971 26746
|
||||
HRTIMER: 0 0 0 0
|
||||
RCU: 1678 1769 2178 2250
|
||||
|
||||
|
||||
1.3 IDE devices in /proc/ide
|
||||
----------------------------
|
||||
|
||||
@ -614,10 +772,10 @@ IDE devices:
|
||||
|
||||
More detailed information can be found in the controller specific
|
||||
subdirectories. These are named ide0, ide1 and so on. Each of these
|
||||
directories contains the files shown in table 1-5.
|
||||
directories contains the files shown in table 1-6.
|
||||
|
||||
|
||||
Table 1-5: IDE controller info in /proc/ide/ide?
|
||||
Table 1-6: IDE controller info in /proc/ide/ide?
|
||||
..............................................................................
|
||||
File Content
|
||||
channel IDE channel (0 or 1)
|
||||
@ -627,11 +785,11 @@ Table 1-5: IDE controller info in /proc/ide/ide?
|
||||
..............................................................................
|
||||
|
||||
Each device connected to a controller has a separate subdirectory in the
|
||||
controllers directory. The files listed in table 1-6 are contained in these
|
||||
controllers directory. The files listed in table 1-7 are contained in these
|
||||
directories.
|
||||
|
||||
|
||||
Table 1-6: IDE device information
|
||||
Table 1-7: IDE device information
|
||||
..............................................................................
|
||||
File Content
|
||||
cache The cache
|
||||
@ -673,12 +831,12 @@ the drive parameters:
|
||||
1.4 Networking info in /proc/net
|
||||
--------------------------------
|
||||
|
||||
The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the
|
||||
The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the
|
||||
additional values you get for IP version 6 if you configure the kernel to
|
||||
support this. Table 1-7 lists the files and their meaning.
|
||||
support this. Table 1-9 lists the files and their meaning.
|
||||
|
||||
|
||||
Table 1-6: IPv6 info in /proc/net
|
||||
Table 1-8: IPv6 info in /proc/net
|
||||
..............................................................................
|
||||
File Content
|
||||
udp6 UDP sockets (IPv6)
|
||||
@ -693,7 +851,7 @@ Table 1-6: IPv6 info in /proc/net
|
||||
..............................................................................
|
||||
|
||||
|
||||
Table 1-7: Network info in /proc/net
|
||||
Table 1-9: Network info in /proc/net
|
||||
..............................................................................
|
||||
File Content
|
||||
arp Kernel ARP table
|
||||
@ -817,10 +975,10 @@ The directory /proc/parport contains information about the parallel ports of
|
||||
your system. It has one subdirectory for each port, named after the port
|
||||
number (0,1,2,...).
|
||||
|
||||
These directories contain the four files shown in Table 1-8.
|
||||
These directories contain the four files shown in Table 1-10.
|
||||
|
||||
|
||||
Table 1-8: Files in /proc/parport
|
||||
Table 1-10: Files in /proc/parport
|
||||
..............................................................................
|
||||
File Content
|
||||
autoprobe Any IEEE-1284 device ID information that has been acquired.
|
||||
@ -838,10 +996,10 @@ Table 1-8: Files in /proc/parport
|
||||
|
||||
Information about the available and actually used tty's can be found in the
|
||||
directory /proc/tty.You'll find entries for drivers and line disciplines in
|
||||
this directory, as shown in Table 1-9.
|
||||
this directory, as shown in Table 1-11.
|
||||
|
||||
|
||||
Table 1-9: Files in /proc/tty
|
||||
Table 1-11: Files in /proc/tty
|
||||
..............................................................................
|
||||
File Content
|
||||
drivers list of drivers and their usage
|
||||
@ -883,6 +1041,7 @@ since the system first booted. For a quick look, simply cat the file:
|
||||
processes 2915
|
||||
procs_running 1
|
||||
procs_blocked 0
|
||||
softirq 183433 0 21755 12 39 1137 231 21459 2263
|
||||
|
||||
The very first "cpu" line aggregates the numbers in all of the other "cpuN"
|
||||
lines. These numbers identify the amount of time the CPU has spent performing
|
||||
@ -918,6 +1077,11 @@ CPUs.
|
||||
The "procs_blocked" line gives the number of processes currently blocked,
|
||||
waiting for I/O to complete.
|
||||
|
||||
The "softirq" line gives counts of softirqs serviced since boot time, for each
|
||||
of the possible system softirqs. The first column is the total of all
|
||||
softirqs serviced; each subsequent column is the total for that particular
|
||||
softirq.
|
||||
|
||||
|
||||
1.9 Ext4 file system parameters
|
||||
------------------------------
|
||||
@ -926,9 +1090,9 @@ Information about mounted ext4 file systems can be found in
|
||||
/proc/fs/ext4. Each mounted filesystem will have a directory in
|
||||
/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
|
||||
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
|
||||
in Table 1-10, below.
|
||||
in Table 1-12, below.
|
||||
|
||||
Table 1-10: Files in /proc/fs/ext4/<devname>
|
||||
Table 1-12: Files in /proc/fs/ext4/<devname>
|
||||
..............................................................................
|
||||
File Content
|
||||
mb_groups details of multiblock allocator buddy cache of free blocks
|
||||
@ -1003,11 +1167,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS
|
||||
3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
|
||||
------------------------------------------------------
|
||||
|
||||
This file can be used to adjust the score used to select which processes
|
||||
should be killed in an out-of-memory situation. Giving it a high score will
|
||||
increase the likelihood of this process being killed by the oom-killer. Valid
|
||||
values are in the range -16 to +15, plus the special value -17, which disables
|
||||
oom-killing altogether for this process.
|
||||
This file can be used to adjust the score used to select which processes should
|
||||
be killed in an out-of-memory situation. The oom_adj value is a characteristic
|
||||
of the task's mm, so all threads that share an mm with pid will have the same
|
||||
oom_adj value. A high value will increase the likelihood of this process being
|
||||
killed by the oom-killer. Valid values are in the range -16 to +15 as
|
||||
explained below and a special value of -17, which disables oom-killing
|
||||
altogether for threads sharing pid's mm.
|
||||
|
||||
The process to be killed in an out-of-memory situation is selected among all others
|
||||
based on its badness score. This value equals the original memory size of the process
|
||||
@ -1021,6 +1187,9 @@ the parent's score if they do not share the same memory. Thus forking servers
|
||||
are the prime candidates to be killed. Having only one 'hungry' child will make
|
||||
parent less preferable than the child.
|
||||
|
||||
/proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
|
||||
oom-killing already.
|
||||
|
||||
/proc/<pid>/oom_score shows process' current badness score.
|
||||
|
||||
The following heuristics are then applied:
|
||||
|
@ -72,7 +72,7 @@ The 'rom' file is special in that it provides read-only access to the device's
|
||||
ROM file, if available. It's disabled by default, however, so applications
|
||||
should write the string "1" to the file to enable it before attempting a read
|
||||
call, and disable it following the access by writing "0" to the file. Note
|
||||
that the device must be enabled for a rom read to return data succesfully.
|
||||
that the device must be enabled for a rom read to return data successfully.
|
||||
In the event a driver is not bound to the device, it can be enabled using the
|
||||
'enable' file, documented above.
|
||||
|
||||
|
@ -124,14 +124,19 @@ sys_immutable -- If set, ATTR_SYS attribute on FAT is handled as
|
||||
flush -- If set, the filesystem will try to flush to disk more
|
||||
early than normal. Not set by default.
|
||||
|
||||
rodir -- FAT has the ATTR_RO (read-only) attribute. But on Windows,
|
||||
the ATTR_RO of the directory will be just ignored actually,
|
||||
and is used by only applications as flag. E.g. it's setted
|
||||
for the customized folder.
|
||||
rodir -- FAT has the ATTR_RO (read-only) attribute. On Windows,
|
||||
the ATTR_RO of the directory will just be ignored,
|
||||
and is used only by applications as a flag (e.g. it's set
|
||||
for the customized folder).
|
||||
|
||||
If you want to use ATTR_RO as read-only flag even for
|
||||
the directory, set this option.
|
||||
|
||||
errors=panic|continue|remount-ro
|
||||
-- specify FAT behavior on critical errors: panic, continue
|
||||
without doing anything or remount the partition in
|
||||
read-only mode (default behavior).
|
||||
|
||||
<bool>: 0,1,yes,no,true,false
|
||||
|
||||
TODO
|
||||
|
@ -77,7 +77,8 @@
|
||||
seconds for the whole load operation.
|
||||
|
||||
- request_firmware_nowait() is also provided for convenience in
|
||||
non-user contexts.
|
||||
user contexts to request firmware asynchronously, but can't be called
|
||||
in atomic contexts.
|
||||
|
||||
|
||||
about in-kernel persistence:
|
||||
|
246
Documentation/gcov.txt
Normal file
246
Documentation/gcov.txt
Normal file
@ -0,0 +1,246 @@
|
||||
Using gcov with the Linux kernel
|
||||
================================
|
||||
|
||||
1. Introduction
|
||||
2. Preparation
|
||||
3. Customization
|
||||
4. Files
|
||||
5. Modules
|
||||
6. Separated build and test machines
|
||||
7. Troubleshooting
|
||||
Appendix A: sample script: gather_on_build.sh
|
||||
Appendix B: sample script: gather_on_test.sh
|
||||
|
||||
|
||||
1. Introduction
|
||||
===============
|
||||
|
||||
gcov profiling kernel support enables the use of GCC's coverage testing
|
||||
tool gcov [1] with the Linux kernel. Coverage data of a running kernel
|
||||
is exported in gcov-compatible format via the "gcov" debugfs directory.
|
||||
To get coverage data for a specific file, change to the kernel build
|
||||
directory and use gcov with the -o option as follows (requires root):
|
||||
|
||||
# cd /tmp/linux-out
|
||||
# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
|
||||
|
||||
This will create source code files annotated with execution counts
|
||||
in the current directory. In addition, graphical gcov front-ends such
|
||||
as lcov [2] can be used to automate the process of collecting data
|
||||
for the entire kernel and provide coverage overviews in HTML format.
|
||||
|
||||
Possible uses:
|
||||
|
||||
* debugging (has this line been reached at all?)
|
||||
* test improvement (how do I change my test to cover these lines?)
|
||||
* minimizing kernel configurations (do I need this option if the
|
||||
associated code is never run?)
|
||||
|
||||
--
|
||||
|
||||
[1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html
|
||||
[2] http://ltp.sourceforge.net/coverage/lcov.php
|
||||
|
||||
|
||||
2. Preparation
|
||||
==============
|
||||
|
||||
Configure the kernel with:
|
||||
|
||||
CONFIG_DEBUGFS=y
|
||||
CONFIG_GCOV_KERNEL=y
|
||||
|
||||
and to get coverage data for the entire kernel:
|
||||
|
||||
CONFIG_GCOV_PROFILE_ALL=y
|
||||
|
||||
Note that kernels compiled with profiling flags will be significantly
|
||||
larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported
|
||||
on all architectures.
|
||||
|
||||
Profiling data will only become accessible once debugfs has been
|
||||
mounted:
|
||||
|
||||
mount -t debugfs none /sys/kernel/debug
|
||||
|
||||
|
||||
3. Customization
|
||||
================
|
||||
|
||||
To enable profiling for specific files or directories, add a line
|
||||
similar to the following to the respective kernel Makefile:
|
||||
|
||||
For a single file (e.g. main.o):
|
||||
GCOV_PROFILE_main.o := y
|
||||
|
||||
For all files in one directory:
|
||||
GCOV_PROFILE := y
|
||||
|
||||
To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL
|
||||
is specified, use:
|
||||
|
||||
GCOV_PROFILE_main.o := n
|
||||
and:
|
||||
GCOV_PROFILE := n
|
||||
|
||||
Only files which are linked to the main kernel image or are compiled as
|
||||
kernel modules are supported by this mechanism.
|
||||
|
||||
|
||||
4. Files
|
||||
========
|
||||
|
||||
The gcov kernel support creates the following files in debugfs:
|
||||
|
||||
/sys/kernel/debug/gcov
|
||||
Parent directory for all gcov-related files.
|
||||
|
||||
/sys/kernel/debug/gcov/reset
|
||||
Global reset file: resets all coverage data to zero when
|
||||
written to.
|
||||
|
||||
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda
|
||||
The actual gcov data file as understood by the gcov
|
||||
tool. Resets file coverage data to zero when written to.
|
||||
|
||||
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno
|
||||
Symbolic link to a static data file required by the gcov
|
||||
tool. This file is generated by gcc when compiling with
|
||||
option -ftest-coverage.
|
||||
|
||||
|
||||
5. Modules
|
||||
==========
|
||||
|
||||
Kernel modules may contain cleanup code which is only run during
|
||||
module unload time. The gcov mechanism provides a means to collect
|
||||
coverage data for such code by keeping a copy of the data associated
|
||||
with the unloaded module. This data remains available through debugfs.
|
||||
Once the module is loaded again, the associated coverage counters are
|
||||
initialized with the data from its previous instantiation.
|
||||
|
||||
This behavior can be deactivated by specifying the gcov_persist kernel
|
||||
parameter:
|
||||
|
||||
gcov_persist=0
|
||||
|
||||
At run-time, a user can also choose to discard data for an unloaded
|
||||
module by writing to its data file or the global reset file.
|
||||
|
||||
|
||||
6. Separated build and test machines
|
||||
====================================
|
||||
|
||||
The gcov kernel profiling infrastructure is designed to work out-of-the
|
||||
box for setups where kernels are built and run on the same machine. In
|
||||
cases where the kernel runs on a separate machine, special preparations
|
||||
must be made, depending on where the gcov tool is used:
|
||||
|
||||
a) gcov is run on the TEST machine
|
||||
|
||||
The gcov tool version on the test machine must be compatible with the
|
||||
gcc version used for kernel build. Also the following files need to be
|
||||
copied from build to test machine:
|
||||
|
||||
from the source tree:
|
||||
- all C source files + headers
|
||||
|
||||
from the build tree:
|
||||
- all C source files + headers
|
||||
- all .gcda and .gcno files
|
||||
- all links to directories
|
||||
|
||||
It is important to note that these files need to be placed into the
|
||||
exact same file system location on the test machine as on the build
|
||||
machine. If any of the path components is symbolic link, the actual
|
||||
directory needs to be used instead (due to make's CURDIR handling).
|
||||
|
||||
b) gcov is run on the BUILD machine
|
||||
|
||||
The following files need to be copied after each test case from test
|
||||
to build machine:
|
||||
|
||||
from the gcov directory in sysfs:
|
||||
- all .gcda files
|
||||
- all links to .gcno files
|
||||
|
||||
These files can be copied to any location on the build machine. gcov
|
||||
must then be called with the -o option pointing to that directory.
|
||||
|
||||
Example directory setup on the build machine:
|
||||
|
||||
/tmp/linux: kernel source tree
|
||||
/tmp/out: kernel build directory as specified by make O=
|
||||
/tmp/coverage: location of the files copied from the test machine
|
||||
|
||||
[user@build] cd /tmp/out
|
||||
[user@build] gcov -o /tmp/coverage/tmp/out/init main.c
|
||||
|
||||
|
||||
7. Troubleshooting
|
||||
==================
|
||||
|
||||
Problem: Compilation aborts during linker step.
|
||||
Cause: Profiling flags are specified for source files which are not
|
||||
linked to the main kernel or which are linked by a custom
|
||||
linker procedure.
|
||||
Solution: Exclude affected source files from profiling by specifying
|
||||
GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the
|
||||
corresponding Makefile.
|
||||
|
||||
|
||||
Appendix A: gather_on_build.sh
|
||||
==============================
|
||||
|
||||
Sample script to gather coverage meta files on the build machine
|
||||
(see 6a):
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
KSRC=$1
|
||||
KOBJ=$2
|
||||
DEST=$3
|
||||
|
||||
if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then
|
||||
echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
|
||||
KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
|
||||
|
||||
find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \
|
||||
-perm /u+r,g+r | tar cfz $DEST -P -T -
|
||||
|
||||
if [ $? -eq 0 ] ; then
|
||||
echo "$DEST successfully created, copy to test system and unpack with:"
|
||||
echo " tar xfz $DEST -P"
|
||||
else
|
||||
echo "Could not create file $DEST"
|
||||
fi
|
||||
|
||||
|
||||
Appendix B: gather_on_test.sh
|
||||
=============================
|
||||
|
||||
Sample script to gather coverage data files on the test machine
|
||||
(see 6b):
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
DEST=$1
|
||||
GCDA=/sys/kernel/debug/gcov
|
||||
|
||||
if [ -z "$DEST" ] ; then
|
||||
echo "Usage: $0 <output.tar.gz>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
find $GCDA -name '*.gcno' -o -name '*.gcda' | tar cfz $DEST -T -
|
||||
|
||||
if [ $? -eq 0 ] ; then
|
||||
echo "$DEST successfully created, copy to build system and unpack with:"
|
||||
echo " tar xfz $DEST"
|
||||
else
|
||||
echo "Could not create file $DEST"
|
||||
fi
|
@ -458,7 +458,7 @@ debugfs interface, since it provides control over GPIO direction and
|
||||
value instead of just showing a gpio state summary. Plus, it could be
|
||||
present on production systems without debugging support.
|
||||
|
||||
Given approprate hardware documentation for the system, userspace could
|
||||
Given appropriate hardware documentation for the system, userspace could
|
||||
know for example that GPIO #23 controls the write protect line used to
|
||||
protect boot loader segments in flash memory. System upgrade procedures
|
||||
may need to temporarily remove that protection, first importing a GPIO,
|
||||
|
@ -2,14 +2,18 @@ Kernel driver f71882fg
|
||||
======================
|
||||
|
||||
Supported chips:
|
||||
* Fintek F71882FG and F71883FG
|
||||
Prefix: 'f71882fg'
|
||||
* Fintek F71858FG
|
||||
Prefix: 'f71858fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F71862FG and F71863FG
|
||||
Prefix: 'f71862fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F71882FG and F71883FG
|
||||
Prefix: 'f71882fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F8000
|
||||
Prefix: 'f8000'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
@ -66,13 +70,13 @@ printed when loading the driver.
|
||||
|
||||
Three different fan control modes are supported; the mode number is written
|
||||
to the pwm#_enable file. Note that not all modes are supported on all
|
||||
chips, and some modes may only be available in RPM / PWM mode on the F8000.
|
||||
chips, and some modes may only be available in RPM / PWM mode.
|
||||
Writing an unsupported mode will result in an invalid parameter error.
|
||||
|
||||
* 1: Manual mode
|
||||
You ask for a specific PWM duty cycle / DC voltage or a specific % of
|
||||
fan#_full_speed by writing to the pwm# file. This mode is only
|
||||
available on the F8000 if the fan channel is in RPM mode.
|
||||
available on the F71858FG / F8000 if the fan channel is in RPM mode.
|
||||
|
||||
* 2: Normal auto mode
|
||||
You can define a number of temperature/fan speed trip points, which % the
|
||||
|
@ -7,7 +7,7 @@ henceforth as AEM.
|
||||
Supported systems:
|
||||
* Any recent IBM System X server with AEM support.
|
||||
This includes the x3350, x3550, x3650, x3655, x3755, x3850 M2,
|
||||
x3950 M2, and certain HS2x/LS2x/QS2x blades. The IPMI host interface
|
||||
x3950 M2, and certain HC10/HS2x/LS2x/QS2x blades. The IPMI host interface
|
||||
driver ("ipmi-si") needs to be loaded for this driver to do anything.
|
||||
Prefix: 'ibmaem'
|
||||
Datasheet: Not available
|
||||
|
@ -70,6 +70,7 @@ are interpreted as 0! For more on how written strings are interpreted see the
|
||||
[0-*] denotes any positive number starting from 0
|
||||
[1-*] denotes any positive number starting from 1
|
||||
RO read only value
|
||||
WO write only value
|
||||
RW read/write value
|
||||
|
||||
Read/write values may be read-only for some chips, depending on the
|
||||
@ -295,6 +296,24 @@ temp[1-*]_label Suggested temperature channel label.
|
||||
user-space.
|
||||
RO
|
||||
|
||||
temp[1-*]_lowest
|
||||
Historical minimum temperature
|
||||
Unit: millidegree Celsius
|
||||
RO
|
||||
|
||||
temp[1-*]_highest
|
||||
Historical maximum temperature
|
||||
Unit: millidegree Celsius
|
||||
RO
|
||||
|
||||
temp[1-*]_reset_history
|
||||
Reset temp_lowest and temp_highest
|
||||
WO
|
||||
|
||||
temp_reset_history
|
||||
Reset temp_lowest and temp_highest for all sensors
|
||||
WO
|
||||
|
||||
Some chips measure temperature using external thermistors and an ADC, and
|
||||
report the temperature measurement as a voltage. Converting this voltage
|
||||
back to a temperature (or the other way around for limits) requires
|
||||
|
42
Documentation/hwmon/tmp401
Normal file
42
Documentation/hwmon/tmp401
Normal file
@ -0,0 +1,42 @@
|
||||
Kernel driver tmp401
|
||||
====================
|
||||
|
||||
Supported chips:
|
||||
* Texas Instruments TMP401
|
||||
Prefix: 'tmp401'
|
||||
Addresses scanned: I2C 0x4c
|
||||
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html
|
||||
* Texas Instruments TMP411
|
||||
Prefix: 'tmp411'
|
||||
Addresses scanned: I2C 0x4c
|
||||
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html
|
||||
|
||||
Authors:
|
||||
Hans de Goede <hdegoede@redhat.com>
|
||||
Andre Prendel <andre.prendel@gmx.de>
|
||||
|
||||
Description
|
||||
-----------
|
||||
|
||||
This driver implements support for Texas Instruments TMP401 and
|
||||
TMP411 chips. These chips implements one remote and one local
|
||||
temperature sensor. Temperature is measured in degrees
|
||||
Celsius. Resolution of the remote sensor is 0.0625 degree. Local
|
||||
sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not
|
||||
supported by the driver so far, so using the default resolution of 0.5
|
||||
degree).
|
||||
|
||||
The driver provides the common sysfs-interface for temperatures (see
|
||||
/Documentation/hwmon/sysfs-interface under Temperatures).
|
||||
|
||||
The TMP411 chip is compatible with TMP401. It provides some additional
|
||||
features.
|
||||
|
||||
* Minimum and Maximum temperature measured since power-on, chip-reset
|
||||
|
||||
Exported via sysfs attributes tempX_lowest and tempX_highest.
|
||||
|
||||
* Reset of historical minimum/maximum temperature measurements
|
||||
|
||||
Exported via sysfs attribute temp_reset_history. Writing 1 to this
|
||||
file triggers a reset.
|
@ -12,6 +12,10 @@ Supported chips:
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
Datasheet:
|
||||
http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf
|
||||
* Winbond W83627DHG-P
|
||||
Prefix: 'w83627dhg'
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
Datasheet: not available
|
||||
* Winbond W83667HG
|
||||
Prefix: 'w83667hg'
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
@ -28,8 +32,8 @@ Description
|
||||
-----------
|
||||
|
||||
This driver implements support for the Winbond W83627EHF, W83627EHG,
|
||||
W83627DHG and W83667HG super I/O chips. We will refer to them collectively
|
||||
as Winbond chips.
|
||||
W83627DHG, W83627DHG-P and W83667HG super I/O chips. We will refer to them
|
||||
collectively as Winbond chips.
|
||||
|
||||
The chips implement three temperature sensors, five fan rotation
|
||||
speed sensors, ten analog voltage sensors (only nine for the 627DHG), one
|
||||
@ -135,3 +139,6 @@ done in the driver for all register addresses.
|
||||
The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and
|
||||
the ICH8 southbridge gets that data via PECI from the DHG, so that the
|
||||
southbridge drives the fans. And the DHG supports SST, a one-wire serial bus.
|
||||
|
||||
The DHG-P has an additional automatic fan speed control mode named Smart Fan
|
||||
(TM) III+. This mode is not yet supported by the driver.
|
||||
|
@ -20,6 +20,8 @@ platform_device with the base address and interrupt number. The
|
||||
dev.platform_data of the device should also point to a struct
|
||||
ocores_i2c_platform_data (see linux/i2c-ocores.h) describing the
|
||||
distance between registers and the input clock speed.
|
||||
There is also a possibility to attach a list of i2c_board_info which
|
||||
the i2c-ocores driver will add to the bus upon creation.
|
||||
|
||||
E.G. something like:
|
||||
|
||||
@ -36,9 +38,24 @@ static struct resource ocores_resources[] = {
|
||||
},
|
||||
};
|
||||
|
||||
/* optional board info */
|
||||
struct i2c_board_info ocores_i2c_board_info[] = {
|
||||
{
|
||||
I2C_BOARD_INFO("tsc2003", 0x48),
|
||||
.platform_data = &tsc2003_platform_data,
|
||||
.irq = TSC_IRQ
|
||||
},
|
||||
{
|
||||
I2C_BOARD_INFO("adv7180", 0x42 >> 1),
|
||||
.irq = ADV_IRQ
|
||||
}
|
||||
};
|
||||
|
||||
static struct ocores_i2c_platform_data myi2c_data = {
|
||||
.regstep = 2, /* two bytes between registers */
|
||||
.clock_khz = 50000, /* input clock of 50MHz */
|
||||
.devices = ocores_i2c_board_info, /* optional table of devices */
|
||||
.num_devices = ARRAY_SIZE(ocores_i2c_board_info), /* table size */
|
||||
};
|
||||
|
||||
static struct platform_device myi2c = {
|
||||
|
@ -19,6 +19,9 @@ Supported adapters:
|
||||
* VIA Technologies, Inc. VX800/VX820
|
||||
Datasheet: available on http://linux.via.com.tw
|
||||
|
||||
* VIA Technologies, Inc. VX855/VX875
|
||||
Datasheet: Availability unknown
|
||||
|
||||
Authors:
|
||||
Kyösti Mälkki <kmalkki@cc.hut.fi>,
|
||||
Mark D. Studebaker <mdsxyz123@yahoo.com>,
|
||||
@ -53,6 +56,7 @@ Your lspci -n listing must show one of these :
|
||||
device 1106:3287 (VT8251)
|
||||
device 1106:8324 (CX700)
|
||||
device 1106:8353 (VX800/VX820)
|
||||
device 1106:8409 (VX855/VX875)
|
||||
|
||||
If none of these show up, you should look in the BIOS for settings like
|
||||
enable ACPI / SMBus or even USB.
|
||||
|
@ -216,6 +216,8 @@ Other kernel parameters for ide_core are:
|
||||
|
||||
* "noflush=[interface_number.device_number]" to disable flush requests
|
||||
|
||||
* "nohpa=[interface_number.device_number]" to disable Host Protected Area
|
||||
|
||||
* "noprobe=[interface_number.device_number]" to skip probing
|
||||
|
||||
* "nowerr=[interface_number.device_number]" to ignore the WRERR_STAT bit
|
||||
|
@ -149,6 +149,8 @@ Code Seq# Include File Comments
|
||||
'p' 40-7F linux/nvram.h
|
||||
'p' 80-9F user-space parport
|
||||
<mailto:tim@cyberelk.net>
|
||||
'p' a1-a4 linux/pps.h LinuxPPS
|
||||
<mailto:giometti@linux.it>
|
||||
'q' 00-1F linux/serio.h
|
||||
'q' 80-FF Internet PhoneJACK, Internet LineJACK
|
||||
<http://www.quicknet.net>
|
||||
|
@ -22,16 +22,11 @@ README.gigaset
|
||||
- info on the drivers for Siemens Gigaset ISDN adapters.
|
||||
README.icn
|
||||
- info on the ICN-ISDN-card and its driver.
|
||||
>>>>>>> 93af7aca44f0e82e67bda10a0fb73d383edcc8bd:Documentation/isdn/00-INDEX
|
||||
README.HiSax
|
||||
- info on the HiSax driver which replaces the old teles.
|
||||
README.hfc-pci
|
||||
- info on hfc-pci based cards.
|
||||
README.pcbit
|
||||
- info on the PCBIT-D ISDN adapter and driver.
|
||||
README.syncppp
|
||||
- info on running Sync PPP over ISDN.
|
||||
syncPPP.FAQ
|
||||
- frequently asked questions about running PPP over ISDN.
|
||||
README.audio
|
||||
- info for running audio over ISDN.
|
||||
README.avmb1
|
||||
- info on driver for AVM-B1 ISDN card.
|
||||
README.act2000
|
||||
@ -42,10 +37,28 @@ README.concap
|
||||
- info on "CONCAP" encapsulation protocol interface used for X.25.
|
||||
README.diversion
|
||||
- info on module for isdn diversion services.
|
||||
README.fax
|
||||
- info for using Fax over ISDN.
|
||||
README.gigaset
|
||||
- info on the drivers for Siemens Gigaset ISDN adapters
|
||||
README.hfc-pci
|
||||
- info on hfc-pci based cards.
|
||||
README.hysdn
|
||||
- info on driver for Hypercope active HYSDN cards
|
||||
README.icn
|
||||
- info on the ICN-ISDN-card and its driver.
|
||||
README.mISDN
|
||||
- info on the Modular ISDN subsystem (mISDN)
|
||||
README.pcbit
|
||||
- info on the PCBIT-D ISDN adapter and driver.
|
||||
README.sc
|
||||
- info on driver for Spellcaster cards.
|
||||
README.syncppp
|
||||
- info on running Sync PPP over ISDN.
|
||||
README.x25
|
||||
- info for running X.25 over ISDN.
|
||||
syncPPP.FAQ
|
||||
- frequently asked questions about running PPP over ISDN.
|
||||
README.hysdn
|
||||
- info on driver for Hypercope active HYSDN cards
|
||||
README.mISDN
|
||||
|
@ -45,7 +45,7 @@ From then on, Kernel CAPI may call the registered callback functions for the
|
||||
device.
|
||||
|
||||
If the device becomes unusable for any reason (shutdown, disconnect ...), the
|
||||
driver has to call capi_ctr_reseted(). This will prevent further calls to the
|
||||
driver has to call capi_ctr_down(). This will prevent further calls to the
|
||||
callback functions by Kernel CAPI.
|
||||
|
||||
|
||||
@ -114,20 +114,36 @@ char *driver_name
|
||||
int (*load_firmware)(struct capi_ctr *ctrlr, capiloaddata *ldata)
|
||||
(optional) pointer to a callback function for sending firmware and
|
||||
configuration data to the device
|
||||
Return value: 0 on success, error code on error
|
||||
Called in process context.
|
||||
|
||||
void (*reset_ctr)(struct capi_ctr *ctrlr)
|
||||
pointer to a callback function for performing a reset on the device,
|
||||
releasing all registered applications
|
||||
(optional) pointer to a callback function for performing a reset on
|
||||
the device, releasing all registered applications
|
||||
Called in process context.
|
||||
|
||||
void (*register_appl)(struct capi_ctr *ctrlr, u16 applid,
|
||||
capi_register_params *rparam)
|
||||
void (*release_appl)(struct capi_ctr *ctrlr, u16 applid)
|
||||
pointers to callback functions for registration and deregistration of
|
||||
applications with the device
|
||||
Calls to these functions are serialized by Kernel CAPI so that only
|
||||
one call to any of them is active at any time.
|
||||
|
||||
u16 (*send_message)(struct capi_ctr *ctrlr, struct sk_buff *skb)
|
||||
pointer to a callback function for sending a CAPI message to the
|
||||
device
|
||||
Return value: CAPI error code
|
||||
If the method returns 0 (CAPI_NOERROR) the driver has taken ownership
|
||||
of the skb and the caller may no longer access it. If it returns a
|
||||
non-zero (error) value then ownership of the skb returns to the caller
|
||||
who may reuse or free it.
|
||||
The return value should only be used to signal problems with respect
|
||||
to accepting or queueing the message. Errors occurring during the
|
||||
actual processing of the message should be signaled with an
|
||||
appropriate reply message.
|
||||
Calls to this function are not serialized by Kernel CAPI, ie. it must
|
||||
be prepared to be re-entered.
|
||||
|
||||
char *(*procinfo)(struct capi_ctr *ctrlr)
|
||||
pointer to a callback function returning the entry for the device in
|
||||
@ -138,6 +154,8 @@ read_proc_t *ctr_read_proc
|
||||
system entry, /proc/capi/controllers/<n>; will be called with a
|
||||
pointer to the device's capi_ctr structure as the last (data) argument
|
||||
|
||||
Note: Callback functions are never called in interrupt context.
|
||||
|
||||
- to be filled in before calling capi_ctr_ready():
|
||||
|
||||
u8 manu[CAPI_MANUFACTURER_LEN]
|
||||
@ -153,6 +171,45 @@ u8 serial[CAPI_SERIAL_LEN]
|
||||
value to return for CAPI_GET_SERIAL
|
||||
|
||||
|
||||
4.3 The _cmsg Structure
|
||||
|
||||
(declared in <linux/isdn/capiutil.h>)
|
||||
|
||||
The _cmsg structure stores the contents of a CAPI 2.0 message in an easily
|
||||
accessible form. It contains members for all possible CAPI 2.0 parameters, of
|
||||
which only those appearing in the message type currently being processed are
|
||||
actually used. Unused members should be set to zero.
|
||||
|
||||
Members are named after the CAPI 2.0 standard names of the parameters they
|
||||
represent. See <linux/isdn/capiutil.h> for the exact spelling. Member data
|
||||
types are:
|
||||
|
||||
u8 for CAPI parameters of type 'byte'
|
||||
|
||||
u16 for CAPI parameters of type 'word'
|
||||
|
||||
u32 for CAPI parameters of type 'dword'
|
||||
|
||||
_cstruct for CAPI parameters of type 'struct' not containing any
|
||||
variably-sized (struct) subparameters (eg. 'Called Party Number')
|
||||
The member is a pointer to a buffer containing the parameter in
|
||||
CAPI encoding (length + content). It may also be NULL, which will
|
||||
be taken to represent an empty (zero length) parameter.
|
||||
|
||||
_cmstruct for CAPI parameters of type 'struct' containing 'struct'
|
||||
subparameters ('Additional Info' and 'B Protocol')
|
||||
The representation is a single byte containing one of the values:
|
||||
CAPI_DEFAULT: the parameter is empty
|
||||
CAPI_COMPOSE: the values of the subparameters are stored
|
||||
individually in the corresponding _cmsg structure members
|
||||
|
||||
Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert
|
||||
messages between their transport encoding described in the CAPI 2.0 standard
|
||||
and their _cmsg structure representation. Note that capi_cmsg2message() does
|
||||
not know or check the size of its destination buffer. The caller must make
|
||||
sure it is big enough to accomodate the resulting CAPI message.
|
||||
|
||||
|
||||
5. Lower Layer Interface Functions
|
||||
|
||||
(declared in <linux/isdn/capilli.h>)
|
||||
@ -166,7 +223,7 @@ int detach_capi_ctr(struct capi_ctr *ctrlr)
|
||||
register/unregister a device (controller) with Kernel CAPI
|
||||
|
||||
void capi_ctr_ready(struct capi_ctr *ctrlr)
|
||||
void capi_ctr_reseted(struct capi_ctr *ctrlr)
|
||||
void capi_ctr_down(struct capi_ctr *ctrlr)
|
||||
signal controller ready/not ready
|
||||
|
||||
void capi_ctr_suspend_output(struct capi_ctr *ctrlr)
|
||||
@ -211,3 +268,32 @@ CAPIMSG_CONTROL(m) CAPIMSG_SETCONTROL(m, contr) Controller/PLCI/NCCI
|
||||
(u32)
|
||||
CAPIMSG_DATALEN(m) CAPIMSG_SETDATALEN(m, len) Data Length (u16)
|
||||
|
||||
|
||||
Library functions for working with _cmsg structures
|
||||
(from <linux/isdn/capiutil.h>):
|
||||
|
||||
unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg)
|
||||
Assembles a CAPI 2.0 message from the parameters in *cmsg, storing the
|
||||
result in *msg.
|
||||
|
||||
unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg)
|
||||
Disassembles the CAPI 2.0 message in *msg, storing the parameters in
|
||||
*cmsg.
|
||||
|
||||
unsigned capi_cmsg_header(_cmsg *cmsg, u16 ApplId, u8 Command, u8 Subcommand,
|
||||
u16 Messagenumber, u32 Controller)
|
||||
Fills the header part and address field of the _cmsg structure *cmsg
|
||||
with the given values, zeroing the remainder of the structure so only
|
||||
parameters with non-default values need to be changed before sending
|
||||
the message.
|
||||
|
||||
void capi_cmsg_answer(_cmsg *cmsg)
|
||||
Sets the low bit of the Subcommand field in *cmsg, thereby converting
|
||||
_REQ to _CONF and _IND to _RESP.
|
||||
|
||||
char *capi_cmd2str(u8 Command, u8 Subcommand)
|
||||
Returns the CAPI 2.0 message name corresponding to the given command
|
||||
and subcommand values, as a static ASCII string. The return value may
|
||||
be NULL if the command/subcommand is not one of those defined in the
|
||||
CAPI 2.0 standard.
|
||||
|
||||
|
@ -149,10 +149,8 @@ GigaSet 307x Device Driver
|
||||
configuration files and chat scripts in the gigaset-VERSION/ppp directory
|
||||
in the driver packages from http://sourceforge.net/projects/gigaset307x/.
|
||||
Please note that the USB drivers are not able to change the state of the
|
||||
control lines (the M105 driver can be configured to use some undocumented
|
||||
control requests, if you really need the control lines, though). This means
|
||||
you must use "Stupid Mode" if you are using wvdial or you should use the
|
||||
nocrtscts option of pppd.
|
||||
control lines. This means you must use "Stupid Mode" if you are using
|
||||
wvdial or you should use the nocrtscts option of pppd.
|
||||
You must also assure that the ppp_async module is loaded with the parameter
|
||||
flag_time=0. You can do this e.g. by adding a line like
|
||||
|
||||
@ -190,20 +188,19 @@ GigaSet 307x Device Driver
|
||||
You can also use /sys/class/tty/ttyGxy/cidmode for changing the CID mode
|
||||
setting (ttyGxy is ttyGU0 or ttyGB0).
|
||||
|
||||
2.6. M105 Undocumented USB Requests
|
||||
------------------------------
|
||||
|
||||
The Gigaset M105 USB data box understands a couple of useful, but
|
||||
undocumented USB commands. These requests are not used in normal
|
||||
operation (for wireless access to the base), but are needed for access
|
||||
to the M105's own configuration mode (registration to the base, baudrate
|
||||
and line format settings, device status queries) via the gigacontr
|
||||
utility. Their use is controlled by the kernel configuration option
|
||||
"Support for undocumented USB requests" (CONFIG_GIGASET_UNDOCREQ). If you
|
||||
encounter error code -ENOTTY when trying to use some features of the
|
||||
M105, try setting that option to "y" via 'make {x,menu}config' and
|
||||
recompiling the driver.
|
||||
2.6. Unregistered Wireless Devices (M101/M105)
|
||||
-----------------------------------------
|
||||
The main purpose of the ser_gigaset and usb_gigaset drivers is to allow
|
||||
the M101 and M105 wireless devices to be used as ISDN devices for ISDN
|
||||
connections through a Gigaset base. Therefore they assume that the device
|
||||
is registered to a DECT base.
|
||||
|
||||
If the M101/M105 device is not registered to a base, initialization of
|
||||
the device fails, and a corresponding error message is logged by the
|
||||
driver. In that situation, a restricted set of functions is available
|
||||
which includes, in particular, those necessary for registering the device
|
||||
to a base or for switching it between Fixed Part and Portable Part
|
||||
modes.
|
||||
|
||||
3. Troubleshooting
|
||||
---------------
|
||||
@ -234,11 +231,12 @@ GigaSet 307x Device Driver
|
||||
Select Unimodem mode for all DECT data adapters. (see section 2.4.)
|
||||
|
||||
Problem:
|
||||
You want to configure your USB DECT data adapter (M105) but gigacontr
|
||||
reports an error: "/dev/ttyGU0: Inappropriate ioctl for device".
|
||||
Messages like this:
|
||||
usb_gigaset 3-2:1.0: Could not initialize the device.
|
||||
appear in your syslog.
|
||||
Solution:
|
||||
Recompile the usb_gigaset driver with the kernel configuration option
|
||||
CONFIG_GIGASET_UNDOCREQ set to 'y'. (see section 2.6.)
|
||||
Check whether your M10x wireless device is correctly registered to the
|
||||
Gigaset base. (see section 2.6.)
|
||||
|
||||
3.2. Telling the driver to provide more information
|
||||
----------------------------------------------
|
||||
|
@ -35,6 +35,79 @@ new .config files to see the differences:
|
||||
|
||||
(Yes, we need something better here.)
|
||||
|
||||
______________________________________________________________________
|
||||
Environment variables for '*config'
|
||||
|
||||
KCONFIG_CONFIG
|
||||
--------------------------------------------------
|
||||
This environment variable can be used to specify a default kernel config
|
||||
file name to override the default name of ".config".
|
||||
|
||||
KCONFIG_OVERWRITECONFIG
|
||||
--------------------------------------------------
|
||||
If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not
|
||||
break symlinks when .config is a symlink to somewhere else.
|
||||
|
||||
KCONFIG_NOTIMESTAMP
|
||||
--------------------------------------------------
|
||||
If this environment variable exists and is non-null, the timestamp line
|
||||
in generated .config files is omitted.
|
||||
|
||||
______________________________________________________________________
|
||||
Environment variables for '{allyes/allmod/allno/rand}config'
|
||||
|
||||
KCONFIG_ALLCONFIG
|
||||
--------------------------------------------------
|
||||
(partially based on lkml email from/by Rob Landley, re: miniconfig)
|
||||
--------------------------------------------------
|
||||
The allyesconfig/allmodconfig/allnoconfig/randconfig variants can
|
||||
also use the environment variable KCONFIG_ALLCONFIG as a flag or a
|
||||
filename that contains config symbols that the user requires to be
|
||||
set to a specific value. If KCONFIG_ALLCONFIG is used without a
|
||||
filename, "make *config" checks for a file named
|
||||
"all{yes/mod/no/random}.config" (corresponding to the *config command
|
||||
that was used) for symbol values that are to be forced. If this file
|
||||
is not found, it checks for a file named "all.config" to contain forced
|
||||
values.
|
||||
|
||||
This enables you to create "miniature" config (miniconfig) or custom
|
||||
config files containing just the config symbols that you are interested
|
||||
in. Then the kernel config system generates the full .config file,
|
||||
including symbols of your miniconfig file.
|
||||
|
||||
This 'KCONFIG_ALLCONFIG' file is a config file which contains
|
||||
(usually a subset of all) preset config symbols. These variable
|
||||
settings are still subject to normal dependency checks.
|
||||
|
||||
Examples:
|
||||
KCONFIG_ALLCONFIG=custom-notebook.config make allnoconfig
|
||||
or
|
||||
KCONFIG_ALLCONFIG=mini.config make allnoconfig
|
||||
or
|
||||
make KCONFIG_ALLCONFIG=mini.config allnoconfig
|
||||
|
||||
These examples will disable most options (allnoconfig) but enable or
|
||||
disable the options that are explicitly listed in the specified
|
||||
mini-config files.
|
||||
|
||||
______________________________________________________________________
|
||||
Environment variables for 'silentoldconfig'
|
||||
|
||||
KCONFIG_NOSILENTUPDATE
|
||||
--------------------------------------------------
|
||||
If this variable has a non-blank value, it prevents silent kernel
|
||||
config udpates (requires explicit updates).
|
||||
|
||||
KCONFIG_AUTOCONFIG
|
||||
--------------------------------------------------
|
||||
This environment variable can be set to specify the path & name of the
|
||||
"auto.conf" file. Its default value is "include/config/auto.conf".
|
||||
|
||||
KCONFIG_AUTOHEADER
|
||||
--------------------------------------------------
|
||||
This environment variable can be set to specify the path & name of the
|
||||
"autoconf.h" (header) file. Its default value is "include/linux/autoconf.h".
|
||||
|
||||
|
||||
======================================================================
|
||||
menuconfig
|
||||
@ -60,10 +133,11 @@ Searching in menuconfig:
|
||||
|
||||
/^hotplug
|
||||
|
||||
|
||||
______________________________________________________________________
|
||||
Color Themes for 'menuconfig'
|
||||
User interface options for 'menuconfig'
|
||||
|
||||
MENUCONFIG_COLOR
|
||||
--------------------------------------------------
|
||||
It is possible to select different color themes using the variable
|
||||
MENUCONFIG_COLOR. To select a theme use:
|
||||
|
||||
@ -75,83 +149,13 @@ Available themes are:
|
||||
classic => theme with blue background. The classic look
|
||||
bluetitle => a LCD friendly version of classic. (default)
|
||||
|
||||
______________________________________________________________________
|
||||
Environment variables in 'menuconfig'
|
||||
|
||||
KCONFIG_ALLCONFIG
|
||||
--------------------------------------------------
|
||||
(partially based on lkml email from/by Rob Landley, re: miniconfig)
|
||||
--------------------------------------------------
|
||||
The allyesconfig/allmodconfig/allnoconfig/randconfig variants can
|
||||
also use the environment variable KCONFIG_ALLCONFIG as a flag or a
|
||||
filename that contains config symbols that the user requires to be
|
||||
set to a specific value. If KCONFIG_ALLCONFIG is used without a
|
||||
filename, "make *config" checks for a file named
|
||||
"all{yes/mod/no/random}.config" (corresponding to the *config command
|
||||
that was used) for symbol values that are to be forced. If this file
|
||||
is not found, it checks for a file named "all.config" to contain forced
|
||||
values.
|
||||
|
||||
This enables you to create "miniature" config (miniconfig) or custom
|
||||
config files containing just the config symbols that you are interested
|
||||
in. Then the kernel config system generates the full .config file,
|
||||
including dependencies of your miniconfig file, based on the miniconfig
|
||||
file.
|
||||
|
||||
This 'KCONFIG_ALLCONFIG' file is a config file which contains
|
||||
(usually a subset of all) preset config symbols. These variable
|
||||
settings are still subject to normal dependency checks.
|
||||
|
||||
Examples:
|
||||
KCONFIG_ALLCONFIG=custom-notebook.config make allnoconfig
|
||||
or
|
||||
KCONFIG_ALLCONFIG=mini.config make allnoconfig
|
||||
or
|
||||
make KCONFIG_ALLCONFIG=mini.config allnoconfig
|
||||
|
||||
These examples will disable most options (allnoconfig) but enable or
|
||||
disable the options that are explicitly listed in the specified
|
||||
mini-config files.
|
||||
|
||||
KCONFIG_NOSILENTUPDATE
|
||||
--------------------------------------------------
|
||||
If this variable has a non-blank value, it prevents silent kernel
|
||||
config udpates (requires explicit updates).
|
||||
|
||||
KCONFIG_CONFIG
|
||||
--------------------------------------------------
|
||||
This environment variable can be used to specify a default kernel config
|
||||
file name to override the default name of ".config".
|
||||
|
||||
KCONFIG_OVERWRITECONFIG
|
||||
--------------------------------------------------
|
||||
If you set KCONFIG_OVERWRITECONFIG in the environment, Kconfig will not
|
||||
break symlinks when .config is a symlink to somewhere else.
|
||||
|
||||
KCONFIG_NOTIMESTAMP
|
||||
--------------------------------------------------
|
||||
If this environment variable exists and is non-null, the timestamp line
|
||||
in generated .config files is omitted.
|
||||
|
||||
KCONFIG_AUTOCONFIG
|
||||
--------------------------------------------------
|
||||
This environment variable can be set to specify the path & name of the
|
||||
"auto.conf" file. Its default value is "include/config/auto.conf".
|
||||
|
||||
KCONFIG_AUTOHEADER
|
||||
--------------------------------------------------
|
||||
This environment variable can be set to specify the path & name of the
|
||||
"autoconf.h" (header) file. Its default value is "include/linux/autoconf.h".
|
||||
|
||||
______________________________________________________________________
|
||||
menuconfig User Interface Options
|
||||
----------------------------------------------------------------------
|
||||
MENUCONFIG_MODE
|
||||
--------------------------------------------------
|
||||
This mode shows all sub-menus in one large tree.
|
||||
|
||||
Example:
|
||||
MENUCONFIG_MODE=single_menu make menuconfig
|
||||
make MENUCONFIG_MODE=single_menu menuconfig
|
||||
|
||||
|
||||
======================================================================
|
||||
xconfig
|
||||
|
@ -275,7 +275,7 @@ following files:
|
||||
|
||||
KERNELDIR := /lib/modules/`uname -r`/build
|
||||
all::
|
||||
$(MAKE) -C $KERNELDIR M=`pwd` $@
|
||||
$(MAKE) -C $(KERNELDIR) M=`pwd` $@
|
||||
|
||||
# Module specific targets
|
||||
genbin:
|
||||
|
@ -108,7 +108,7 @@ There are two possible methods of using Kdump.
|
||||
|
||||
2) Or use the system kernel binary itself as dump-capture kernel and there is
|
||||
no need to build a separate dump-capture kernel. This is possible
|
||||
only with the architecutres which support a relocatable kernel. As
|
||||
only with the architectures which support a relocatable kernel. As
|
||||
of today, i386, x86_64, ppc64 and ia64 architectures support relocatable
|
||||
kernel.
|
||||
|
||||
@ -222,7 +222,7 @@ Dump-capture kernel config options (Arch Dependent, ia64)
|
||||
----------------------------------------------------------
|
||||
|
||||
- No specific options are required to create a dump-capture kernel
|
||||
for ia64, other than those specified in the arch idependent section
|
||||
for ia64, other than those specified in the arch independent section
|
||||
above. This means that it is possible to use the system kernel
|
||||
as a dump-capture kernel if desired.
|
||||
|
||||
|
@ -48,6 +48,7 @@ parameter is applicable:
|
||||
EFI EFI Partitioning (GPT) is enabled
|
||||
EIDE EIDE/ATAPI support is enabled.
|
||||
FB The frame buffer device is enabled.
|
||||
GCOV GCOV profiling is enabled.
|
||||
HW Appropriate hardware is enabled.
|
||||
IA-64 IA-64 architecture is enabled.
|
||||
IMA Integrity measurement architecture is enabled.
|
||||
@ -491,6 +492,13 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Also note the kernel might malfunction if you disable
|
||||
some critical bits.
|
||||
|
||||
cmo_free_hint= [PPC] Format: { yes | no }
|
||||
Specify whether pages are marked as being inactive
|
||||
when they are freed. This is used in CMO environments
|
||||
to determine OS memory pressure for page stealing by
|
||||
a hypervisor.
|
||||
Default: yes
|
||||
|
||||
code_bytes [X86] How many bytes of object code to print
|
||||
in an oops report.
|
||||
Range: 0 - 8192
|
||||
@ -539,6 +547,10 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
console=brl,ttyS0
|
||||
For now, only VisioBraille is supported.
|
||||
|
||||
consoleblank= [KNL] The console blank (screen saver) timeout in
|
||||
seconds. Defaults to 10*60 = 10mins. A value of 0
|
||||
disables the blank timer.
|
||||
|
||||
coredump_filter=
|
||||
[KNL] Change the default value for
|
||||
/proc/<pid>/coredump_filter.
|
||||
@ -785,6 +797,12 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Format: off | on
|
||||
default: on
|
||||
|
||||
gcov_persist= [GCOV] When non-zero (default), profiling data for
|
||||
kernel modules is saved and remains accessible via
|
||||
debugfs, even when the module is unloaded/reloaded.
|
||||
When zero, profiling data is discarded and associated
|
||||
debugfs files are removed at module unload time.
|
||||
|
||||
gdth= [HW,SCSI]
|
||||
See header of drivers/scsi/gdth.c.
|
||||
|
||||
@ -887,11 +905,8 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
|
||||
ide-core.nodma= [HW] (E)IDE subsystem
|
||||
Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc
|
||||
.vlb_clock .pci_clock .noflush .noprobe .nowerr .cdrom
|
||||
.chs .ignore_cable are additional options
|
||||
See Documentation/ide/ide.txt.
|
||||
|
||||
idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed
|
||||
.vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr
|
||||
.cdrom .chs .ignore_cable are additional options
|
||||
See Documentation/ide/ide.txt.
|
||||
|
||||
ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem
|
||||
@ -928,6 +943,12 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Formt: { "sha1" | "md5" }
|
||||
default: "sha1"
|
||||
|
||||
ima_tcb [IMA]
|
||||
Load a policy which meets the needs of the Trusted
|
||||
Computing Base. This means IMA will measure all
|
||||
programs exec'd, files mmap'd for exec, and all files
|
||||
opened for read by uid=0.
|
||||
|
||||
in2000= [HW,SCSI]
|
||||
See header of drivers/scsi/in2000.c.
|
||||
|
||||
@ -1070,13 +1091,17 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
|
||||
kgdboc= [HW] kgdb over consoles.
|
||||
Requires a tty driver that supports console polling.
|
||||
(only serial suported for now)
|
||||
(only serial supported for now)
|
||||
Format: <serial_device>[,baud]
|
||||
|
||||
kmac= [MIPS] korina ethernet MAC address.
|
||||
Configure the RouterBoard 532 series on-chip
|
||||
Ethernet adapter MAC address.
|
||||
|
||||
kmemleak= [KNL] Boot-time kmemleak enable/disable
|
||||
Valid arguments: on, off
|
||||
Default: on
|
||||
|
||||
kstack=N [X86] Print N words from the kernel stack
|
||||
in oops dumps.
|
||||
|
||||
@ -1395,7 +1420,7 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
('y', default) or cooked coordinates ('n')
|
||||
|
||||
mtrr_chunk_size=nn[KMG] [X86]
|
||||
used for mtrr cleanup. It is largest continous chunk
|
||||
used for mtrr cleanup. It is largest continuous chunk
|
||||
that could hold holes aka. UC entries.
|
||||
|
||||
mtrr_gran_size=nn[KMG] [X86]
|
||||
|
773
Documentation/kmemcheck.txt
Normal file
773
Documentation/kmemcheck.txt
Normal file
@ -0,0 +1,773 @@
|
||||
GETTING STARTED WITH KMEMCHECK
|
||||
==============================
|
||||
|
||||
Vegard Nossum <vegardno@ifi.uio.no>
|
||||
|
||||
|
||||
Contents
|
||||
========
|
||||
0. Introduction
|
||||
1. Downloading
|
||||
2. Configuring and compiling
|
||||
3. How to use
|
||||
3.1. Booting
|
||||
3.2. Run-time enable/disable
|
||||
3.3. Debugging
|
||||
3.4. Annotating false positives
|
||||
4. Reporting errors
|
||||
5. Technical description
|
||||
|
||||
|
||||
0. Introduction
|
||||
===============
|
||||
|
||||
kmemcheck is a debugging feature for the Linux Kernel. More specifically, it
|
||||
is a dynamic checker that detects and warns about some uses of uninitialized
|
||||
memory.
|
||||
|
||||
Userspace programmers might be familiar with Valgrind's memcheck. The main
|
||||
difference between memcheck and kmemcheck is that memcheck works for userspace
|
||||
programs only, and kmemcheck works for the kernel only. The implementations
|
||||
are of course vastly different. Because of this, kmemcheck is not as accurate
|
||||
as memcheck, but it turns out to be good enough in practice to discover real
|
||||
programmer errors that the compiler is not able to find through static
|
||||
analysis.
|
||||
|
||||
Enabling kmemcheck on a kernel will probably slow it down to the extent that
|
||||
the machine will not be usable for normal workloads such as e.g. an
|
||||
interactive desktop. kmemcheck will also cause the kernel to use about twice
|
||||
as much memory as normal. For this reason, kmemcheck is strictly a debugging
|
||||
feature.
|
||||
|
||||
|
||||
1. Downloading
|
||||
==============
|
||||
|
||||
kmemcheck can only be downloaded using git. If you want to write patches
|
||||
against the current code, you should use the kmemcheck development branch of
|
||||
the tip tree. It is also possible to use the linux-next tree, which also
|
||||
includes the latest version of kmemcheck.
|
||||
|
||||
Assuming that you've already cloned the linux-2.6.git repository, all you
|
||||
have to do is add the -tip tree as a remote, like this:
|
||||
|
||||
$ git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
|
||||
|
||||
To actually download the tree, fetch the remote:
|
||||
|
||||
$ git fetch tip
|
||||
|
||||
And to check out a new local branch with the kmemcheck code:
|
||||
|
||||
$ git checkout -b kmemcheck tip/kmemcheck
|
||||
|
||||
General instructions for the -tip tree can be found here:
|
||||
http://people.redhat.com/mingo/tip.git/readme.txt
|
||||
|
||||
|
||||
2. Configuring and compiling
|
||||
============================
|
||||
|
||||
kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of
|
||||
configuration variables must have specific settings in order for the kmemcheck
|
||||
menu to even appear in "menuconfig". These are:
|
||||
|
||||
o CONFIG_CC_OPTIMIZE_FOR_SIZE=n
|
||||
|
||||
This option is located under "General setup" / "Optimize for size".
|
||||
|
||||
Without this, gcc will use certain optimizations that usually lead to
|
||||
false positive warnings from kmemcheck. An example of this is a 16-bit
|
||||
field in a struct, where gcc may load 32 bits, then discard the upper
|
||||
16 bits. kmemcheck sees only the 32-bit load, and may trigger a
|
||||
warning for the upper 16 bits (if they're uninitialized).
|
||||
|
||||
o CONFIG_SLAB=y or CONFIG_SLUB=y
|
||||
|
||||
This option is located under "General setup" / "Choose SLAB
|
||||
allocator".
|
||||
|
||||
o CONFIG_FUNCTION_TRACER=n
|
||||
|
||||
This option is located under "Kernel hacking" / "Tracers" / "Kernel
|
||||
Function Tracer"
|
||||
|
||||
When function tracing is compiled in, gcc emits a call to another
|
||||
function at the beginning of every function. This means that when the
|
||||
page fault handler is called, the ftrace framework will be called
|
||||
before kmemcheck has had a chance to handle the fault. If ftrace then
|
||||
modifies memory that was tracked by kmemcheck, the result is an
|
||||
endless recursive page fault.
|
||||
|
||||
o CONFIG_DEBUG_PAGEALLOC=n
|
||||
|
||||
This option is located under "Kernel hacking" / "Debug page memory
|
||||
allocations".
|
||||
|
||||
In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also
|
||||
located under "Kernel hacking". With this, you will be able to get line number
|
||||
information from the kmemcheck warnings, which is extremely valuable in
|
||||
debugging a problem. This option is not mandatory, however, because it slows
|
||||
down the compilation process and produces a much bigger kernel image.
|
||||
|
||||
Now the kmemcheck menu should be visible (under "Kernel hacking" / "kmemcheck:
|
||||
trap use of uninitialized memory"). Here follows a description of the
|
||||
kmemcheck configuration variables:
|
||||
|
||||
o CONFIG_KMEMCHECK
|
||||
|
||||
This must be enabled in order to use kmemcheck at all...
|
||||
|
||||
o CONFIG_KMEMCHECK_[DISABLED | ENABLED | ONESHOT]_BY_DEFAULT
|
||||
|
||||
This option controls the status of kmemcheck at boot-time. "Enabled"
|
||||
will enable kmemcheck right from the start, "disabled" will boot the
|
||||
kernel as normal (but with the kmemcheck code compiled in, so it can
|
||||
be enabled at run-time after the kernel has booted), and "one-shot" is
|
||||
a special mode which will turn kmemcheck off automatically after
|
||||
detecting the first use of uninitialized memory.
|
||||
|
||||
If you are using kmemcheck to actively debug a problem, then you
|
||||
probably want to choose "enabled" here.
|
||||
|
||||
The one-shot mode is mostly useful in automated test setups because it
|
||||
can prevent floods of warnings and increase the chances of the machine
|
||||
surviving in case something is really wrong. In other cases, the one-
|
||||
shot mode could actually be counter-productive because it would turn
|
||||
itself off at the very first error -- in the case of a false positive
|
||||
too -- and this would come in the way of debugging the specific
|
||||
problem you were interested in.
|
||||
|
||||
If you would like to use your kernel as normal, but with a chance to
|
||||
enable kmemcheck in case of some problem, it might be a good idea to
|
||||
choose "disabled" here. When kmemcheck is disabled, most of the run-
|
||||
time overhead is not incurred, and the kernel will be almost as fast
|
||||
as normal.
|
||||
|
||||
o CONFIG_KMEMCHECK_QUEUE_SIZE
|
||||
|
||||
Select the maximum number of error reports to store in an internal
|
||||
(fixed-size) buffer. Since errors can occur virtually anywhere and in
|
||||
any context, we need a temporary storage area which is guaranteed not
|
||||
to generate any other page faults when accessed. The queue will be
|
||||
emptied as soon as a tasklet may be scheduled. If the queue is full,
|
||||
new error reports will be lost.
|
||||
|
||||
The default value of 64 is probably fine. If some code produces more
|
||||
than 64 errors within an irqs-off section, then the code is likely to
|
||||
produce many, many more, too, and these additional reports seldom give
|
||||
any more information (the first report is usually the most valuable
|
||||
anyway).
|
||||
|
||||
This number might have to be adjusted if you are not using serial
|
||||
console or similar to capture the kernel log. If you are using the
|
||||
"dmesg" command to save the log, then getting a lot of kmemcheck
|
||||
warnings might overflow the kernel log itself, and the earlier reports
|
||||
will get lost in that way instead. Try setting this to 10 or so on
|
||||
such a setup.
|
||||
|
||||
o CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT
|
||||
|
||||
Select the number of shadow bytes to save along with each entry of the
|
||||
error-report queue. These bytes indicate what parts of an allocation
|
||||
are initialized, uninitialized, etc. and will be displayed when an
|
||||
error is detected to help the debugging of a particular problem.
|
||||
|
||||
The number entered here is actually the logarithm of the number of
|
||||
bytes that will be saved. So if you pick for example 5 here, kmemcheck
|
||||
will save 2^5 = 32 bytes.
|
||||
|
||||
The default value should be fine for debugging most problems. It also
|
||||
fits nicely within 80 columns.
|
||||
|
||||
o CONFIG_KMEMCHECK_PARTIAL_OK
|
||||
|
||||
This option (when enabled) works around certain GCC optimizations that
|
||||
produce 32-bit reads from 16-bit variables where the upper 16 bits are
|
||||
thrown away afterwards.
|
||||
|
||||
The default value (enabled) is recommended. This may of course hide
|
||||
some real errors, but disabling it would probably produce a lot of
|
||||
false positives.
|
||||
|
||||
o CONFIG_KMEMCHECK_BITOPS_OK
|
||||
|
||||
This option silences warnings that would be generated for bit-field
|
||||
accesses where not all the bits are initialized at the same time. This
|
||||
may also hide some real bugs.
|
||||
|
||||
This option is probably obsolete, or it should be replaced with
|
||||
the kmemcheck-/bitfield-annotations for the code in question. The
|
||||
default value is therefore fine.
|
||||
|
||||
Now compile the kernel as usual.
|
||||
|
||||
|
||||
3. How to use
|
||||
=============
|
||||
|
||||
3.1. Booting
|
||||
============
|
||||
|
||||
First some information about the command-line options. There is only one
|
||||
option specific to kmemcheck, and this is called "kmemcheck". It can be used
|
||||
to override the default mode as chosen by the CONFIG_KMEMCHECK_*_BY_DEFAULT
|
||||
option. Its possible settings are:
|
||||
|
||||
o kmemcheck=0 (disabled)
|
||||
o kmemcheck=1 (enabled)
|
||||
o kmemcheck=2 (one-shot mode)
|
||||
|
||||
If SLUB debugging has been enabled in the kernel, it may take precedence over
|
||||
kmemcheck in such a way that the slab caches which are under SLUB debugging
|
||||
will not be tracked by kmemcheck. In order to ensure that this doesn't happen
|
||||
(even though it shouldn't by default), use SLUB's boot option "slub_debug",
|
||||
like this: slub_debug=-
|
||||
|
||||
In fact, this option may also be used for fine-grained control over SLUB vs.
|
||||
kmemcheck. For example, if the command line includes "kmemcheck=1
|
||||
slub_debug=,dentry", then SLUB debugging will be used only for the "dentry"
|
||||
slab cache, and with kmemcheck tracking all the other caches. This is advanced
|
||||
usage, however, and is not generally recommended.
|
||||
|
||||
|
||||
3.2. Run-time enable/disable
|
||||
============================
|
||||
|
||||
When the kernel has booted, it is possible to enable or disable kmemcheck at
|
||||
run-time. WARNING: This feature is still experimental and may cause false
|
||||
positive warnings to appear. Therefore, try not to use this. If you find that
|
||||
it doesn't work properly (e.g. you see an unreasonable amount of warnings), I
|
||||
will be happy to take bug reports.
|
||||
|
||||
Use the file /proc/sys/kernel/kmemcheck for this purpose, e.g.:
|
||||
|
||||
$ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck
|
||||
|
||||
The numbers are the same as for the kmemcheck= command-line option.
|
||||
|
||||
|
||||
3.3. Debugging
|
||||
==============
|
||||
|
||||
A typical report will look something like this:
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
^
|
||||
|
||||
Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84
|
||||
RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000
|
||||
R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e
|
||||
R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8
|
||||
FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000
|
||||
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
|
||||
CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0
|
||||
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
|
||||
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
[<ffffffffffffffff>] 0xffffffffffffffff
|
||||
|
||||
The single most valuable information in this report is the RIP (or EIP on 32-
|
||||
bit) value. This will help us pinpoint exactly which instruction that caused
|
||||
the warning.
|
||||
|
||||
If your kernel was compiled with CONFIG_DEBUG_INFO=y, then all we have to do
|
||||
is give this address to the addr2line program, like this:
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104ede8
|
||||
arch/x86/include/asm/string_64.h:12
|
||||
include/asm-generic/siginfo.h:287
|
||||
kernel/signal.c:380
|
||||
kernel/signal.c:410
|
||||
|
||||
The "-e vmlinux" tells addr2line which file to look in. IMPORTANT: This must
|
||||
be the vmlinux of the kernel that produced the warning in the first place! If
|
||||
not, the line number information will almost certainly be wrong.
|
||||
|
||||
The "-i" tells addr2line to also print the line numbers of inlined functions.
|
||||
In this case, the flag was very important, because otherwise, it would only
|
||||
have printed the first line, which is just a call to memcpy(), which could be
|
||||
called from a thousand places in the kernel, and is therefore not very useful.
|
||||
These inlined functions would not show up in the stack trace above, simply
|
||||
because the kernel doesn't load the extra debugging information. This
|
||||
technique can of course be used with ordinary kernel oopses as well.
|
||||
|
||||
In this case, it's the caller of memcpy() that is interesting, and it can be
|
||||
found in include/asm-generic/siginfo.h, line 287:
|
||||
|
||||
281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
|
||||
282 {
|
||||
283 if (from->si_code < 0)
|
||||
284 memcpy(to, from, sizeof(*to));
|
||||
285 else
|
||||
286 /* _sigchld is currently the largest know union member */
|
||||
287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
|
||||
288 }
|
||||
|
||||
Since this was a read (kmemcheck usually warns about reads only, though it can
|
||||
warn about writes to unallocated or freed memory as well), it was probably the
|
||||
"from" argument which contained some uninitialized bytes. Following the chain
|
||||
of calls, we move upwards to see where "from" was allocated or initialized,
|
||||
kernel/signal.c, line 380:
|
||||
|
||||
359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info)
|
||||
360 {
|
||||
...
|
||||
367 list_for_each_entry(q, &list->list, list) {
|
||||
368 if (q->info.si_signo == sig) {
|
||||
369 if (first)
|
||||
370 goto still_pending;
|
||||
371 first = q;
|
||||
...
|
||||
377 if (first) {
|
||||
378 still_pending:
|
||||
379 list_del_init(&first->list);
|
||||
380 copy_siginfo(info, &first->info);
|
||||
381 __sigqueue_free(first);
|
||||
...
|
||||
392 }
|
||||
393 }
|
||||
|
||||
Here, it is &first->info that is being passed on to copy_siginfo(). The
|
||||
variable "first" was found on a list -- passed in as the second argument to
|
||||
collect_signal(). We continue our journey through the stack, to figure out
|
||||
where the item on "list" was allocated or initialized. We move to line 410:
|
||||
|
||||
395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
|
||||
396 siginfo_t *info)
|
||||
397 {
|
||||
...
|
||||
410 collect_signal(sig, pending, info);
|
||||
...
|
||||
414 }
|
||||
|
||||
Now we need to follow the "pending" pointer, since that is being passed on to
|
||||
collect_signal() as "list". At this point, we've run out of lines from the
|
||||
"addr2line" output. Not to worry, we just paste the next addresses from the
|
||||
kmemcheck stack dump, i.e.:
|
||||
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \
|
||||
ffffffff8100b87d ffffffff8100c7b5
|
||||
kernel/signal.c:446
|
||||
kernel/signal.c:1806
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
Remember that since these addresses were found on the stack and not as the
|
||||
RIP value, they actually point to the _next_ instruction (they are return
|
||||
addresses). This becomes obvious when we look at the code for line 446:
|
||||
|
||||
422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
|
||||
423 {
|
||||
...
|
||||
431 signr = __dequeue_signal(&tsk->signal->shared_pending,
|
||||
432 mask, info);
|
||||
433 /*
|
||||
434 * itimer signal ?
|
||||
435 *
|
||||
436 * itimers are process shared and we restart periodic
|
||||
437 * itimers in the signal delivery path to prevent DoS
|
||||
438 * attacks in the high resolution timer case. This is
|
||||
439 * compliant with the old way of self restarting
|
||||
440 * itimers, as the SIGALRM is a legacy signal and only
|
||||
441 * queued once. Changing the restart behaviour to
|
||||
442 * restart the timer in the signal dequeue path is
|
||||
443 * reducing the timer noise on heavy loaded !highres
|
||||
444 * systems too.
|
||||
445 */
|
||||
446 if (unlikely(signr == SIGALRM)) {
|
||||
...
|
||||
489 }
|
||||
|
||||
So instead of looking at 446, we should be looking at 431, which is the line
|
||||
that executes just before 446. Here we see that what we are looking for is
|
||||
&tsk->signal->shared_pending.
|
||||
|
||||
Our next task is now to figure out which function that puts items on this
|
||||
"shared_pending" list. A crude, but efficient tool, is git grep:
|
||||
|
||||
$ git grep -n 'shared_pending' kernel/
|
||||
...
|
||||
kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
|
||||
There were more results, but none of them were related to list operations,
|
||||
and these were the only assignments. We inspect the line numbers more closely
|
||||
and find that this is indeed where items are being added to the list:
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
...
|
||||
890 }
|
||||
|
||||
and:
|
||||
|
||||
1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
|
||||
1310 {
|
||||
....
|
||||
1339 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
1340 list_add_tail(&q->list, &pending->list);
|
||||
....
|
||||
1347 }
|
||||
|
||||
In the first case, the list element we are looking for, "q", is being returned
|
||||
from the function __sigqueue_alloc(), which looks like an allocation function.
|
||||
Let's take a look at it:
|
||||
|
||||
187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags,
|
||||
188 int override_rlimit)
|
||||
189 {
|
||||
190 struct sigqueue *q = NULL;
|
||||
191 struct user_struct *user;
|
||||
192
|
||||
193 /*
|
||||
194 * We won't get problems with the target's UID changing under us
|
||||
195 * because changing it requires RCU be used, and if t != current, the
|
||||
196 * caller must be holding the RCU readlock (by way of a spinlock) and
|
||||
197 * we use RCU protection here
|
||||
198 */
|
||||
199 user = get_uid(__task_cred(t)->user);
|
||||
200 atomic_inc(&user->sigpending);
|
||||
201 if (override_rlimit ||
|
||||
202 atomic_read(&user->sigpending) <=
|
||||
203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur)
|
||||
204 q = kmem_cache_alloc(sigqueue_cachep, flags);
|
||||
205 if (unlikely(q == NULL)) {
|
||||
206 atomic_dec(&user->sigpending);
|
||||
207 free_uid(user);
|
||||
208 } else {
|
||||
209 INIT_LIST_HEAD(&q->list);
|
||||
210 q->flags = 0;
|
||||
211 q->user = user;
|
||||
212 }
|
||||
213
|
||||
214 return q;
|
||||
215 }
|
||||
|
||||
We see that this function initializes q->list, q->flags, and q->user. It seems
|
||||
that now is the time to look at the definition of "struct sigqueue", e.g.:
|
||||
|
||||
14 struct sigqueue {
|
||||
15 struct list_head list;
|
||||
16 int flags;
|
||||
17 siginfo_t info;
|
||||
18 struct user_struct *user;
|
||||
19 };
|
||||
|
||||
And, you might remember, it was a memcpy() on &first->info that caused the
|
||||
warning, so this makes perfect sense. It also seems reasonable to assume that
|
||||
it is the caller of __sigqueue_alloc() that has the responsibility of filling
|
||||
out (initializing) this member.
|
||||
|
||||
But just which fields of the struct were uninitialized? Let's look at
|
||||
kmemcheck's report again:
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
^
|
||||
|
||||
These first two lines are the memory dump of the memory object itself, and the
|
||||
shadow bytemap, respectively. The memory object itself is in this case
|
||||
&first->info. Just beware that the start of this dump is NOT the start of the
|
||||
object itself! The position of the caret (^) corresponds with the address of
|
||||
the read (ffff88003e4a2024).
|
||||
|
||||
The shadow bytemap dump legend is as follows:
|
||||
|
||||
i - initialized
|
||||
u - uninitialized
|
||||
a - unallocated (memory has been allocated by the slab layer, but has not
|
||||
yet been handed off to anybody)
|
||||
f - freed (memory has been allocated by the slab layer, but has been freed
|
||||
by the previous owner)
|
||||
|
||||
In order to figure out where (relative to the start of the object) the
|
||||
uninitialized memory was located, we have to look at the disassembly. For
|
||||
that, we'll need the RIP address again:
|
||||
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
|
||||
$ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8:
|
||||
ffffffff8104edc8: mov %r8,0x8(%r8)
|
||||
ffffffff8104edcc: test %r10d,%r10d
|
||||
ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168>
|
||||
ffffffff8104edd5: mov %rax,%rdx
|
||||
ffffffff8104edd8: mov $0xc,%ecx
|
||||
ffffffff8104eddd: mov %r13,%rdi
|
||||
ffffffff8104ede0: mov $0x30,%eax
|
||||
ffffffff8104ede5: mov %rdx,%rsi
|
||||
ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edea: test $0x2,%al
|
||||
ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0>
|
||||
ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf0: test $0x1,%al
|
||||
ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5>
|
||||
ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf5: mov %r8,%rdi
|
||||
ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free>
|
||||
|
||||
As expected, it's the "rep movsl" instruction from the memcpy() that causes
|
||||
the warning. We know about REP MOVSL that it uses the register RCX to count
|
||||
the number of remaining iterations. By taking a look at the register dump
|
||||
again (from the kmemcheck report), we can figure out how many bytes were left
|
||||
to copy:
|
||||
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
|
||||
By looking at the disassembly, we also see that %ecx is being loaded with the
|
||||
value $0xc just before (ffffffff8104edd8), so we are very lucky. Keep in mind
|
||||
that this is the number of iterations, not bytes. And since this is a "long"
|
||||
operation, we need to multiply by 4 to get the number of bytes. So this means
|
||||
that the uninitialized value was encountered at 4 * (0xc - 0x9) = 12 bytes
|
||||
from the start of the object.
|
||||
|
||||
We can now try to figure out which field of the "struct siginfo" that was not
|
||||
initialized. This is the beginning of the struct:
|
||||
|
||||
40 typedef struct siginfo {
|
||||
41 int si_signo;
|
||||
42 int si_errno;
|
||||
43 int si_code;
|
||||
44
|
||||
45 union {
|
||||
..
|
||||
92 } _sifields;
|
||||
93 } siginfo_t;
|
||||
|
||||
On 64-bit, the int is 4 bytes long, so it must the the union member that has
|
||||
not been initialized. We can verify this using gdb:
|
||||
|
||||
$ gdb vmlinux
|
||||
...
|
||||
(gdb) p &((struct siginfo *) 0)->_sifields
|
||||
$1 = (union {...} *) 0x10
|
||||
|
||||
Actually, it seems that the union member is located at offset 0x10 -- which
|
||||
means that gcc has inserted 4 bytes of padding between the members si_code
|
||||
and _sifields. We can now get a fuller picture of the memory dump:
|
||||
|
||||
_----------------------------=> si_code
|
||||
/ _--------------------=> (padding)
|
||||
| / _------------=> _sifields(._kill._pid)
|
||||
| | / _----=> _sifields(._kill._uid)
|
||||
| | | /
|
||||
-------|-------|-------|-------|
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
|
||||
This allows us to realize another important fact: si_code contains the value
|
||||
0x80. Remember that x86 is little endian, so the first 4 bytes "80000000" are
|
||||
really the number 0x00000080. With a bit of research, we find that this is
|
||||
actually the constant SI_KERNEL defined in include/asm-generic/siginfo.h:
|
||||
|
||||
144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */
|
||||
|
||||
This macro is used in exactly one place in the x86 kernel: In send_signal()
|
||||
in kernel/signal.c:
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
856 switch ((unsigned long) info) {
|
||||
...
|
||||
865 case (unsigned long) SEND_SIG_PRIV:
|
||||
866 q->info.si_signo = sig;
|
||||
867 q->info.si_errno = 0;
|
||||
868 q->info.si_code = SI_KERNEL;
|
||||
869 q->info.si_pid = 0;
|
||||
870 q->info.si_uid = 0;
|
||||
871 break;
|
||||
...
|
||||
890 }
|
||||
|
||||
Not only does this match with the .si_code member, it also matches the place
|
||||
we found earlier when looking for where siginfo_t objects are enqueued on the
|
||||
"shared_pending" list.
|
||||
|
||||
So to sum up: It seems that it is the padding introduced by the compiler
|
||||
between two struct fields that is uninitialized, and this gets reported when
|
||||
we do a memcpy() on the struct. This means that we have identified a false
|
||||
positive warning.
|
||||
|
||||
Normally, kmemcheck will not report uninitialized accesses in memcpy() calls
|
||||
when both the source and destination addresses are tracked. (Instead, we copy
|
||||
the shadow bytemap as well). In this case, the destination address clearly
|
||||
was not tracked. We can dig a little deeper into the stack trace from above:
|
||||
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
And we clearly see that the destination siginfo object is located on the
|
||||
stack:
|
||||
|
||||
782 static void do_signal(struct pt_regs *regs)
|
||||
783 {
|
||||
784 struct k_sigaction ka;
|
||||
785 siginfo_t info;
|
||||
...
|
||||
804 signr = get_signal_to_deliver(&info, &ka, regs, NULL);
|
||||
...
|
||||
854 }
|
||||
|
||||
And this &info is what eventually gets passed to copy_siginfo() as the
|
||||
destination argument.
|
||||
|
||||
Now, even though we didn't find an actual error here, the example is still a
|
||||
good one, because it shows how one would go about to find out what the report
|
||||
was all about.
|
||||
|
||||
|
||||
3.4. Annotating false positives
|
||||
===============================
|
||||
|
||||
There are a few different ways to make annotations in the source code that
|
||||
will keep kmemcheck from checking and reporting certain allocations. Here
|
||||
they are:
|
||||
|
||||
o __GFP_NOTRACK_FALSE_POSITIVE
|
||||
|
||||
This flag can be passed to kmalloc() or kmem_cache_alloc() (therefore
|
||||
also to other functions that end up calling one of these) to indicate
|
||||
that the allocation should not be tracked because it would lead to
|
||||
a false positive report. This is a "big hammer" way of silencing
|
||||
kmemcheck; after all, even if the false positive pertains to
|
||||
particular field in a struct, for example, we will now lose the
|
||||
ability to find (real) errors in other parts of the same struct.
|
||||
|
||||
Example:
|
||||
|
||||
/* No warnings will ever trigger on accessing any part of x */
|
||||
x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE);
|
||||
|
||||
o kmemcheck_bitfield_begin(name)/kmemcheck_bitfield_end(name) and
|
||||
kmemcheck_annotate_bitfield(ptr, name)
|
||||
|
||||
The first two of these three macros can be used inside struct
|
||||
definitions to signal, respectively, the beginning and end of a
|
||||
bitfield. Additionally, this will assign the bitfield a name, which
|
||||
is given as an argument to the macros.
|
||||
|
||||
Having used these markers, one can later use
|
||||
kmemcheck_annotate_bitfield() at the point of allocation, to indicate
|
||||
which parts of the allocation is part of a bitfield.
|
||||
|
||||
Example:
|
||||
|
||||
struct foo {
|
||||
int x;
|
||||
|
||||
kmemcheck_bitfield_begin(flags);
|
||||
int flag_a:1;
|
||||
int flag_b:1;
|
||||
kmemcheck_bitfield_end(flags);
|
||||
|
||||
int y;
|
||||
};
|
||||
|
||||
struct foo *x = kmalloc(sizeof *x);
|
||||
|
||||
/* No warnings will trigger on accessing the bitfield of x */
|
||||
kmemcheck_annotate_bitfield(x, flags);
|
||||
|
||||
Note that kmemcheck_annotate_bitfield() can be used even before the
|
||||
return value of kmalloc() is checked -- in other words, passing NULL
|
||||
as the first argument is legal (and will do nothing).
|
||||
|
||||
|
||||
4. Reporting errors
|
||||
===================
|
||||
|
||||
As we have seen, kmemcheck will produce false positive reports. Therefore, it
|
||||
is not very wise to blindly post kmemcheck warnings to mailing lists and
|
||||
maintainers. Instead, I encourage maintainers and developers to find errors
|
||||
in their own code. If you get a warning, you can try to work around it, try
|
||||
to figure out if it's a real error or not, or simply ignore it. Most
|
||||
developers know their own code and will quickly and efficiently determine the
|
||||
root cause of a kmemcheck report. This is therefore also the most efficient
|
||||
way to work with kmemcheck.
|
||||
|
||||
That said, we (the kmemcheck maintainers) will always be on the lookout for
|
||||
false positives that we can annotate and silence. So whatever you find,
|
||||
please drop us a note privately! Kernel configs and steps to reproduce (if
|
||||
available) are of course a great help too.
|
||||
|
||||
Happy hacking!
|
||||
|
||||
|
||||
5. Technical description
|
||||
========================
|
||||
|
||||
kmemcheck works by marking memory pages non-present. This means that whenever
|
||||
somebody attempts to access the page, a page fault is generated. The page
|
||||
fault handler notices that the page was in fact only hidden, and so it calls
|
||||
on the kmemcheck code to make further investigations.
|
||||
|
||||
When the investigations are completed, kmemcheck "shows" the page by marking
|
||||
it present (as it would be under normal circumstances). This way, the
|
||||
interrupted code can continue as usual.
|
||||
|
||||
But after the instruction has been executed, we should hide the page again, so
|
||||
that we can catch the next access too! Now kmemcheck makes use of a debugging
|
||||
feature of the processor, namely single-stepping. When the processor has
|
||||
finished the one instruction that generated the memory access, a debug
|
||||
exception is raised. From here, we simply hide the page again and continue
|
||||
execution, this time with the single-stepping feature turned off.
|
||||
|
||||
kmemcheck requires some assistance from the memory allocator in order to work.
|
||||
The memory allocator needs to
|
||||
|
||||
1. Tell kmemcheck about newly allocated pages and pages that are about to
|
||||
be freed. This allows kmemcheck to set up and tear down the shadow memory
|
||||
for the pages in question. The shadow memory stores the status of each
|
||||
byte in the allocation proper, e.g. whether it is initialized or
|
||||
uninitialized.
|
||||
|
||||
2. Tell kmemcheck which parts of memory should be marked uninitialized.
|
||||
There are actually a few more states, such as "not yet allocated" and
|
||||
"recently freed".
|
||||
|
||||
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
|
||||
memory that can take page faults because of kmemcheck.
|
||||
|
||||
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
|
||||
request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags.
|
||||
This does not prevent the page faults from occurring, however, but marks the
|
||||
object in question as being initialized so that no warnings will ever be
|
||||
produced for this object.
|
||||
|
||||
Currently, the SLAB and SLUB allocators are supported by kmemcheck.
|
142
Documentation/kmemleak.txt
Normal file
142
Documentation/kmemleak.txt
Normal file
@ -0,0 +1,142 @@
|
||||
Kernel Memory Leak Detector
|
||||
===========================
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
Kmemleak provides a way of detecting possible kernel memory leaks in a
|
||||
way similar to a tracing garbage collector
|
||||
(http://en.wikipedia.org/wiki/Garbage_collection_%28computer_science%29#Tracing_garbage_collectors),
|
||||
with the difference that the orphan objects are not freed but only
|
||||
reported via /sys/kernel/debug/kmemleak. A similar method is used by the
|
||||
Valgrind tool (memcheck --leak-check) to detect the memory leaks in
|
||||
user-space applications.
|
||||
|
||||
Usage
|
||||
-----
|
||||
|
||||
CONFIG_DEBUG_KMEMLEAK in "Kernel hacking" has to be enabled. A kernel
|
||||
thread scans the memory every 10 minutes (by default) and prints any new
|
||||
unreferenced objects found. To trigger an intermediate scan and display
|
||||
all the possible memory leaks:
|
||||
|
||||
# mount -t debugfs nodev /sys/kernel/debug/
|
||||
# cat /sys/kernel/debug/kmemleak
|
||||
|
||||
Note that the orphan objects are listed in the order they were allocated
|
||||
and one object at the beginning of the list may cause other subsequent
|
||||
objects to be reported as orphan.
|
||||
|
||||
Memory scanning parameters can be modified at run-time by writing to the
|
||||
/sys/kernel/debug/kmemleak file. The following parameters are supported:
|
||||
|
||||
off - disable kmemleak (irreversible)
|
||||
stack=on - enable the task stacks scanning
|
||||
stack=off - disable the tasks stacks scanning
|
||||
scan=on - start the automatic memory scanning thread
|
||||
scan=off - stop the automatic memory scanning thread
|
||||
scan=<secs> - set the automatic memory scanning period in seconds (0
|
||||
to disable it)
|
||||
|
||||
Kmemleak can also be disabled at boot-time by passing "kmemleak=off" on
|
||||
the kernel command line.
|
||||
|
||||
Basic Algorithm
|
||||
---------------
|
||||
|
||||
The memory allocations via kmalloc, vmalloc, kmem_cache_alloc and
|
||||
friends are traced and the pointers, together with additional
|
||||
information like size and stack trace, are stored in a prio search tree.
|
||||
The corresponding freeing function calls are tracked and the pointers
|
||||
removed from the kmemleak data structures.
|
||||
|
||||
An allocated block of memory is considered orphan if no pointer to its
|
||||
start address or to any location inside the block can be found by
|
||||
scanning the memory (including saved registers). This means that there
|
||||
might be no way for the kernel to pass the address of the allocated
|
||||
block to a freeing function and therefore the block is considered a
|
||||
memory leak.
|
||||
|
||||
The scanning algorithm steps:
|
||||
|
||||
1. mark all objects as white (remaining white objects will later be
|
||||
considered orphan)
|
||||
2. scan the memory starting with the data section and stacks, checking
|
||||
the values against the addresses stored in the prio search tree. If
|
||||
a pointer to a white object is found, the object is added to the
|
||||
gray list
|
||||
3. scan the gray objects for matching addresses (some white objects
|
||||
can become gray and added at the end of the gray list) until the
|
||||
gray set is finished
|
||||
4. the remaining white objects are considered orphan and reported via
|
||||
/sys/kernel/debug/kmemleak
|
||||
|
||||
Some allocated memory blocks have pointers stored in the kernel's
|
||||
internal data structures and they cannot be detected as orphans. To
|
||||
avoid this, kmemleak can also store the number of values pointing to an
|
||||
address inside the block address range that need to be found so that the
|
||||
block is not considered a leak. One example is __vmalloc().
|
||||
|
||||
Kmemleak API
|
||||
------------
|
||||
|
||||
See the include/linux/kmemleak.h header for the functions prototype.
|
||||
|
||||
kmemleak_init - initialize kmemleak
|
||||
kmemleak_alloc - notify of a memory block allocation
|
||||
kmemleak_free - notify of a memory block freeing
|
||||
kmemleak_not_leak - mark an object as not a leak
|
||||
kmemleak_ignore - do not scan or report an object as leak
|
||||
kmemleak_scan_area - add scan areas inside a memory block
|
||||
kmemleak_no_scan - do not scan a memory block
|
||||
kmemleak_erase - erase an old value in a pointer variable
|
||||
kmemleak_alloc_recursive - as kmemleak_alloc but checks the recursiveness
|
||||
kmemleak_free_recursive - as kmemleak_free but checks the recursiveness
|
||||
|
||||
Dealing with false positives/negatives
|
||||
--------------------------------------
|
||||
|
||||
The false negatives are real memory leaks (orphan objects) but not
|
||||
reported by kmemleak because values found during the memory scanning
|
||||
point to such objects. To reduce the number of false negatives, kmemleak
|
||||
provides the kmemleak_ignore, kmemleak_scan_area, kmemleak_no_scan and
|
||||
kmemleak_erase functions (see above). The task stacks also increase the
|
||||
amount of false negatives and their scanning is not enabled by default.
|
||||
|
||||
The false positives are objects wrongly reported as being memory leaks
|
||||
(orphan). For objects known not to be leaks, kmemleak provides the
|
||||
kmemleak_not_leak function. The kmemleak_ignore could also be used if
|
||||
the memory block is known not to contain other pointers and it will no
|
||||
longer be scanned.
|
||||
|
||||
Some of the reported leaks are only transient, especially on SMP
|
||||
systems, because of pointers temporarily stored in CPU registers or
|
||||
stacks. Kmemleak defines MSECS_MIN_AGE (defaulting to 1000) representing
|
||||
the minimum age of an object to be reported as a memory leak.
|
||||
|
||||
Limitations and Drawbacks
|
||||
-------------------------
|
||||
|
||||
The main drawback is the reduced performance of memory allocation and
|
||||
freeing. To avoid other penalties, the memory scanning is only performed
|
||||
when the /sys/kernel/debug/kmemleak file is read. Anyway, this tool is
|
||||
intended for debugging purposes where the performance might not be the
|
||||
most important requirement.
|
||||
|
||||
To keep the algorithm simple, kmemleak scans for values pointing to any
|
||||
address inside a block's address range. This may lead to an increased
|
||||
number of false negatives. However, it is likely that a real memory leak
|
||||
will eventually become visible.
|
||||
|
||||
Another source of false negatives is the data stored in non-pointer
|
||||
values. In a future version, kmemleak could only scan the pointer
|
||||
members in the allocated structures. This feature would solve many of
|
||||
the false negative cases described above.
|
||||
|
||||
The tool can report false positives. These are cases where an allocated
|
||||
block doesn't need to be freed (some cases in the init_call functions),
|
||||
the pointer is calculated by other methods than the usual container_of
|
||||
macro or the pointer is stored in a location not scanned by kmemleak.
|
||||
|
||||
Page allocations and ioremap are not tracked. Only the ARM and x86
|
||||
architectures are currently supported.
|
@ -132,7 +132,7 @@ kobject_name():
|
||||
const char *kobject_name(const struct kobject * kobj);
|
||||
|
||||
There is a helper function to both initialize and add the kobject to the
|
||||
kernel at the same time, called supprisingly enough kobject_init_and_add():
|
||||
kernel at the same time, called surprisingly enough kobject_init_and_add():
|
||||
|
||||
int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype,
|
||||
struct kobject *parent, const char *fmt, ...);
|
||||
|
@ -507,9 +507,9 @@ http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)
|
||||
Appendix A: The kprobes debugfs interface
|
||||
|
||||
With recent kernels (> 2.6.20) the list of registered kprobes is visible
|
||||
under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug).
|
||||
under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at //sys/kernel/debug).
|
||||
|
||||
/debug/kprobes/list: Lists all registered probes on the system
|
||||
/sys/kernel/debug/kprobes/list: Lists all registered probes on the system
|
||||
|
||||
c015d71a k vfs_read+0x0
|
||||
c011a316 j do_fork+0x0
|
||||
@ -525,7 +525,7 @@ virtual addresses that correspond to modules that've been unloaded),
|
||||
such probes are marked with [GONE]. If the probe is temporarily disabled,
|
||||
such probes are marked with [DISABLED].
|
||||
|
||||
/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
|
||||
/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
|
||||
|
||||
Provides a knob to globally and forcibly turn registered kprobes ON or OFF.
|
||||
By default, all kprobes are enabled. By echoing "0" to this file, all
|
||||
|
@ -40,7 +40,7 @@ NOTE: The Acer Aspire One is not supported hardware. It cannot work with
|
||||
acer-wmi until Acer fix their ACPI-WMI implementation on them, so has been
|
||||
blacklisted until that happens.
|
||||
|
||||
Please see the website for the current list of known working hardare:
|
||||
Please see the website for the current list of known working hardware:
|
||||
|
||||
http://code.google.com/p/aceracpi/wiki/SupportedHardware
|
||||
|
||||
|
@ -22,7 +22,7 @@ If your laptop model supports it, you will find sysfs files in the
|
||||
/sys/class/backlight/sony/
|
||||
directory. You will be able to query and set the current screen
|
||||
brightness:
|
||||
brightness get/set screen brightness (an iteger
|
||||
brightness get/set screen brightness (an integer
|
||||
between 0 and 7)
|
||||
actual_brightness reading from this file will query the HW
|
||||
to get real brightness value
|
||||
|
@ -506,7 +506,7 @@ generate input device EV_KEY events.
|
||||
In addition to the EV_KEY events, thinkpad-acpi may also issue EV_SW
|
||||
events for switches:
|
||||
|
||||
SW_RFKILL_ALL T60 and later hardare rfkill rocker switch
|
||||
SW_RFKILL_ALL T60 and later hardware rfkill rocker switch
|
||||
SW_TABLET_MODE Tablet ThinkPads HKEY events 0x5009 and 0x500A
|
||||
|
||||
Non hot-key ACPI HKEY event map:
|
||||
|
@ -1,6 +1,5 @@
|
||||
# This creates the demonstration utility "lguest" which runs a Linux guest.
|
||||
CFLAGS:=-Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include -I../../arch/x86/include -U_FORTIFY_SOURCE
|
||||
LDLIBS:=-lz
|
||||
CFLAGS:=-m32 -Wall -Wmissing-declarations -Wmissing-prototypes -O3 -I../../include -I../../arch/x86/include -U_FORTIFY_SOURCE
|
||||
|
||||
all: lguest
|
||||
|
||||
|
File diff suppressed because it is too large
Load Diff
@ -37,7 +37,6 @@ Running Lguest:
|
||||
"Paravirtualized guest support" = Y
|
||||
"Lguest guest support" = Y
|
||||
"High Memory Support" = off/4GB
|
||||
"PAE (Physical Address Extension) Support" = N
|
||||
"Alignment value to which kernel should be aligned" = 0x100000
|
||||
(CONFIG_PARAVIRT=y, CONFIG_LGUEST_GUEST=y, CONFIG_HIGHMEM64G=n and
|
||||
CONFIG_PHYSICAL_ALIGN=0x100000)
|
||||
|
@ -34,7 +34,7 @@ out of order wrt other memory writes by the owner CPU.
|
||||
|
||||
It can be done by slightly modifying the standard atomic operations : only
|
||||
their UP variant must be kept. It typically means removing LOCK prefix (on
|
||||
i386 and x86_64) and any SMP sychronization barrier. If the architecture does
|
||||
i386 and x86_64) and any SMP synchronization barrier. If the architecture does
|
||||
not have a different behavior between SMP and UP, including asm-generic/local.h
|
||||
in your architecture's local.h is sufficient.
|
||||
|
||||
|
@ -73,13 +73,13 @@ this phase is triggered automatically. ACPI can notify this event. If not,
|
||||
(see Section 4.).
|
||||
|
||||
Logical Memory Hotplug phase is to change memory state into
|
||||
avaiable/unavailable for users. Amount of memory from user's view is
|
||||
available/unavailable for users. Amount of memory from user's view is
|
||||
changed by this phase. The kernel makes all memory in it as free pages
|
||||
when a memory range is available.
|
||||
|
||||
In this document, this phase is described as online/offline.
|
||||
|
||||
Logical Memory Hotplug phase is triggred by write of sysfs file by system
|
||||
Logical Memory Hotplug phase is triggered by write of sysfs file by system
|
||||
administrator. For the hot-add case, it must be executed after Physical Hotplug
|
||||
phase by hand.
|
||||
(However, if you writes udev's hotplug scripts for memory hotplug, these
|
||||
@ -334,7 +334,7 @@ MEMORY_CANCEL_ONLINE
|
||||
Generated if MEMORY_GOING_ONLINE fails.
|
||||
|
||||
MEMORY_ONLINE
|
||||
Generated when memory has succesfully brought online. The callback may
|
||||
Generated when memory has successfully brought online. The callback may
|
||||
allocate pages from the new memory.
|
||||
|
||||
MEMORY_GOING_OFFLINE
|
||||
@ -359,7 +359,7 @@ The third argument is passed by pointer of struct memory_notify.
|
||||
struct memory_notify {
|
||||
unsigned long start_pfn;
|
||||
unsigned long nr_pages;
|
||||
int status_cahnge_nid;
|
||||
int status_change_nid;
|
||||
}
|
||||
|
||||
start_pfn is start_pfn of online/offline memory.
|
||||
|
@ -26,7 +26,7 @@ registers and the stack. If the first argument is a 64-bit value, it will be
|
||||
passed in D0:D1. If the first argument is not a 64-bit value, but the second
|
||||
is, the second will be passed entirely on the stack and D1 will be unused.
|
||||
|
||||
Arguments smaller than 32-bits are not coelesced within a register or a stack
|
||||
Arguments smaller than 32-bits are not coalesced within a register or a stack
|
||||
word. For example, two byte-sized arguments will always be passed in separate
|
||||
registers or word-sized stack slots.
|
||||
|
||||
|
@ -50,7 +50,7 @@ byte 255: bit7 bit6 bit5 bit4 bit3 bit2 bit1 bit0 rp1 rp3 rp5 ... rp15
|
||||
cp5 cp5 cp5 cp5 cp4 cp4 cp4 cp4
|
||||
|
||||
This figure represents a sector of 256 bytes.
|
||||
cp is my abbreviaton for column parity, rp for row parity.
|
||||
cp is my abbreviation for column parity, rp for row parity.
|
||||
|
||||
Let's start to explain column parity.
|
||||
cp0 is the parity that belongs to all bit0, bit2, bit4, bit6.
|
||||
@ -560,7 +560,7 @@ Measuring this code again showed big gain. When executing the original
|
||||
linux code 1 million times, this took about 1 second on my system.
|
||||
(using time to measure the performance). After this iteration I was back
|
||||
to 0.075 sec. Actually I had to decide to start measuring over 10
|
||||
million interations in order not to loose too much accuracy. This one
|
||||
million iterations in order not to lose too much accuracy. This one
|
||||
definitely seemed to be the jackpot!
|
||||
|
||||
There is a little bit more room for improvement though. There are three
|
||||
@ -571,8 +571,8 @@ loop; This eliminates 3 statements per loop. Of course after the loop we
|
||||
need to correct by adding:
|
||||
rp4 ^= rp4_6;
|
||||
rp6 ^= rp4_6
|
||||
Furthermore there are 4 sequential assingments to rp8. This can be
|
||||
encoded slightly more efficient by saving tmppar before those 4 lines
|
||||
Furthermore there are 4 sequential assignments to rp8. This can be
|
||||
encoded slightly more efficiently by saving tmppar before those 4 lines
|
||||
and later do rp8 = rp8 ^ tmppar ^ notrp8;
|
||||
(where notrp8 is the value of rp8 before those 4 lines).
|
||||
Again a use of the commutative property of xor.
|
||||
@ -622,7 +622,7 @@ Not a big change, but every penny counts :-)
|
||||
Analysis 7
|
||||
==========
|
||||
|
||||
Acutally this made things worse. Not very much, but I don't want to move
|
||||
Actually this made things worse. Not very much, but I don't want to move
|
||||
into the wrong direction. Maybe something to investigate later. Could
|
||||
have to do with caching again.
|
||||
|
||||
@ -642,7 +642,7 @@ Analysis 8
|
||||
This makes things worse. Let's stick with attempt 6 and continue from there.
|
||||
Although it seems that the code within the loop cannot be optimised
|
||||
further there is still room to optimize the generation of the ecc codes.
|
||||
We can simply calcualate the total parity. If this is 0 then rp4 = rp5
|
||||
We can simply calculate the total parity. If this is 0 then rp4 = rp5
|
||||
etc. If the parity is 1, then rp4 = !rp5;
|
||||
But if rp4 = rp5 we do not need rp5 etc. We can just write the even bits
|
||||
in the result byte and then do something like
|
||||
|
@ -221,7 +221,7 @@ ad_select
|
||||
|
||||
- Any slave's 802.3ad association state changes
|
||||
|
||||
- The bond's adminstrative state changes to up
|
||||
- The bond's administrative state changes to up
|
||||
|
||||
count or 2
|
||||
|
||||
@ -369,7 +369,7 @@ fail_over_mac
|
||||
When this policy is used in conjuction with the mii
|
||||
monitor, devices which assert link up prior to being
|
||||
able to actually transmit and receive are particularly
|
||||
susecptible to loss of the gratuitous ARP, and an
|
||||
susceptible to loss of the gratuitous ARP, and an
|
||||
appropriate updelay setting may be required.
|
||||
|
||||
follow or 2
|
||||
@ -1794,7 +1794,7 @@ target to query.
|
||||
generally referred to as "trunk failover." This is a feature of the
|
||||
switch that causes the link state of a particular switch port to be set
|
||||
down (or up) when the state of another switch port goes down (or up).
|
||||
It's purpose is to propogate link failures from logically "exterior" ports
|
||||
Its purpose is to propagate link failures from logically "exterior" ports
|
||||
to the logically "interior" ports that bonding is able to monitor via
|
||||
miimon. Availability and configuration for trunk failover varies by
|
||||
switch, but this can be a viable alternative to the ARP monitor when using
|
||||
|
@ -36,10 +36,15 @@ This file contains
|
||||
6.2 local loopback of sent frames
|
||||
6.3 CAN controller hardware filters
|
||||
6.4 The virtual CAN driver (vcan)
|
||||
6.5 currently supported CAN hardware
|
||||
6.6 todo
|
||||
6.5 The CAN network device driver interface
|
||||
6.5.1 Netlink interface to set/get devices properties
|
||||
6.5.2 Setting the CAN bit-timing
|
||||
6.5.3 Starting and stopping the CAN network device
|
||||
6.6 supported CAN hardware
|
||||
|
||||
7 Credits
|
||||
7 Socket CAN resources
|
||||
|
||||
8 Credits
|
||||
|
||||
============================================================================
|
||||
|
||||
@ -234,6 +239,8 @@ solution for a couple of reasons:
|
||||
the user application using the common CAN filter mechanisms. Inside
|
||||
this filter definition the (interested) type of errors may be
|
||||
selected. The reception of error frames is disabled by default.
|
||||
The format of the CAN error frame is briefly decribed in the Linux
|
||||
header file "include/linux/can/error.h".
|
||||
|
||||
4. How to use Socket CAN
|
||||
------------------------
|
||||
@ -327,7 +334,7 @@ solution for a couple of reasons:
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* paraniod check ... */
|
||||
/* paranoid check ... */
|
||||
if (nbytes < sizeof(struct can_frame)) {
|
||||
fprintf(stderr, "read: incomplete CAN frame\n");
|
||||
return 1;
|
||||
@ -605,61 +612,213 @@ solution for a couple of reasons:
|
||||
removal of vcan network devices can be managed with the ip(8) tool:
|
||||
|
||||
- Create a virtual CAN network interface:
|
||||
ip link add type vcan
|
||||
$ ip link add type vcan
|
||||
|
||||
- Create a virtual CAN network interface with a specific name 'vcan42':
|
||||
ip link add dev vcan42 type vcan
|
||||
$ ip link add dev vcan42 type vcan
|
||||
|
||||
- Remove a (virtual CAN) network interface 'vcan42':
|
||||
ip link del vcan42
|
||||
$ ip link del vcan42
|
||||
|
||||
The tool 'vcan' from the SocketCAN SVN repository on BerliOS is obsolete.
|
||||
6.5 The CAN network device driver interface
|
||||
|
||||
Virtual CAN network device creation in older Kernels:
|
||||
In Linux Kernel versions < 2.6.24 the vcan driver creates 4 vcan
|
||||
netdevices at module load time by default. This value can be changed
|
||||
with the module parameter 'numdev'. E.g. 'modprobe vcan numdev=8'
|
||||
The CAN network device driver interface provides a generic interface
|
||||
to setup, configure and monitor CAN network devices. The user can then
|
||||
configure the CAN device, like setting the bit-timing parameters, via
|
||||
the netlink interface using the program "ip" from the "IPROUTE2"
|
||||
utility suite. The following chapter describes briefly how to use it.
|
||||
Furthermore, the interface uses a common data structure and exports a
|
||||
set of common functions, which all real CAN network device drivers
|
||||
should use. Please have a look to the SJA1000 or MSCAN driver to
|
||||
understand how to use them. The name of the module is can-dev.ko.
|
||||
|
||||
6.5 currently supported CAN hardware
|
||||
6.5.1 Netlink interface to set/get devices properties
|
||||
|
||||
On the project website http://developer.berlios.de/projects/socketcan
|
||||
there are different drivers available:
|
||||
The CAN device must be configured via netlink interface. The supported
|
||||
netlink message types are defined and briefly described in
|
||||
"include/linux/can/netlink.h". CAN link support for the program "ip"
|
||||
of the IPROUTE2 utility suite is avaiable and it can be used as shown
|
||||
below:
|
||||
|
||||
vcan: Virtual CAN interface driver (if no real hardware is available)
|
||||
sja1000: Philips SJA1000 CAN controller (recommended)
|
||||
i82527: Intel i82527 CAN controller
|
||||
mscan: Motorola/Freescale CAN controller (e.g. inside SOC MPC5200)
|
||||
ccan: CCAN controller core (e.g. inside SOC h7202)
|
||||
slcan: For a bunch of CAN adaptors that are attached via a
|
||||
serial line ASCII protocol (for serial / USB adaptors)
|
||||
- Setting CAN device properties:
|
||||
|
||||
Additionally the different CAN adaptors (ISA/PCI/PCMCIA/USB/Parport)
|
||||
from PEAK Systemtechnik support the CAN netdevice driver model
|
||||
since Linux driver v6.0: http://www.peak-system.com/linux/index.htm
|
||||
$ ip link set can0 type can help
|
||||
Usage: ip link set DEVICE type can
|
||||
[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
|
||||
[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
|
||||
phase-seg2 PHASE-SEG2 [ sjw SJW ] ]
|
||||
|
||||
Please check the Mailing Lists on the berlios OSS project website.
|
||||
[ loopback { on | off } ]
|
||||
[ listen-only { on | off } ]
|
||||
[ triple-sampling { on | off } ]
|
||||
|
||||
6.6 todo
|
||||
[ restart-ms TIME-MS ]
|
||||
[ restart ]
|
||||
|
||||
The configuration interface for CAN network drivers is still an open
|
||||
issue that has not been finalized in the socketcan project. Also the
|
||||
idea of having a library module (candev.ko) that holds functions
|
||||
that are needed by all CAN netdevices is not ready to ship.
|
||||
Your contribution is welcome.
|
||||
Where: BITRATE := { 1..1000000 }
|
||||
SAMPLE-POINT := { 0.000..0.999 }
|
||||
TQ := { NUMBER }
|
||||
PROP-SEG := { 1..8 }
|
||||
PHASE-SEG1 := { 1..8 }
|
||||
PHASE-SEG2 := { 1..8 }
|
||||
SJW := { 1..4 }
|
||||
RESTART-MS := { 0 | NUMBER }
|
||||
|
||||
7. Credits
|
||||
- Display CAN device details and statistics:
|
||||
|
||||
$ ip -details -statistics link show can0
|
||||
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP qlen 10
|
||||
link/can
|
||||
can <TRIPLE-SAMPLING> state ERROR-ACTIVE restart-ms 100
|
||||
bitrate 125000 sample_point 0.875
|
||||
tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
|
||||
sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
clock 8000000
|
||||
re-started bus-errors arbit-lost error-warn error-pass bus-off
|
||||
41 17457 0 41 42 41
|
||||
RX: bytes packets errors dropped overrun mcast
|
||||
140859 17608 17457 0 0 0
|
||||
TX: bytes packets errors dropped carrier collsns
|
||||
861 112 0 41 0 0
|
||||
|
||||
More info to the above output:
|
||||
|
||||
"<TRIPLE-SAMPLING>"
|
||||
Shows the list of selected CAN controller modes: LOOPBACK,
|
||||
LISTEN-ONLY, or TRIPLE-SAMPLING.
|
||||
|
||||
"state ERROR-ACTIVE"
|
||||
The current state of the CAN controller: "ERROR-ACTIVE",
|
||||
"ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF" or "STOPPED"
|
||||
|
||||
"restart-ms 100"
|
||||
Automatic restart delay time. If set to a non-zero value, a
|
||||
restart of the CAN controller will be triggered automatically
|
||||
in case of a bus-off condition after the specified delay time
|
||||
in milliseconds. By default it's off.
|
||||
|
||||
"bitrate 125000 sample_point 0.875"
|
||||
Shows the real bit-rate in bits/sec and the sample-point in the
|
||||
range 0.000..0.999. If the calculation of bit-timing parameters
|
||||
is enabled in the kernel (CONFIG_CAN_CALC_BITTIMING=y), the
|
||||
bit-timing can be defined by setting the "bitrate" argument.
|
||||
Optionally the "sample-point" can be specified. By default it's
|
||||
0.000 assuming CIA-recommended sample-points.
|
||||
|
||||
"tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1"
|
||||
Shows the time quanta in ns, propagation segment, phase buffer
|
||||
segment 1 and 2 and the synchronisation jump width in units of
|
||||
tq. They allow to define the CAN bit-timing in a hardware
|
||||
independent format as proposed by the Bosch CAN 2.0 spec (see
|
||||
chapter 8 of http://www.semiconductors.bosch.de/pdf/can2spec.pdf).
|
||||
|
||||
"sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
clock 8000000"
|
||||
Shows the bit-timing constants of the CAN controller, here the
|
||||
"sja1000". The minimum and maximum values of the time segment 1
|
||||
and 2, the synchronisation jump width in units of tq, the
|
||||
bitrate pre-scaler and the CAN system clock frequency in Hz.
|
||||
These constants could be used for user-defined (non-standard)
|
||||
bit-timing calculation algorithms in user-space.
|
||||
|
||||
"re-started bus-errors arbit-lost error-warn error-pass bus-off"
|
||||
Shows the number of restarts, bus and arbitration lost errors,
|
||||
and the state changes to the error-warning, error-passive and
|
||||
bus-off state. RX overrun errors are listed in the "overrun"
|
||||
field of the standard network statistics.
|
||||
|
||||
6.5.2 Setting the CAN bit-timing
|
||||
|
||||
The CAN bit-timing parameters can always be defined in a hardware
|
||||
independent format as proposed in the Bosch CAN 2.0 specification
|
||||
specifying the arguments "tq", "prop_seg", "phase_seg1", "phase_seg2"
|
||||
and "sjw":
|
||||
|
||||
$ ip link set canX type can tq 125 prop-seg 6 \
|
||||
phase-seg1 7 phase-seg2 2 sjw 1
|
||||
|
||||
If the kernel option CONFIG_CAN_CALC_BITTIMING is enabled, CIA
|
||||
recommended CAN bit-timing parameters will be calculated if the bit-
|
||||
rate is specified with the argument "bitrate":
|
||||
|
||||
$ ip link set canX type can bitrate 125000
|
||||
|
||||
Note that this works fine for the most common CAN controllers with
|
||||
standard bit-rates but may *fail* for exotic bit-rates or CAN system
|
||||
clock frequencies. Disabling CONFIG_CAN_CALC_BITTIMING saves some
|
||||
space and allows user-space tools to solely determine and set the
|
||||
bit-timing parameters. The CAN controller specific bit-timing
|
||||
constants can be used for that purpose. They are listed by the
|
||||
following command:
|
||||
|
||||
$ ip -details link show can0
|
||||
...
|
||||
sja1000: clock 8000000 tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
|
||||
6.5.3 Starting and stopping the CAN network device
|
||||
|
||||
A CAN network device is started or stopped as usual with the command
|
||||
"ifconfig canX up/down" or "ip link set canX up/down". Be aware that
|
||||
you *must* define proper bit-timing parameters for real CAN devices
|
||||
before you can start it to avoid error-prone default settings:
|
||||
|
||||
$ ip link set canX up type can bitrate 125000
|
||||
|
||||
A device may enter the "bus-off" state if too much errors occurred on
|
||||
the CAN bus. Then no more messages are received or sent. An automatic
|
||||
bus-off recovery can be enabled by setting the "restart-ms" to a
|
||||
non-zero value, e.g.:
|
||||
|
||||
$ ip link set canX type can restart-ms 100
|
||||
|
||||
Alternatively, the application may realize the "bus-off" condition
|
||||
by monitoring CAN error frames and do a restart when appropriate with
|
||||
the command:
|
||||
|
||||
$ ip link set canX type can restart
|
||||
|
||||
Note that a restart will also create a CAN error frame (see also
|
||||
chapter 3.4).
|
||||
|
||||
6.6 Supported CAN hardware
|
||||
|
||||
Please check the "Kconfig" file in "drivers/net/can" to get an actual
|
||||
list of the support CAN hardware. On the Socket CAN project website
|
||||
(see chapter 7) there might be further drivers available, also for
|
||||
older kernel versions.
|
||||
|
||||
7. Socket CAN resources
|
||||
-----------------------
|
||||
|
||||
You can find further resources for Socket CAN like user space tools,
|
||||
support for old kernel versions, more drivers, mailing lists, etc.
|
||||
at the BerliOS OSS project website for Socket CAN:
|
||||
|
||||
http://developer.berlios.de/projects/socketcan
|
||||
|
||||
If you have questions, bug fixes, etc., don't hesitate to post them to
|
||||
the Socketcan-Users mailing list. But please search the archives first.
|
||||
|
||||
8. Credits
|
||||
----------
|
||||
|
||||
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm)
|
||||
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm, SJA1000 driver)
|
||||
Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan)
|
||||
Jan Kizka (RT-SocketCAN core, Socket-API reconciliation)
|
||||
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews)
|
||||
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews,
|
||||
CAN device driver interface, MSCAN driver)
|
||||
Robert Schwebel (design reviews, PTXdist integration)
|
||||
Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers)
|
||||
Benedikt Spranger (reviews)
|
||||
Thomas Gleixner (LKML reviews, coding style, posting hints)
|
||||
Andrey Volkov (kernel subtree structure, ioctls, mscan driver)
|
||||
Andrey Volkov (kernel subtree structure, ioctls, MSCAN driver)
|
||||
Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003)
|
||||
Klaus Hitschler (PEAK driver integration)
|
||||
Uwe Koppe (CAN netdevices with PF_PACKET approach)
|
||||
Michael Schulze (driver layer loopback requirement, RT CAN drivers review)
|
||||
Pavel Pisa (Bit-timing calculation)
|
||||
Sascha Hauer (SJA1000 platform driver)
|
||||
Sebastian Haas (SJA1000 EMS PCI driver)
|
||||
Markus Plessing (SJA1000 EMS PCI driver)
|
||||
Per Dalen (SJA1000 Kvaser PCI driver)
|
||||
Sam Ravnborg (reviews, coding style, kbuild help)
|
||||
|
@ -129,7 +129,7 @@ PHY Link state polling
|
||||
----------------------
|
||||
|
||||
The driver keeps track of the link state and informs the network core
|
||||
about link (carrier) availablilty. This is managed by several methods
|
||||
about link (carrier) availability. This is managed by several methods
|
||||
depending on the version of the chip and on which PHY is being used.
|
||||
|
||||
For the internal PHY, the original (and currently default) method is
|
||||
|
76
Documentation/networking/ieee802154.txt
Normal file
76
Documentation/networking/ieee802154.txt
Normal file
@ -0,0 +1,76 @@
|
||||
|
||||
Linux IEEE 802.15.4 implementation
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
The Linux-ZigBee project goal is to provide complete implementation
|
||||
of IEEE 802.15.4 / ZigBee / 6LoWPAN protocols. IEEE 802.15.4 is a stack
|
||||
of protocols for organizing Low-Rate Wireless Personal Area Networks.
|
||||
|
||||
Currently only IEEE 802.15.4 layer is implemented. We have choosen
|
||||
to use plain Berkeley socket API, the generic Linux networking stack
|
||||
to transfer IEEE 802.15.4 messages and a special protocol over genetlink
|
||||
for configuration/management
|
||||
|
||||
|
||||
Socket API
|
||||
==========
|
||||
|
||||
int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
|
||||
.....
|
||||
|
||||
The address family, socket addresses etc. are defined in the
|
||||
include/net/ieee802154/af_ieee802154.h header or in the special header
|
||||
in our userspace package (see either linux-zigbee sourceforge download page
|
||||
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
|
||||
|
||||
One can use SOCK_RAW for passing raw data towards device xmit function. YMMV.
|
||||
|
||||
|
||||
MLME - MAC Level Management
|
||||
============================
|
||||
|
||||
Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands.
|
||||
See the include/net/ieee802154/nl802154.h header. Our userspace tools package
|
||||
(see above) provides CLI configuration utility for radio interfaces and simple
|
||||
coordinator for IEEE 802.15.4 networks as an example users of MLME protocol.
|
||||
|
||||
|
||||
Kernel side
|
||||
=============
|
||||
|
||||
Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
|
||||
1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
|
||||
exports MLME and data API.
|
||||
2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
|
||||
possibly with some kinds of acceleration like automatic CRC computation and
|
||||
comparation, automagic ACK handling, address matching, etc.
|
||||
|
||||
Those types of devices require different approach to be hooked into Linux kernel.
|
||||
|
||||
|
||||
HardMAC
|
||||
=======
|
||||
|
||||
See the header include/net/ieee802154/netdevice.h. You have to implement Linux
|
||||
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
|
||||
code via plain sk_buffs. The control block of sk_buffs will contain additional
|
||||
info as described in the struct ieee802154_mac_cb.
|
||||
|
||||
To hook the MLME interface you have to populate the ml_priv field of your
|
||||
net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are
|
||||
required.
|
||||
|
||||
We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c
|
||||
|
||||
|
||||
SoftMAC
|
||||
=======
|
||||
|
||||
We are going to provide intermediate layer impelementing IEEE 802.15.4 MAC
|
||||
in software. This is currently WIP.
|
||||
|
||||
See header include/net/ieee802154/mac802154.h and several drivers in
|
||||
drivers/ieee802154/
|
@ -168,7 +168,16 @@ tcp_dsack - BOOLEAN
|
||||
Allows TCP to send "duplicate" SACKs.
|
||||
|
||||
tcp_ecn - BOOLEAN
|
||||
Enable Explicit Congestion Notification in TCP.
|
||||
Enable Explicit Congestion Notification (ECN) in TCP. ECN is only
|
||||
used when both ends of the TCP flow support it. It is useful to
|
||||
avoid losses due to congestion (when the bottleneck router supports
|
||||
ECN).
|
||||
Possible values are:
|
||||
0 disable ECN
|
||||
1 ECN enabled
|
||||
2 Only server-side ECN enabled. If the other end does
|
||||
not support ECN, behavior is like with ECN disabled.
|
||||
Default: 2
|
||||
|
||||
tcp_fack - BOOLEAN
|
||||
Enable FACK congestion avoidance and fast retransmission.
|
||||
@ -1048,6 +1057,13 @@ disable_ipv6 - BOOLEAN
|
||||
address.
|
||||
Default: FALSE (enable IPv6 operation)
|
||||
|
||||
When this value is changed from 1 to 0 (IPv6 is being enabled),
|
||||
it will dynamically create a link-local address on the given
|
||||
interface and start Duplicate Address Detection, if necessary.
|
||||
|
||||
When this value is changed from 0 to 1 (IPv6 is being disabled),
|
||||
it will dynamically delete all address on the given interface.
|
||||
|
||||
accept_dad - INTEGER
|
||||
Whether to accept DAD (Duplicate Address Detection).
|
||||
0: Disable DAD
|
||||
|
@ -33,3 +33,40 @@ disable
|
||||
|
||||
A reboot is required to enable IPv6.
|
||||
|
||||
autoconf
|
||||
|
||||
Specifies whether to enable IPv6 address autoconfiguration
|
||||
on all interfaces. This might be used when one does not wish
|
||||
for addresses to be automatically generated from prefixes
|
||||
received in Router Advertisements.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 address autoconfiguration is disabled on all interfaces.
|
||||
|
||||
Only the IPv6 loopback address (::1) and link-local addresses
|
||||
will be added to interfaces.
|
||||
|
||||
1
|
||||
IPv6 address autoconfiguration is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
disable_ipv6
|
||||
|
||||
Specifies whether to disable IPv6 on all interfaces.
|
||||
This might be used when no IPv6 addresses are desired.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
1
|
||||
IPv6 is disabled on all interfaces.
|
||||
|
||||
No IPv6 addresses will be added to interfaces.
|
||||
|
||||
|
@ -158,7 +158,7 @@ Sample Userspace Code
|
||||
}
|
||||
return 0;
|
||||
|
||||
Miscellanous
|
||||
Miscellaneous
|
||||
============
|
||||
|
||||
The PPPoL2TP driver was developed as part of the OpenL2TP project by
|
||||
|
@ -12,38 +12,22 @@ following format:
|
||||
The radiotap format is discussed in
|
||||
./Documentation/networking/radiotap-headers.txt.
|
||||
|
||||
Despite 13 radiotap argument types are currently defined, most only make sense
|
||||
Despite many radiotap parameters being currently defined, most only make sense
|
||||
to appear on received packets. The following information is parsed from the
|
||||
radiotap headers and used to control injection:
|
||||
|
||||
* IEEE80211_RADIOTAP_RATE
|
||||
|
||||
rate in 500kbps units, automatic if invalid or not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_ANTENNA
|
||||
|
||||
antenna to use, automatic if not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_DBM_TX_POWER
|
||||
|
||||
transmit power in dBm, automatic if not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
|
||||
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
|
||||
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
|
||||
current fragmentation threshold. Note that
|
||||
this flag is only reliable when software
|
||||
fragmentation is enabled)
|
||||
current fragmentation threshold.
|
||||
|
||||
|
||||
The injection code can also skip all other currently defined radiotap fields
|
||||
facilitating replay of captured radiotap headers directly.
|
||||
|
||||
Here is an example valid radiotap header defining these three parameters
|
||||
Here is an example valid radiotap header defining some parameters
|
||||
|
||||
0x00, 0x00, // <-- radiotap version
|
||||
0x0b, 0x00, // <- radiotap header length
|
||||
@ -72,8 +56,8 @@ interface), along the following lines:
|
||||
...
|
||||
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
|
||||
|
||||
You can also find sources for a complete inject test applet here:
|
||||
You can also find a link to a complete inject application here:
|
||||
|
||||
http://penumbra.warmcat.com/_twk/tiki-index.php?page=packetspammer
|
||||
http://wireless.kernel.org/en/users/Documentation/packetspammer
|
||||
|
||||
Andy Green <andy@warmcat.com>
|
||||
|
@ -74,7 +74,7 @@ dev->hard_start_xmit:
|
||||
for this and return NETDEV_TX_LOCKED when the spin lock fails.
|
||||
The locking there should also properly protect against
|
||||
set_multicast_list. Note that the use of NETIF_F_LLTX is deprecated.
|
||||
Dont use it for new drivers.
|
||||
Don't use it for new drivers.
|
||||
|
||||
Context: Process with BHs disabled or BH (timer),
|
||||
will be called with interrupts disabled by netconsole.
|
||||
|
@ -38,9 +38,6 @@ ifinfomsg::if_flags & IFF_LOWER_UP:
|
||||
ifinfomsg::if_flags & IFF_DORMANT:
|
||||
Driver has signaled netif_dormant_on()
|
||||
|
||||
These interface flags can also be queried without netlink using the
|
||||
SIOCGIFFLAGS ioctl.
|
||||
|
||||
TLV IFLA_OPERSTATE
|
||||
|
||||
contains RFC2863 state of the interface in numeric representation:
|
||||
|
@ -4,16 +4,18 @@
|
||||
|
||||
This file documents the CONFIG_PACKET_MMAP option available with the PACKET
|
||||
socket interface on 2.4 and 2.6 kernels. This type of sockets is used for
|
||||
capture network traffic with utilities like tcpdump or any other that uses
|
||||
the libpcap library.
|
||||
|
||||
You can find the latest version of this document at
|
||||
capture network traffic with utilities like tcpdump or any other that needs
|
||||
raw access to network interface.
|
||||
|
||||
You can find the latest version of this document at:
|
||||
http://pusa.uv.es/~ulisses/packet_mmap/
|
||||
|
||||
Please send me your comments to
|
||||
Howto can be found at:
|
||||
http://wiki.gnu-log.net (packet_mmap)
|
||||
|
||||
Please send your comments to
|
||||
Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
|
||||
Johann Baudy <johann.baudy@gnu-log.net>
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
+ Why use PACKET_MMAP
|
||||
@ -25,19 +27,24 @@ to capture each packet, it requires two if you want to get packet's
|
||||
timestamp (like libpcap always does).
|
||||
|
||||
In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size
|
||||
configurable circular buffer mapped in user space. This way reading packets just
|
||||
needs to wait for them, most of the time there is no need to issue a single
|
||||
system call. By using a shared buffer between the kernel and the user
|
||||
also has the benefit of minimizing packet copies.
|
||||
configurable circular buffer mapped in user space that can be used to either
|
||||
send or receive packets. This way reading packets just needs to wait for them,
|
||||
most of the time there is no need to issue a single system call. Concerning
|
||||
transmission, multiple packets can be sent through one system call to get the
|
||||
highest bandwidth.
|
||||
By using a shared buffer between the kernel and the user also has the benefit
|
||||
of minimizing packet copies.
|
||||
|
||||
It's fine to use PACKET_MMAP to improve the performance of the capture process,
|
||||
but it isn't everything. At least, if you are capturing at high speeds (this
|
||||
is relative to the cpu speed), you should check if the device driver of your
|
||||
network interface card supports some sort of interrupt load mitigation or
|
||||
(even better) if it supports NAPI, also make sure it is enabled.
|
||||
It's fine to use PACKET_MMAP to improve the performance of the capture and
|
||||
transmission process, but it isn't everything. At least, if you are capturing
|
||||
at high speeds (this is relative to the cpu speed), you should check if the
|
||||
device driver of your network interface card supports some sort of interrupt
|
||||
load mitigation or (even better) if it supports NAPI, also make sure it is
|
||||
enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
|
||||
supported by devices of your network.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP
|
||||
+ How to use CONFIG_PACKET_MMAP to improve capture process
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
From the user standpoint, you should use the higher level libpcap library, which
|
||||
@ -57,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP
|
||||
support.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP directly
|
||||
+ How to use CONFIG_PACKET_MMAP directly to improve capture process
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
From the system calls stand point, the use of PACKET_MMAP involves
|
||||
@ -66,6 +73,7 @@ the following process:
|
||||
|
||||
[setup] socket() -------> creation of the capture socket
|
||||
setsockopt() ---> allocation of the circular buffer (ring)
|
||||
option: PACKET_RX_RING
|
||||
mmap() ---------> mapping of the allocated buffer to the
|
||||
user process
|
||||
|
||||
@ -96,6 +104,65 @@ Next I will describe PACKET_MMAP settings and it's constraints,
|
||||
also the mapping of the circular buffer in the user process and
|
||||
the use of this buffer.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP directly to improve transmission process
|
||||
--------------------------------------------------------------------------------
|
||||
Transmission process is similar to capture as shown below.
|
||||
|
||||
[setup] socket() -------> creation of the transmission socket
|
||||
setsockopt() ---> allocation of the circular buffer (ring)
|
||||
option: PACKET_TX_RING
|
||||
bind() ---------> bind transmission socket with a network interface
|
||||
mmap() ---------> mapping of the allocated buffer to the
|
||||
user process
|
||||
|
||||
[transmission] poll() ---------> wait for free packets (optional)
|
||||
send() ---------> send all packets that are set as ready in
|
||||
the ring
|
||||
The flag MSG_DONTWAIT can be used to return
|
||||
before end of transfer.
|
||||
|
||||
[shutdown] close() --------> destruction of the transmission socket and
|
||||
deallocation of all associated resources.
|
||||
|
||||
Binding the socket to your network interface is mandatory (with zero copy) to
|
||||
know the header size of frames used in the circular buffer.
|
||||
|
||||
As capture, each frame contains two parts:
|
||||
|
||||
--------------------
|
||||
| struct tpacket_hdr | Header. It contains the status of
|
||||
| | of this frame
|
||||
|--------------------|
|
||||
| data buffer |
|
||||
. . Data that will be sent over the network interface.
|
||||
. .
|
||||
--------------------
|
||||
|
||||
bind() associates the socket to your network interface thanks to
|
||||
sll_ifindex parameter of struct sockaddr_ll.
|
||||
|
||||
Initialization example:
|
||||
|
||||
struct sockaddr_ll my_addr;
|
||||
struct ifreq s_ifr;
|
||||
...
|
||||
|
||||
strncpy (s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name));
|
||||
|
||||
/* get interface index of eth0 */
|
||||
ioctl(this->socket, SIOCGIFINDEX, &s_ifr);
|
||||
|
||||
/* fill sockaddr_ll struct to prepare binding */
|
||||
my_addr.sll_family = AF_PACKET;
|
||||
my_addr.sll_protocol = ETH_P_ALL;
|
||||
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
|
||||
|
||||
/* bind socket to eth0 */
|
||||
bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));
|
||||
|
||||
A complete tutorial is available at: http://wiki.gnu-log.net/
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ PACKET_MMAP settings
|
||||
--------------------------------------------------------------------------------
|
||||
@ -103,7 +170,10 @@ the use of this buffer.
|
||||
|
||||
To setup PACKET_MMAP from user level code is done with a call like
|
||||
|
||||
- Capture process
|
||||
setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req))
|
||||
- Transmission process
|
||||
setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req))
|
||||
|
||||
The most significant argument in the previous call is the req parameter,
|
||||
this parameter must to have the following structure:
|
||||
@ -117,11 +187,11 @@ this parameter must to have the following structure:
|
||||
};
|
||||
|
||||
This structure is defined in /usr/include/linux/if_packet.h and establishes a
|
||||
circular buffer (ring) of unswappable memory mapped in the capture process.
|
||||
circular buffer (ring) of unswappable memory.
|
||||
Being mapped in the capture process allows reading the captured frames and
|
||||
related meta-information like timestamps without requiring a system call.
|
||||
|
||||
Captured frames are grouped in blocks. Each block is a physically contiguous
|
||||
Frames are grouped in blocks. Each block is a physically contiguous
|
||||
region of memory and holds tp_block_size/tp_frame_size frames. The total number
|
||||
of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because
|
||||
|
||||
@ -336,6 +406,7 @@ struct tpacket_hdr). If this field is 0 means that the frame is ready
|
||||
to be used for the kernel, If not, there is a frame the user can read
|
||||
and the following flags apply:
|
||||
|
||||
+++ Capture process:
|
||||
from include/linux/if_packet.h
|
||||
|
||||
#define TP_STATUS_COPY 2
|
||||
@ -391,6 +462,37 @@ packets are in the ring:
|
||||
It doesn't incur in a race condition to first check the status value and
|
||||
then poll for frames.
|
||||
|
||||
|
||||
++ Transmission process
|
||||
Those defines are also used for transmission:
|
||||
|
||||
#define TP_STATUS_AVAILABLE 0 // Frame is available
|
||||
#define TP_STATUS_SEND_REQUEST 1 // Frame will be sent on next send()
|
||||
#define TP_STATUS_SENDING 2 // Frame is currently in transmission
|
||||
#define TP_STATUS_WRONG_FORMAT 4 // Frame format is not correct
|
||||
|
||||
First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a
|
||||
packet, the user fills a data buffer of an available frame, sets tp_len to
|
||||
current data buffer size and sets its status field to TP_STATUS_SEND_REQUEST.
|
||||
This can be done on multiple frames. Once the user is ready to transmit, it
|
||||
calls send(). Then all buffers with status equal to TP_STATUS_SEND_REQUEST are
|
||||
forwarded to the network device. The kernel updates each status of sent
|
||||
frames with TP_STATUS_SENDING until the end of transfer.
|
||||
At the end of each transfer, buffer status returns to TP_STATUS_AVAILABLE.
|
||||
|
||||
header->tp_len = in_i_size;
|
||||
header->tp_status = TP_STATUS_SEND_REQUEST;
|
||||
retval = send(this->socket, NULL, 0, 0);
|
||||
|
||||
The user can also use poll() to check if a buffer is available:
|
||||
(status == TP_STATUS_SENDING)
|
||||
|
||||
struct pollfd pfd;
|
||||
pfd.fd = fd;
|
||||
pfd.revents = 0;
|
||||
pfd.events = POLLOUT;
|
||||
retval = poll(&pfd, 1, timeout);
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ THANKS
|
||||
--------------------------------------------------------------------------------
|
||||
|
@ -36,7 +36,7 @@ Phonet packets have a common header as follows:
|
||||
On Linux, the link-layer header includes the pn_media byte (see below).
|
||||
The next 7 bytes are part of the network-layer header.
|
||||
|
||||
The device ID is split: the 6 higher-order bits consitute the device
|
||||
The device ID is split: the 6 higher-order bits constitute the device
|
||||
address, while the 2 lower-order bits are used for multiplexing, as are
|
||||
the 8-bit object identifiers. As such, Phonet can be considered as a
|
||||
network layer with 6 bits of address space and 10 bits for transport
|
||||
|
@ -89,7 +89,7 @@ added to this document when its support is enabled.
|
||||
Device drivers who provide their own built regulatory domain
|
||||
do not need a callback as the channels registered by them are
|
||||
the only ones that will be allowed and therefore *additional*
|
||||
cannels cannot be enabled.
|
||||
channels cannot be enabled.
|
||||
|
||||
Example code - drivers hinting an alpha2:
|
||||
------------------------------------------
|
||||
|
@ -75,9 +75,6 @@ may need to apply in domain-specific ways to their devices:
|
||||
struct bus_type {
|
||||
...
|
||||
int (*suspend)(struct device *dev, pm_message_t state);
|
||||
int (*suspend_late)(struct device *dev, pm_message_t state);
|
||||
|
||||
int (*resume_early)(struct device *dev);
|
||||
int (*resume)(struct device *dev);
|
||||
};
|
||||
|
||||
@ -226,20 +223,7 @@ The phases are seen by driver notifications issued in this order:
|
||||
|
||||
This call should handle parts of device suspend logic that require
|
||||
sleeping. It probably does work to quiesce the device which hasn't
|
||||
been abstracted into class.suspend() or bus.suspend_late().
|
||||
|
||||
3 bus.suspend_late(dev, message) is called with IRQs disabled, and
|
||||
with only one CPU active. Until the bus.resume_early() phase
|
||||
completes (see later), IRQs are not enabled again. This method
|
||||
won't be exposed by all busses; for message based busses like USB,
|
||||
I2C, or SPI, device interactions normally require IRQs. This bus
|
||||
call may be morphed into a driver call with bus-specific parameters.
|
||||
|
||||
This call might save low level hardware state that might otherwise
|
||||
be lost in the upcoming low power state, and actually put the
|
||||
device into a low power state ... so that in some cases the device
|
||||
may stay partly usable until this late. This "late" call may also
|
||||
help when coping with hardware that behaves badly.
|
||||
been abstracted into class.suspend().
|
||||
|
||||
The pm_message_t parameter is currently used to refine those semantics
|
||||
(described later).
|
||||
@ -351,19 +335,11 @@ devices processing each phase's calls before the next phase begins.
|
||||
|
||||
The phases are seen by driver notifications issued in this order:
|
||||
|
||||
1 bus.resume_early(dev) is called with IRQs disabled, and with
|
||||
only one CPU active. As with bus.suspend_late(), this method
|
||||
won't be supported on busses that require IRQs in order to
|
||||
interact with devices.
|
||||
1 bus.resume(dev) reverses the effects of bus.suspend(). This may
|
||||
be morphed into a device driver call with bus-specific parameters;
|
||||
implementations may sleep.
|
||||
|
||||
This reverses the effects of bus.suspend_late().
|
||||
|
||||
2 bus.resume(dev) is called next. This may be morphed into a device
|
||||
driver call with bus-specific parameters; implementations may sleep.
|
||||
|
||||
This reverses the effects of bus.suspend().
|
||||
|
||||
3 class.resume(dev) is called for devices associated with a class
|
||||
2 class.resume(dev) is called for devices associated with a class
|
||||
that has such a method. Implementations may sleep.
|
||||
|
||||
This reverses the effects of class.suspend(), and would usually
|
||||
|
@ -178,5 +178,5 @@ Consumers can uregister interest by calling :-
|
||||
int regulator_unregister_notifier(struct regulator *regulator,
|
||||
struct notifier_block *nb);
|
||||
|
||||
Regulators use the kernel notifier framework to send event to thier interested
|
||||
Regulators use the kernel notifier framework to send event to their interested
|
||||
consumers.
|
||||
|
Some files were not shown because too many files have changed in this diff Show More
Loading…
x
Reference in New Issue
Block a user