mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-01-15 17:43:59 +00:00
Merge branch 'linus' into sched/core
Merge reason: we will merge a dependent patch.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
commit
348b346b23
4
CREDITS
@ -1253,6 +1253,10 @@ S: 8124 Constitution Apt. 7
|
||||
S: Sterling Heights, Michigan 48313
|
||||
S: USA
|
||||
|
||||
N: Wolfgang Grandegger
|
||||
E: wg@grandegger.com
|
||||
D: Controller Area Network (device drivers)
|
||||
|
||||
N: William Greathouse
|
||||
E: wgreathouse@smva.com
|
||||
E: wgreathouse@myfavoritei.com
|
||||
|
@ -122,3 +122,10 @@ Description:
|
||||
This symbolic link appears when a device is a Virtual Function.
|
||||
The symbolic link points to the PCI device sysfs entry of the
|
||||
Physical Function this device associates with.
|
||||
|
||||
What: /sys/bus/pci/slots/.../module
|
||||
Date: June 2009
|
||||
Contact: linux-pci@vger.kernel.org
|
||||
Description:
|
||||
This symbolic link points to the PCI hotplug controller driver
|
||||
module that manages the hotplug slot.
|
||||
|
125
Documentation/ABI/testing/sysfs-class-mtd
Normal file
@ -0,0 +1,125 @@
|
||||
What: /sys/class/mtd/
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
The mtd/ class subdirectory belongs to the MTD subsystem
|
||||
(MTD core).
|
||||
|
||||
What: /sys/class/mtd/mtdX/
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
The /sys/class/mtd/mtd{0,1,2,3,...} directories correspond
|
||||
to each /dev/mtdX character device. These may represent
|
||||
physical/simulated flash devices, partitions on a flash
|
||||
device, or concatenated flash devices. They exist regardless
|
||||
of whether CONFIG_MTD_CHAR is actually enabled.
|
||||
|
||||
What: /sys/class/mtd/mtdXro/
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
These directories provide the corresponding read-only device
|
||||
nodes for /sys/class/mtd/mtdX/ . They are only created
|
||||
(for the benefit of udev) if CONFIG_MTD_CHAR is enabled.
|
||||
|
||||
What: /sys/class/mtd/mtdX/dev
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
Major and minor numbers of the character device corresponding
|
||||
to this MTD device (in <major>:<minor> format). This is the
|
||||
read-write device so <minor> will be even.
|
||||
|
||||
What: /sys/class/mtd/mtdXro/dev
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
Major and minor numbers of the character device corresponding
|
||||
to the read-only variant of this MTD device (in
|
||||
<major>:<minor> format). In this case <minor> will be odd.
|
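The even/odd rule stated in the two dev entries above can be checked from user space. The following is a minimal Python sketch, assuming the attribute content has exactly the <major>:<minor> form described above; the function name parse_mtd_dev and the sample values are illustrative and not part of the kernel interface.

```python
def parse_mtd_dev(text):
    """Split a '<major>:<minor>' string into (major, minor, is_read_only)."""
    major_str, minor_str = text.strip().split(":")
    major, minor = int(major_str), int(minor_str)
    # Per the descriptions above: an even minor is the read-write node,
    # an odd minor is the read-only node.
    return major, minor, minor % 2 == 1


# Sample strings only; real values come from the sysfs "dev" attribute.
print(parse_mtd_dev("90:0"))
print(parse_mtd_dev("90:1"))
```

In practice the string would be read from the dev attribute of an mtdX or mtdXro directory instead of being hard-coded.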
||||
|
||||
What: /sys/class/mtd/mtdX/erasesize
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
"Major" erase size for the device. If numeraseregions is
|
||||
zero, this is the eraseblock size for the entire device.
|
||||
Otherwise, the MEMGETREGIONCOUNT/MEMGETREGIONINFO ioctls
|
||||
can be used to determine the actual eraseblock layout.
|
||||
|
||||
What: /sys/class/mtd/mtdX/flags
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
A hexadecimal value representing the device flags, ORed
|
||||
together:
|
||||
|
||||
0x0400: MTD_WRITEABLE - device is writable
|
||||
0x0800: MTD_BIT_WRITEABLE - single bits can be flipped
|
||||
0x1000: MTD_NO_ERASE - no erase necessary
|
||||
0x2000: MTD_POWERUP_LOCK - always locked after reset
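Since the flags attribute is a hexadecimal OR of the bits listed above, a reader can be decoded with a small lookup table. This is a sketch only: the table copies the four values from the list above, while the function name decode_mtd_flags and the sample inputs are assumptions made for illustration.

```python
# Bit values copied from the flags list above.
MTD_FLAGS = {
    0x0400: "MTD_WRITEABLE",
    0x0800: "MTD_BIT_WRITEABLE",
    0x1000: "MTD_NO_ERASE",
    0x2000: "MTD_POWERUP_LOCK",
}


def decode_mtd_flags(text):
    """Return the names of all flag bits set in a hexadecimal string."""
    value = int(text.strip(), 16)  # accepts "c00" as well as "0xc00"
    return [name for bit, name in MTD_FLAGS.items() if value & bit]


# Sample value: the writable and bit-writable flags ORed together.
print(decode_mtd_flags("0xc00"))
```

Bits not present in the table are silently ignored, so the sketch keeps working if the file reports flags that the list above does not describe.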
|
||||
|
||||
What: /sys/class/mtd/mtdX/name
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
A human-readable ASCII name for the device or partition.
|
||||
This will match the name in /proc/mtd .
|
||||
|
||||
What: /sys/class/mtd/mtdX/numeraseregions
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
For devices that have variable eraseblock sizes, this
|
||||
provides the total number of erase regions. Otherwise,
|
||||
it will read back as zero.
|
||||
|
||||
What: /sys/class/mtd/mtdX/oobsize
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
Number of OOB bytes per page.
|
||||
|
||||
What: /sys/class/mtd/mtdX/size
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
Total size of the device/partition, in bytes.
|
||||
|
||||
What: /sys/class/mtd/mtdX/type
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
One of the following ASCII strings, representing the device
|
||||
type:
|
||||
|
||||
absent, ram, rom, nor, nand, dataflash, ubi, unknown
|
||||
|
||||
What: /sys/class/mtd/mtdX/writesize
|
||||
Date: April 2009
|
||||
KernelVersion: 2.6.29
|
||||
Contact: linux-mtd@lists.infradead.org
|
||||
Description:
|
||||
Minimal writable flash unit size. This will always be
|
||||
a positive integer.
|
||||
|
||||
In the case of NOR flash it is 1 (even though individual
|
||||
bits can be cleared).
|
||||
|
||||
In the case of NAND flash it is one NAND page (or a
|
||||
half page, or a quarter page).
|
||||
|
||||
In the case of ECC NOR, it is the ECC block size.
|
@ -79,3 +79,13 @@ Description:
|
||||
This file is read-only and shows the number of
|
||||
kilobytes of data that have been written to this
|
||||
filesystem since it was mounted.
|
||||
|
||||
What: /sys/fs/ext4/<disk>/inode_goal
|
||||
Date: June 2008
|
||||
Contact: "Theodore Ts'o" <tytso@mit.edu>
|
||||
Description:
|
||||
Tuning parameter which (if non-zero) controls the goal
|
||||
inode used by the inode allocator in preference to
|
||||
all other allocation heuristics. This is intended for
|
||||
debugging use only, and should be 0 on production
|
||||
systems.
|
||||
|
73
Documentation/ABI/testing/sysfs-pps
Normal file
@ -0,0 +1,73 @@
|
||||
What: /sys/class/pps/
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ directory will contain files and
|
||||
directories that will provide a unified interface to
|
||||
the PPS sources.
|
||||
|
||||
What: /sys/class/pps/ppsX/
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/ directory corresponds to the X-th
|
||||
PPS source in the system. Each directory will
|
||||
contain files to manage and control its PPS source.
|
||||
|
||||
What: /sys/class/pps/ppsX/assert
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/assert file reports the assert events
|
||||
and the assert sequence number of the X-th source in the form:
|
||||
|
||||
<secs>.<nsec>#<sequence>
|
||||
|
||||
If the source has no assert events the content of this file
|
||||
is empty.
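The <secs>.<nsec>#<sequence> form described above is easy to split mechanically. Below is a minimal Python sketch, assuming the file content follows exactly that form or is empty when no event has occurred; the function name parse_pps_event and the sample string are hypothetical.

```python
def parse_pps_event(text):
    """Parse '<secs>.<nsec>#<sequence>' into a tuple of three integers.

    Returns None for empty content, which the description above says
    means the source has had no such event yet.
    """
    text = text.strip()
    if not text:
        return None
    timestamp, sequence = text.split("#")
    secs, nsec = timestamp.split(".")
    return int(secs), int(nsec), int(sequence)


# Sample string only; real content comes from the sysfs file.
print(parse_pps_event("1234567890.000123456#42"))
```

Keeping seconds and nanoseconds as separate integers avoids the precision loss of converting the timestamp to a float.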
|
||||
|
||||
What: /sys/class/pps/ppsX/clear
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/clear file reports the clear events
|
||||
and the clear sequence number of the X-th source in the form:
|
||||
|
||||
<secs>.<nsec>#<sequence>
|
||||
|
||||
If the source has no clear events the content of this file
|
||||
is empty.
|
||||
|
||||
What: /sys/class/pps/ppsX/mode
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/mode file reports the functioning
|
||||
mode of the X-th source in hexadecimal encoding.
|
||||
|
||||
Please refer to linux/include/linux/pps.h for further
|
||||
info.
|
||||
|
||||
What: /sys/class/pps/ppsX/echo
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/echo file reports if the X-th source does
|
||||
or does not support an "echo" function.
|
||||
|
||||
What: /sys/class/pps/ppsX/name
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/name file reports the name of the
|
||||
X-th source.
|
||||
|
||||
What: /sys/class/pps/ppsX/path
|
||||
Date: February 2008
|
||||
Contact: Rodolfo Giometti <giometti@linux.it>
|
||||
Description:
|
||||
The /sys/class/pps/ppsX/path file reports the path name of
|
||||
the device connected with the X-th source.
|
||||
|
||||
If the source is not connected with any device the content
|
||||
of this file is empty.
|
@ -72,6 +72,13 @@ assembling the 16-bit boot code, removing the need for as86 to compile
|
||||
your kernel. This change does, however, mean that you need a recent
|
||||
release of binutils.
|
||||
|
||||
Perl
|
||||
----
|
||||
|
||||
You will need perl 5 and the following modules: Getopt::Long, Getopt::Std,
|
||||
File::Basename, and File::Find to build the kernel.
|
||||
|
||||
|
||||
System utilities
|
||||
================
|
||||
|
||||
|
@ -106,7 +106,7 @@
|
||||
number of errors are printk'ed including a full stack trace.
|
||||
</para>
|
||||
<para>
|
||||
The statistics are available via debugfs/debug_objects/stats.
|
||||
The statistics are available via /sys/kernel/debug/debug_objects/stats.
|
||||
They provide information about the number of warnings and the
|
||||
number of successful fixups along with information about the
|
||||
usage of the internal tracking objects and the state of the
|
||||
|
@ -145,7 +145,6 @@ usage should require reading the full document.
|
||||
interface in STA mode at first!
|
||||
</para>
|
||||
!Finclude/net/mac80211.h ieee80211_if_init_conf
|
||||
!Finclude/net/mac80211.h ieee80211_if_conf
|
||||
</chapter>
|
||||
|
||||
<chapter id="rx-tx">
|
||||
|
@ -61,6 +61,10 @@ be initiated although firmwares have no _OSC support. To enable the
|
||||
workaround, please add aerdriver.forceload=y to kernel boot parameter line
|
||||
when booting kernel. Note that forceload=n by default.
|
||||
|
||||
nosourceid, another parameter of type bool, can be used when broken
|
||||
hardware (mostly chipsets) has root ports that cannot obtain the reporting
|
||||
source ID. nosourceid=n by default.
|
||||
|
||||
2.3 AER error output
|
||||
When a PCI-E AER error is captured, an error message will be output to
|
||||
console. If it's a correctable error, it is output as a warning.
|
||||
@ -246,3 +250,24 @@ with the PCI Express AER Root driver?
|
||||
A: It could call the helper functions to enable AER in devices and
|
||||
clean up the uncorrectable status register. Please refer to section 3.3.
|
||||
|
||||
|
||||
4. Software error injection
|
||||
|
||||
Debugging PCIE AER error recovery code is quite difficult because it
|
||||
is hard to trigger real hardware errors. Software based error
|
||||
injection can be used to fake various kinds of PCIE errors.
|
||||
|
||||
First you should enable PCIE AER software error injection in kernel
|
||||
configuration, that is, the following item should be in your .config.
|
||||
|
||||
CONFIG_PCIEAER_INJECT=y or CONFIG_PCIEAER_INJECT=m
|
||||
|
||||
After rebooting with the new kernel or inserting the module, a device file named
|
||||
/dev/aer_inject should be created.
|
||||
|
||||
Then, you need a user space tool named aer-inject, which can be obtained
|
||||
from:
|
||||
http://www.kernel.org/pub/linux/utils/pci/aer-inject/
|
||||
|
||||
More information about aer-inject can be found in the documentation that comes
|
||||
with its source code.
|
||||
|
@ -54,7 +54,7 @@ kernel patches.
|
||||
CONFIG_PREEMPT.
|
||||
|
||||
14: If the patch affects IO/Disk, etc: has been tested with and without
|
||||
CONFIG_LBD.
|
||||
CONFIG_LBDAF.
|
||||
|
||||
15: All codepaths have been exercised with all lockdep features enabled.
|
||||
|
||||
|
@ -246,7 +246,8 @@ void print_ioacct(struct taskstats *t)
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
int c, rc, rep_len, aggr_len, len2, cmd_type;
|
||||
int c, rc, rep_len, aggr_len, len2;
|
||||
int cmd_type = TASKSTATS_CMD_ATTR_UNSPEC;
|
||||
__u16 id;
|
||||
__u32 mypid;
|
||||
|
||||
|
@ -229,10 +229,10 @@ kernel. It is the use of atomic counters to implement reference
|
||||
counting, and it works such that once the counter falls to zero it can
|
||||
be guaranteed that no other entity can be accessing the object:
|
||||
|
||||
static void obj_list_add(struct obj *obj)
|
||||
static void obj_list_add(struct obj *obj, struct list_head *head)
|
||||
{
|
||||
obj->active = 1;
|
||||
list_add(&obj->list);
|
||||
list_add(&obj->list, head);
|
||||
}
|
||||
|
||||
static void obj_list_del(struct obj *obj)
|
||||
|
@ -117,7 +117,7 @@ Using the pktcdvd debugfs interface
|
||||
|
||||
To read pktcdvd device infos in human readable form, do:
|
||||
|
||||
# cat /debug/pktcdvd/pktcdvd[0-7]/info
|
||||
# cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info
|
||||
|
||||
For a description of the debugfs interface look into the file:
|
||||
|
||||
|
@ -152,14 +152,19 @@ When swap is accounted, following files are added.
|
||||
|
||||
usage of mem+swap is limited by memsw.limit_in_bytes.
|
||||
|
||||
Note: why 'mem+swap' rather than swap.
|
||||
* why 'mem+swap' rather than swap.
|
||||
The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
|
||||
to move account from memory to swap...there is no change in usage of
|
||||
mem+swap.
|
||||
mem+swap. In other words, when we want to limit the usage of swap without
|
||||
affecting global LRU, mem+swap limit is better than just limiting swap from
|
||||
OS point of view.
|
||||
|
||||
In other words, when we want to limit the usage of swap without affecting
|
||||
global LRU, mem+swap limit is better than just limiting swap from OS point
|
||||
of view.
|
||||
* What happens when a cgroup hits memory.memsw.limit_in_bytes
|
||||
When a cgroup hits memory.memsw.limit_in_bytes, it's useless to do swap-out
|
||||
in this cgroup. Then, swap-out will not be done by cgroup routine and file
|
||||
caches are dropped. But as mentioned above, global LRU can swap out memory
|
||||
from it for sanity of the system's memory management state. You can't forbid
|
||||
it by cgroup.
|
||||
|
||||
2.5 Reclaim
|
||||
|
||||
@ -204,6 +209,7 @@ We can alter the memory limit:
|
||||
|
||||
NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
|
||||
mega or gigabytes.
|
||||
NOTE: We can write "-1" to reset the *.limit_in_bytes to unlimited.
|
||||
|
||||
# cat /cgroups/0/memory.limit_in_bytes
|
||||
4194304
|
||||
|
@ -41,6 +41,12 @@ void cn_test_callback(void *data)
|
||||
msg->seq, msg->ack, msg->len, (char *)msg->data);
|
||||
}
|
||||
|
||||
/*
|
||||
* Do not remove this function even if no one is using it as
|
||||
* this is an example of how to get notifications about new
|
||||
* connector user registration
|
||||
*/
|
||||
#if 0
|
||||
static int cn_test_want_notify(void)
|
||||
{
|
||||
struct cn_ctl_msg *ctl;
|
||||
@ -117,6 +123,7 @@ nlmsg_failure:
|
||||
kfree_skb(skb);
|
||||
return -EINVAL;
|
||||
}
|
||||
#endif
|
||||
|
||||
static u32 cn_test_timer_counter;
|
||||
static void cn_test_timer_func(unsigned long __data)
|
||||
|
@ -155,7 +155,7 @@ actual frequency must be determined using the following rules:
|
||||
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
|
||||
target_freq. ("H for highest, but no higher than")
|
||||
|
||||
Here again the frequency table helper might assist you - see section 3
|
||||
Here again the frequency table helper might assist you - see section 2
|
||||
for details.
|
||||
|
||||
|
||||
|
@ -119,10 +119,6 @@ want the kernel to look at the CPU usage and to make decisions on
|
||||
what to do about the frequency. Typically this is set to values of
|
||||
around '10000' or more. Its default value is (cmp. with users-guide.txt):
|
||||
transition_latency * 1000
|
||||
The lowest value you can set is:
|
||||
transition_latency * 100 or it may get restricted to a value where it
|
||||
makes not sense for the kernel anymore to poll that often which depends
|
||||
on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000).
|
||||
Be aware that transition latency is in ns and sampling_rate is in us, so you
|
||||
get the same sysfs value by default.
|
||||
Sampling rate should always get adjusted considering the transition latency
|
||||
@ -131,14 +127,20 @@ in the bash (as said, 1000 is default), do:
|
||||
echo $(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
|
||||
>ondemand/sampling_rate
|
||||
|
||||
show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
|
||||
You can use wider ranges now and the general
|
||||
cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
|
||||
used to obtain exactly the same info:
|
||||
show_sampling_rate_min = transtition_latency * 500 / 1000
|
||||
show_sampling_rate_max = transtition_latency * 500000 / 1000
|
||||
(divided by 1000 is to illustrate that sampling rate is in us and
|
||||
transition latency is exported ns).
|
||||
show_sampling_rate_min:
|
||||
The sampling rate is limited by the HW transition latency:
|
||||
transition_latency * 100
|
||||
Or by kernel restrictions:
|
||||
If CONFIG_NO_HZ is set, the limit is 10ms fixed.
|
||||
If CONFIG_NO_HZ is not set or the nohz=off boot parameter is used, the
|
||||
limits depend on the CONFIG_HZ option:
|
||||
HZ=1000: min=20000us (20ms)
|
||||
HZ=250: min=80000us (80ms)
|
||||
HZ=100: min=200000us (200ms)
|
||||
The highest value of kernel and HW latency restrictions is shown and
|
||||
used as the minimum sampling rate.
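The rule above (take the larger of the hardware limit and the kernel limit) can be written out as a short calculation. This Python sketch rests on two assumptions that the text does not state outright: the hardware limit is converted from ns to us by dividing by 1000, and the three HZ rows are generalized as 20 timer ticks (20,000,000 / HZ microseconds), which reproduces the listed values but is only an extrapolation for other HZ settings. The function and parameter names are illustrative.

```python
def min_sampling_rate_us(transition_latency_ns, hz, no_hz):
    """Estimate the minimum sampling rate in microseconds."""
    # Hardware limit: transition_latency * 100, with the latency given
    # in ns and the result wanted in us (assumed unit conversion).
    hw_limit_us = transition_latency_ns * 100 // 1000

    if no_hz:
        # With CONFIG_NO_HZ the text gives a fixed 10 ms limit.
        kernel_limit_us = 10000
    else:
        # Matches the table above: HZ=1000 -> 20000, HZ=250 -> 80000,
        # HZ=100 -> 200000 (assumed to be 20 ticks for any HZ).
        kernel_limit_us = 20 * 1000000 // hz

    # The higher of the two restrictions is used as the minimum.
    return max(hw_limit_us, kernel_limit_us)


print(min_sampling_rate_us(10000, 250, no_hz=False))
```

With a hypothetical 10000 ns transition latency the hardware limit is only 1000 us, so the kernel limit dominates in every configuration listed above.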
|
||||
|
||||
show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT.
|
||||
|
||||
up_threshold: defines what the average CPU usage between the samplings
|
||||
of 'sampling_rate' needs to be for the kernel to make a decision on
|
||||
|
@ -31,7 +31,6 @@ Contents:
|
||||
|
||||
3. How to change the CPU cpufreq policy and/or speed
|
||||
3.1 Preferred interface: sysfs
|
||||
3.2 Deprecated interfaces
|
||||
|
||||
|
||||
|
||||
|
54
Documentation/device-mapper/dm-log.txt
Normal file
@ -0,0 +1,54 @@
|
||||
Device-Mapper Logging
|
||||
=====================
|
||||
The device-mapper logging code is used by some of the device-mapper
|
||||
RAID targets to track regions of the disk that are not consistent.
|
||||
A region (or portion of the address space) of the disk may be
|
||||
inconsistent because a RAID stripe is currently being operated on or
|
||||
a machine died while the region was being altered. In the case of
|
||||
mirrors, a region would be considered dirty/inconsistent while you
|
||||
are writing to it because the writes need to be replicated for all
|
||||
the legs of the mirror and may not reach the legs at the same time.
|
||||
Once all writes are complete, the region is considered clean again.
|
||||
|
||||
There is a generic logging interface that the device-mapper RAID
|
||||
implementations use to perform logging operations (see
|
||||
dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different
|
||||
logging implementations are available and provide different
|
||||
capabilities. The list includes:
|
||||
|
||||
Type Files
|
||||
==== =====
|
||||
disk drivers/md/dm-log.c
|
||||
core drivers/md/dm-log.c
|
||||
userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h
|
||||
|
||||
The "disk" log type
|
||||
-------------------
|
||||
This log implementation commits the log state to disk. This way, the
|
||||
logging state survives reboots/crashes.
|
||||
|
||||
The "core" log type
|
||||
-------------------
|
||||
This log implementation keeps the log state in memory. The log state
|
||||
will not survive a reboot or crash, but there may be a small boost in
|
||||
performance. This method can also be used if no storage device is
|
||||
available for storing log state.
|
||||
|
||||
The "userspace" log type
|
||||
------------------------
|
||||
This log type simply provides a way to export the log API to userspace,
|
||||
so log implementations can be done there. This is done by forwarding most
|
||||
logging requests to userspace, where a daemon receives and processes the
|
||||
request.
|
||||
|
||||
The structures used for communication between kernel and userspace are
|
||||
located in include/linux/dm-log-userspace.h. Due to the frequency,
|
||||
diversity, and 2-way communication nature of the exchanges between
|
||||
kernel and userspace, 'connector' is used as the interface for
|
||||
communication.
|
||||
|
||||
There are currently two userspace log implementations that leverage this
|
||||
framework - "clustered_disk" and "clustered_core". These implementations
|
||||
provide a cluster-coherent log for shared-storage. Device-mapper mirroring
|
||||
can be used in a shared-storage environment when the cluster log implementations
|
||||
are employed.
|
39
Documentation/device-mapper/dm-queue-length.txt
Normal file
@ -0,0 +1,39 @@
|
||||
dm-queue-length
|
||||
===============
|
||||
|
||||
dm-queue-length is a path selector module for device-mapper targets,
|
||||
which selects a path with the least number of in-flight I/Os.
|
||||
The path selector name is 'queue-length'.
|
||||
|
||||
Table parameters for each path: [<repeat_count>]
|
||||
<repeat_count>: The number of I/Os to dispatch using the selected
|
||||
path before switching to the next path.
|
||||
If not given, internal default is used. To check
|
||||
the default value, see the activated table.
|
||||
|
||||
Status for each path: <status> <fail-count> <in-flight>
|
||||
<status>: 'A' if the path is active, 'F' if the path is failed.
|
||||
<fail-count>: The number of path failures.
|
||||
<in-flight>: The number of in-flight I/Os on the path.
|
||||
|
||||
|
||||
Algorithm
|
||||
=========
|
||||
|
||||
dm-queue-length increments/decrements 'in-flight' when an I/O is
|
||||
dispatched/completed respectively.
|
||||
dm-queue-length selects a path with the minimum 'in-flight'.
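The bookkeeping described above can be modelled in a few lines. This Python sketch is a user-space illustration only, not the kernel implementation: it ignores repeat_count and path failures, breaks ties by list order (an assumption), and uses the hypothetical class name QueueLengthSelector.

```python
class QueueLengthSelector:
    """Toy model of the queue-length selection rule."""

    def __init__(self, path_names):
        # One in-flight counter per path, starting at zero.
        self.in_flight = {name: 0 for name in path_names}

    def select(self):
        # Pick the path with the minimum number of in-flight I/Os.
        return min(self.in_flight, key=self.in_flight.get)

    def dispatch(self, name):
        # Called when an I/O is sent down a path.
        self.in_flight[name] += 1

    def complete(self, name):
        # Called when an I/O on a path finishes.
        self.in_flight[name] -= 1


selector = QueueLengthSelector(["8:0", "8:16"])
selector.dispatch("8:0")
print(selector.select())
```

After one I/O has been dispatched on 8:0, the idle path 8:16 has the smaller counter and is chosen next.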
|
||||
|
||||
|
||||
Examples
|
||||
========
|
||||
In case that 2 paths (sda and sdb) are used with repeat_count == 128.
|
||||
|
||||
# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \
|
||||
| dmsetup create test
|
||||
#
|
||||
# dmsetup table
|
||||
test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128
|
||||
#
|
||||
# dmsetup status
|
||||
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0
|
91
Documentation/device-mapper/dm-service-time.txt
Normal file
@ -0,0 +1,91 @@
|
||||
dm-service-time
|
||||
===============
|
||||
|
||||
dm-service-time is a path selector module for device-mapper targets,
|
||||
which selects a path with the shortest estimated service time for
|
||||
the incoming I/O.
|
||||
|
||||
The service time for each path is estimated by dividing the total size
|
||||
of in-flight I/Os on a path by the performance value of the path.
|
||||
The performance value is a relative throughput value among all paths
|
||||
in a path-group, and it can be specified as a table argument.
|
||||
|
||||
The path selector name is 'service-time'.
|
||||
|
||||
Table parameters for each path: [<repeat_count> [<relative_throughput>]]
|
||||
<repeat_count>: The number of I/Os to dispatch using the selected
|
||||
path before switching to the next path.
|
||||
If not given, internal default is used. To check
|
||||
the default value, see the activated table.
|
||||
<relative_throughput>: The relative throughput value of the path
|
||||
among all paths in the path-group.
|
||||
The valid range is 0-100.
|
||||
If not given, minimum value '1' is used.
|
||||
If '0' is given, the path isn't selected while
|
||||
other paths having a positive value are available.
|
||||
|
||||
Status for each path: <status> <fail-count> <in-flight-size> \
|
||||
<relative_throughput>
|
||||
<status>: 'A' if the path is active, 'F' if the path is failed.
|
||||
<fail-count>: The number of path failures.
|
||||
<in-flight-size>: The size of in-flight I/Os on the path.
|
||||
<relative_throughput>: The relative throughput value of the path
|
||||
among all paths in the path-group.
|
||||
|
||||
|
||||
Algorithm
|
||||
=========
|
||||
|
||||
dm-service-time adds the I/O size to 'in-flight-size' when the I/O is
|
||||
dispatched and subtracts when completed.
|
||||
Basically, dm-service-time selects a path having minimum service time
|
||||
which is calculated by:
|
||||
|
||||
('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput'
|
||||
|
||||
However, some optimizations below are used to reduce the calculation
|
||||
as much as possible.
|
||||
|
||||
1. If the paths have the same 'relative_throughput', skip
|
||||
the division and just compare the 'in-flight-size'.
|
||||
|
||||
2. If the paths have the same 'in-flight-size', skip the division
|
||||
and just compare the 'relative_throughput'.
|
||||
|
||||
3. If some paths have non-zero 'relative_throughput' and others
|
||||
have zero 'relative_throughput', ignore those paths with zero
|
||||
'relative_throughput'.
|
||||
|
||||
If such optimizations can't be applied, calculate service time, and
|
||||
compare service time.
|
||||
If calculated service time is equal, the path having maximum
|
||||
'relative_throughput' may be better. So compare 'relative_throughput'
|
||||
then.
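The selection rule above can be sketched in user space as follows. This is an illustration under stated assumptions, not the kernel code: it always evaluates the full formula with exact fractions instead of applying shortcuts 1 and 2 (which only avoid the division and give the same ordering), and when every path has zero 'relative_throughput' it falls back to comparing 'in-flight-size' alone, in line with shortcut 1. The names Path and select_path are hypothetical.

```python
from collections import namedtuple
from fractions import Fraction

Path = namedtuple("Path", "name in_flight_size relative_throughput")


def select_path(paths, incoming_size):
    """Return the path with the shortest estimated service time."""
    # Rule 3: ignore zero-throughput paths while positive ones exist.
    positive = [p for p in paths if p.relative_throughput > 0]
    candidates = positive if positive else list(paths)

    def sort_key(p):
        if p.relative_throughput == 0:
            # Only reached when all paths have zero throughput.
            return (p.in_flight_size, 0)
        service_time = Fraction(p.in_flight_size + incoming_size,
                                p.relative_throughput)
        # On equal service time, prefer the larger relative_throughput.
        return (service_time, -p.relative_throughput)

    return min(candidates, key=sort_key)


paths = [Path("sda", 0, 1), Path("sdb", 40, 4)]
print(select_path(paths, 8).name)
```

In this sample, sda has an estimated service time of 8/1 = 8 while sdb has (40 + 8)/4 = 12, so sda is chosen even though sdb has the higher throughput.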
|
||||
|
||||
|
||||
Examples
|
||||
========
|
||||
In case that 2 paths (sda and sdb) are used with repeat_count == 128
|
||||
and sda has an average throughput 1GB/s and sdb has 4GB/s,
|
||||
'relative_throughput' value may be '1' for sda and '4' for sdb.
|
||||
|
||||
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \
|
||||
| dmsetup create test
|
||||
#
|
||||
# dmsetup table
|
||||
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
|
||||
#
|
||||
# dmsetup status
|
||||
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4
|
||||
|
||||
|
||||
Or '2' for sda and '8' for sdb would also be valid.
|
||||
|
||||
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \
|
||||
| dmsetup create test
|
||||
#
|
||||
# dmsetup table
|
||||
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
|
||||
#
|
||||
# dmsetup status
|
||||
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8
|
@ -162,3 +162,35 @@ device_remove_file(dev,&dev_attr_power);
|
||||
|
||||
The file name will be 'power' with a mode of 0644 (-rw-r--r--).
|
||||
|
||||
Word of warning: While the kernel allows device_create_file() and
|
||||
device_remove_file() to be called on a device at any time, userspace has
|
||||
strict expectations on when attributes get created. When a new device is
|
||||
registered in the kernel, a uevent is generated to notify userspace (like
|
||||
udev) that a new device is available. If attributes are added after the
|
||||
device is registered, then userspace won't get notified and userspace will
|
||||
not know about the new attributes.
|
||||
|
||||
This is important for device drivers that need to publish additional
|
||||
attributes for a device at driver probe time. If the device driver simply
|
||||
calls device_create_file() on the device structure passed to it, then
|
||||
userspace will never be notified of the new attributes. Instead, it should
|
||||
probably use class_create() and class->dev_attrs to set up a list of
|
||||
desired attributes in the module_init function, and then in the .probe()
|
||||
hook, use device_create() to create a new device as a child
|
||||
of the probed device. The new device will generate a new uevent and
|
||||
properly advertise the new attributes to userspace.
|
||||
|
||||
For example, if a driver wanted to add the following attributes:
|
||||
struct device_attribute mydriver_attribs[] = {
|
||||
__ATTR(port_count, 0444, port_count_show, NULL),
|
||||
__ATTR(serial_number, 0444, serial_number_show, NULL),
|
||||
__ATTR_NULL
|
||||
};
|
||||
|
||||
Then in the module init function it would do:
|
||||
mydriver_class = class_create(THIS_MODULE, "my_attrs");
|
||||
mydriver_class->dev_attrs = mydriver_attribs;
|
||||
|
||||
And assuming 'dev' is the struct device passed into the probe hook, the driver
|
||||
probe function would do something like:
|
||||
device_create(mydriver_class, dev, chrdev, &private_data, "my_name");
|
||||
|
@ -112,7 +112,7 @@ sub tda10045 {
|
||||
|
||||
sub tda10046 {
|
||||
my $sourcefile = "TT_PCI_2.19h_28_11_2006.zip";
|
||||
my $url = "http://technotrend-online.com/download/software/219/$sourcefile";
|
||||
my $url = "http://www.tt-download.com/download/updates/219/$sourcefile";
|
||||
my $hash = "6a7e1e2f2644b162ff0502367553c72d";
|
||||
my $outfile = "dvb-fe-tda10046.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
@ -129,8 +129,8 @@ sub tda10046 {
|
||||
}
|
||||
|
||||
sub tda10046lifeview {
|
||||
my $sourcefile = "Drv_2.11.02.zip";
|
||||
my $url = "http://www.lifeview.com.tw/drivers/pci_card/FlyDVB-T/$sourcefile";
|
||||
my $sourcefile = "7%5Cdrv_2.11.02.zip";
|
||||
my $url = "http://www.lifeview.hk/dbimages/document/$sourcefile";
|
||||
my $hash = "1ea24dee4eea8fe971686981f34fd2e0";
|
||||
my $outfile = "dvb-fe-tda10046.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
@ -317,7 +317,7 @@ sub nxt2002 {
|
||||
|
||||
sub nxt2004 {
|
||||
my $sourcefile = "AVerTVHD_MCE_A180_Drv_v1.2.2.16.zip";
|
||||
my $url = "http://www.aver.com/support/Drivers/$sourcefile";
|
||||
my $url = "http://www.avermedia-usa.com/support/Drivers/$sourcefile";
|
||||
my $hash = "111cb885b1e009188346d72acfed024c";
|
||||
my $outfile = "dvb-fe-nxt2004.fw";
|
||||
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
|
||||
|
@ -29,16 +29,16 @@ o debugfs entries
|
||||
fault-inject-debugfs kernel module provides some debugfs entries for runtime
|
||||
configuration of fault-injection capabilities.
|
||||
|
||||
- /debug/fail*/probability:
|
||||
- /sys/kernel/debug/fail*/probability:
|
||||
|
||||
likelihood of failure injection, in percent.
|
||||
Format: <percent>
|
||||
|
||||
Note that one-failure-per-hundred is a very high error rate
|
||||
for some testcases. Consider setting probability=100 and configure
|
||||
/debug/fail*/interval for such testcases.
|
||||
/sys/kernel/debug/fail*/interval for such testcases.
|
||||
|
||||
- /debug/fail*/interval:
|
||||
- /sys/kernel/debug/fail*/interval:
|
||||
|
||||
specifies the interval between failures, for calls to
|
||||
should_fail() that pass all the other tests.
|
||||
@ -46,18 +46,18 @@ configuration of fault-injection capabilities.
|
||||
Note that if you enable this, by setting interval>1, you will
|
||||
probably want to set probability=100.
|
||||
|
||||
- /debug/fail*/times:
|
||||
- /sys/kernel/debug/fail*/times:
|
||||
|
||||
specifies how many times failures may happen at most.
|
||||
A value of -1 means "no limit".
|
||||
|
||||
- /debug/fail*/space:
|
||||
- /sys/kernel/debug/fail*/space:
|
||||
|
||||
specifies an initial resource "budget", decremented by "size"
|
||||
on each call to should_fail(,size). Failure injection is
|
||||
suppressed until "space" reaches zero.
|
||||
|
||||
- /debug/fail*/verbose
|
||||
- /sys/kernel/debug/fail*/verbose
|
||||
|
||||
Format: { 0 | 1 | 2 }
|
||||
specifies the verbosity of the messages when failure is
|
||||
@ -65,17 +65,17 @@ configuration of fault-injection capabilities.
|
||||
log line per failure; '2' will print a call trace too -- useful
|
||||
to debug the problems revealed by fault injection.
|
||||
|
||||
- /debug/fail*/task-filter:
|
||||
- /sys/kernel/debug/fail*/task-filter:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
A value of 'N' disables filtering by process (default).
|
||||
Any positive value limits failures to only processes indicated by
|
||||
/proc/<pid>/make-it-fail==1.
|
||||
|
||||
- /debug/fail*/require-start:
|
||||
- /debug/fail*/require-end:
|
||||
- /debug/fail*/reject-start:
|
||||
- /debug/fail*/reject-end:
|
||||
- /sys/kernel/debug/fail*/require-start:
|
||||
- /sys/kernel/debug/fail*/require-end:
|
||||
- /sys/kernel/debug/fail*/reject-start:
|
||||
- /sys/kernel/debug/fail*/reject-end:
|
||||
|
||||
specifies the range of virtual addresses tested during
|
||||
stacktrace walking. Failure is injected only if some caller
|
||||
@ -84,26 +84,26 @@ configuration of fault-injection capabilities.
|
||||
Default required range is [0,ULONG_MAX) (whole of virtual address space).
|
||||
Default rejected range is [0,0).
|
||||
|
||||
- /debug/fail*/stacktrace-depth:
|
||||
- /sys/kernel/debug/fail*/stacktrace-depth:
|
||||
|
||||
specifies the maximum stacktrace depth walked during search
|
||||
for a caller within [require-start,require-end) OR
|
||||
[reject-start,reject-end).
|
||||
|
||||
- /debug/fail_page_alloc/ignore-gfp-highmem:
|
||||
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
default is 'N', setting it to 'Y' won't inject failures into
|
||||
highmem/user allocations.
|
||||
|
||||
- /debug/failslab/ignore-gfp-wait:
|
||||
- /debug/fail_page_alloc/ignore-gfp-wait:
|
||||
- /sys/kernel/debug/failslab/ignore-gfp-wait:
|
||||
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
|
||||
|
||||
Format: { 'Y' | 'N' }
|
||||
default is 'N', setting it to 'Y' will inject failures
|
||||
only into non-sleep allocations (GFP_ATOMIC allocations).
|
||||
|
||||
- /debug/fail_page_alloc/min-order:
|
||||
- /sys/kernel/debug/fail_page_alloc/min-order:
|
||||
|
||||
specifies the minimum page allocation order for which failures are
injected.
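The probability, interval, times and space entries described above interact in a fixed order. The following Python sketch is a simplified model of that interaction, not the kernel's code: the FaultAttr class, its default values, and the boundary handling of "space" are assumptions made for illustration, and the task-filter and stacktrace checks are left out.

```python
import random
from dataclasses import dataclass


@dataclass
class FaultAttr:
    """Illustrative stand-in for one /sys/kernel/debug/fail*/ directory."""
    probability: int = 0  # likelihood of injection, in percent
    interval: int = 1     # inject only on every Nth eligible call
    times: int = 1        # failures still allowed, -1 means "no limit"
    space: int = 0        # budget that must be used up before injecting
    count: int = 0        # eligible calls seen so far (internal state)


def should_fail(attr: FaultAttr, size: int) -> bool:
    """Return True if this call should have a failure injected (model only)."""
    if attr.times == 0:
        return False                 # the failure limit has been reached
    if attr.space > size:
        attr.space -= size           # still consuming the initial budget
        return False
    if attr.interval > 1:
        attr.count += 1
        if attr.count % attr.interval:
            return False             # not the Nth eligible call yet
    if attr.probability <= random.randrange(100):
        return False                 # lost the percentage roll
    if attr.times != -1:
        attr.times -= 1              # account for this failure
    return True
```

With probability=100 the model is deterministic, which mirrors the advice above to combine probability=100 with interval when a fixed failure spacing is wanted.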
|
||||
@ -166,13 +166,13 @@ o Inject slab allocation failures into module init/exit code
|
||||
#!/bin/bash
|
||||
|
||||
FAILTYPE=failslab
|
||||
echo Y > /debug/$FAILTYPE/task-filter
|
||||
echo 10 > /debug/$FAILTYPE/probability
|
||||
echo 100 > /debug/$FAILTYPE/interval
|
||||
echo -1 > /debug/$FAILTYPE/times
|
||||
echo 0 > /debug/$FAILTYPE/space
|
||||
echo 2 > /debug/$FAILTYPE/verbose
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
|
||||
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
|
||||
echo -1 > /sys/kernel/debug/$FAILTYPE/times
|
||||
echo 0 > /sys/kernel/debug/$FAILTYPE/space
|
||||
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
|
||||
|
||||
faulty_system()
|
||||
{
|
||||
@ -217,20 +217,20 @@ then
|
||||
exit 1
|
||||
fi
|
||||
|
||||
cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start
|
||||
cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end
|
||||
cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
|
||||
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
|
||||
|
||||
echo N > /debug/$FAILTYPE/task-filter
|
||||
echo 10 > /debug/$FAILTYPE/probability
|
||||
echo 100 > /debug/$FAILTYPE/interval
|
||||
echo -1 > /debug/$FAILTYPE/times
|
||||
echo 0 > /debug/$FAILTYPE/space
|
||||
echo 2 > /debug/$FAILTYPE/verbose
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem
|
||||
echo 10 > /debug/$FAILTYPE/stacktrace-depth
|
||||
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
|
||||
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
|
||||
echo -1 > /sys/kernel/debug/$FAILTYPE/times
|
||||
echo 0 > /sys/kernel/debug/$FAILTYPE/space
|
||||
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
|
||||
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
|
||||
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
|
||||
|
||||
trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
|
||||
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
|
||||
|
||||
echo "Injecting errors into the module $module... (interrupt to stop)"
|
||||
sleep 1000000
|
||||
|
@ -95,7 +95,7 @@ There is no way to change the vesafb video mode and/or timings after
|
||||
booting linux. If you are not happy with the 60 Hz refresh rate, you
|
||||
have these options:
|
||||
|
||||
* configure and load the DOS-Tools for your the graphics board (if
|
||||
* configure and load the DOS-Tools for the graphics board (if
|
||||
available) and boot linux with loadlin.
|
||||
* use a native driver (matroxfb/atyfb) instead of vesafb. If none
|
||||
is available, write a new one!
|
||||
|
@ -6,6 +6,20 @@ be removed from this file.
|
||||
|
||||
---------------------------
|
||||
|
||||
What: IRQF_SAMPLE_RANDOM
|
||||
Check: IRQF_SAMPLE_RANDOM
|
||||
When: July 2009
|
||||
|
||||
Why: Many of IRQF_SAMPLE_RANDOM users are technically bogus as entropy
|
||||
sources in the kernel's current entropy model. To resolve this, every
|
||||
input point to the kernel's entropy pool needs to better document the
|
||||
type of entropy source it actually is. This will be replaced with
|
||||
additional add_*_randomness functions in drivers/char/random.c
|
||||
|
||||
Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: The ieee80211_regdom module parameter
|
||||
When: March 2010 / desktop catchup
|
||||
|
||||
@ -354,16 +368,6 @@ Who: Krzysztof Piotr Oledzki <ole@ans.pl>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client(),
|
||||
i2c_adapter->client_register(), i2c_adapter->client_unregister
|
||||
When: 2.6.30
|
||||
Check: i2c_attach_client i2c_detach_client
|
||||
Why: Deprecated by the new (standard) device driver binding model. Use
|
||||
i2c_driver->probe() and ->remove() instead.
|
||||
Who: Jean Delvare <khali@linux-fr.org>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: fscher and fscpos drivers
|
||||
When: June 2009
|
||||
Why: Deprecated by the new fschmd driver.
|
||||
@ -438,6 +442,13 @@ Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate
|
||||
Who: Jean Delvare <khali@linux-fr.org>
|
||||
Krzysztof Helt <krzysztof.h1@wp.pl>
|
||||
|
||||
---------------------------
|
||||
|
||||
What: CONFIG_RFKILL_INPUT
|
||||
When: 2.6.33
|
||||
Why: Should be implemented in userspace, policy daemon.
|
||||
Who: Johannes Berg <johannes@sipsolutions.net>
|
||||
|
||||
----------------------------
|
||||
|
||||
What: CONFIG_X86_OLD_MCE
|
||||
|
@ -66,6 +66,10 @@ mandatory-locking.txt
|
||||
- info on the Linux implementation of Sys V mandatory file locking.
|
||||
ncpfs.txt
|
||||
- info on Novell Netware(tm) filesystem using NCP protocol.
|
||||
nfs41-server.txt
|
||||
- info on the Linux server implementation of NFSv4 minor version 1.
|
||||
nfs-rdma.txt
|
||||
- how to install and setup the Linux NFS/RDMA client and server software.
|
||||
nfsroot.txt
|
||||
- short guide on setting up a diskless box with NFS root filesystem.
|
||||
nilfs2.txt
|
||||
|
@ -109,27 +109,28 @@ prototypes:
|
||||
|
||||
 locking rules:
     All may block.
-                BKL   s_lock  s_umount
-alloc_inode:    no    no      no
-destroy_inode:  no
-dirty_inode:    no                      (must not sleep)
-write_inode:    no
-drop_inode:     no                      !!!inode_lock!!!
-delete_inode:   no
-put_super:      yes   yes     no
-write_super:    no    yes     read
-sync_fs:        no    no      read
-freeze_fs:      ?
-unfreeze_fs:    ?
-statfs:         no    no      no
-remount_fs:     yes   yes     maybe     (see below)
-clear_inode:    no
-umount_begin:   yes   no      no
-show_options:   no                      (vfsmount->sem)
-quota_read:     no    no      no        (see below)
-quota_write:    no    no      no        (see below)
+    None have BKL
+                s_umount
+alloc_inode:
+destroy_inode:
+dirty_inode:                (must not sleep)
+write_inode:
+drop_inode:                 !!!inode_lock!!!
+delete_inode:
+put_super:      write
+write_super:    read
+sync_fs:        read
+freeze_fs:      read
+unfreeze_fs:    read
+statfs:         no
+remount_fs:     maybe       (see below)
+clear_inode:
+umount_begin:   no
+show_options:   no          (namespace_sem)
+quota_read:     no          (see below)
+quota_write:    no          (see below)

-->remount_fs() will have the s_umount lock if it's already mounted.
+->remount_fs() will have the s_umount exclusive lock if it's already mounted.
 When called from get_sb_single, it does NOT have the s_umount lock.
 ->quota_read() and ->quota_write() functions are both guaranteed to
 be the only ones operating on the quota file by the quota code (via
@ -187,7 +188,7 @@ readpages: no
|
||||
write_begin: no locks the page yes
|
||||
write_end: no yes, unlocks yes
|
||||
perform_write: no n/a yes
|
||||
bmap: yes
|
||||
bmap: no
|
||||
invalidatepage: no yes
|
||||
releasepage: no yes
|
||||
direct_IO: no
|
||||
|
@ -322,7 +322,7 @@ an upper limit on the block size imposed by the page size of the kernel,
|
||||
so 8kB blocks are only allowed on Alpha systems (and other architectures
|
||||
which support larger pages).
|
||||
|
||||
There is an upper limit of 32768 subdirectories in a single directory.
|
||||
There is an upper limit of 32000 subdirectories in a single directory.
|
||||
|
||||
There is a "soft" upper limit of about 10-15k files in a single directory
|
||||
with the current linear linked-list directory implementation. This limit
|
||||
|
@ -235,6 +235,10 @@ minixdf Make 'df' act like Minix.
|
||||
|
||||
debug Extra debugging information is sent to syslog.
|
||||
|
||||
abort Simulate the effects of calling ext4_abort() for
|
||||
debugging purposes. This is normally used while
|
||||
remounting a filesystem which is already mounted.
|
||||
|
||||
errors=remount-ro Remount the filesystem read-only on an error.
|
||||
errors=continue Keep going on a filesystem error.
|
||||
errors=panic Panic and halt the machine if an error occurs.
|
||||
|
@ -23,8 +23,13 @@ Mount options unique to the isofs filesystem.
|
||||
map=off Do not map non-Rock Ridge filenames to lower case
|
||||
map=normal Map non-Rock Ridge filenames to lower case
|
||||
map=acorn As map=normal but also apply Acorn extensions if present
|
||||
mode=xxx Sets the permissions on files to xxx
|
||||
dmode=xxx Sets the permissions on directories to xxx
|
||||
mode=xxx Sets the permissions on files to xxx unless Rock Ridge
|
||||
extensions set the permissions otherwise
|
||||
dmode=xxx Sets the permissions on directories to xxx unless Rock Ridge
|
||||
extensions set the permissions otherwise
|
||||
overriderockperm Set permissions on files and directories according to
|
||||
'mode' and 'dmode' even though Rock Ridge extensions are
|
||||
present.
|
||||
nojoliet Ignore Joliet extensions if they are present.
|
||||
norock Ignore Rock Ridge extensions if they are present.
|
||||
hide Completely strip hidden files from the file system.
|
||||
|
@ -39,9 +39,8 @@ Features which NILFS2 does not support yet:
|
||||
- extended attributes
|
||||
- POSIX ACLs
|
||||
- quotas
|
||||
- writable snapshots
|
||||
- remote backup (CDP)
|
||||
- data integrity
|
||||
- fsck
|
||||
- resize
|
||||
- defragmentation
|
||||
|
||||
Mount options
|
||||
|
@ -5,11 +5,12 @@
|
||||
Bodo Bauer <bb@ricochet.net>
|
||||
|
||||
2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000
|
||||
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
|
||||
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
|
||||
------------------------------------------------------------------------------
|
||||
Version 1.3 Kernel version 2.2.12
|
||||
Kernel version 2.4.0-test11-pre4
|
||||
------------------------------------------------------------------------------
|
||||
fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009
|
||||
|
||||
Table of Contents
|
||||
-----------------
|
||||
@ -116,7 +117,7 @@ The link self points to the process reading the file system. Each process
|
||||
subdirectory has the entries listed in Table 1-1.
|
||||
|
||||
|
||||
Table 1-1: Process specific entries in /proc
|
||||
Table 1-1: Process specific entries in /proc
|
||||
..............................................................................
|
||||
File Content
|
||||
clear_refs Clears page referenced bits shown in smaps output
|
||||
@ -134,46 +135,103 @@ Table 1-1: Process specific entries in /proc
|
||||
status Process status in human readable form
|
||||
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
|
||||
stack Report full stack trace, enable via CONFIG_STACKTRACE
|
||||
smaps Extension based on maps, the rss size for each mapped file
|
||||
smaps an extension based on maps, showing the memory consumption of
|
||||
each mapping
|
||||
..............................................................................
|
||||
|
||||
For example, to get the status information of a process, all you have to do is
|
||||
read the file /proc/PID/status:
|
||||
|
||||
>cat /proc/self/status
|
||||
Name: cat
|
||||
State: R (running)
|
||||
Pid: 5452
|
||||
PPid: 743
|
||||
>cat /proc/self/status
|
||||
Name: cat
|
||||
State: R (running)
|
||||
Tgid: 5452
|
||||
Pid: 5452
|
||||
PPid: 743
|
||||
TracerPid: 0 (2.4)
|
||||
Uid: 501 501 501 501
|
||||
Gid: 100 100 100 100
|
||||
Groups: 100 14 16
|
||||
VmSize: 1112 kB
|
||||
VmLck: 0 kB
|
||||
VmRSS: 348 kB
|
||||
VmData: 24 kB
|
||||
VmStk: 12 kB
|
||||
VmExe: 8 kB
|
||||
VmLib: 1044 kB
|
||||
SigPnd: 0000000000000000
|
||||
SigBlk: 0000000000000000
|
||||
SigIgn: 0000000000000000
|
||||
SigCgt: 0000000000000000
|
||||
CapInh: 00000000fffffeff
|
||||
CapPrm: 0000000000000000
|
||||
CapEff: 0000000000000000
|
||||
|
||||
Uid: 501 501 501 501
|
||||
Gid: 100 100 100 100
|
||||
FDSize: 256
|
||||
Groups: 100 14 16
|
||||
VmPeak: 5004 kB
|
||||
VmSize: 5004 kB
|
||||
VmLck: 0 kB
|
||||
VmHWM: 476 kB
|
||||
VmRSS: 476 kB
|
||||
VmData: 156 kB
|
||||
VmStk: 88 kB
|
||||
VmExe: 68 kB
|
||||
VmLib: 1412 kB
|
||||
VmPTE: 20 kB
|
||||
Threads: 1
|
||||
SigQ: 0/28578
|
||||
SigPnd: 0000000000000000
|
||||
ShdPnd: 0000000000000000
|
||||
SigBlk: 0000000000000000
|
||||
SigIgn: 0000000000000000
|
||||
SigCgt: 0000000000000000
|
||||
CapInh: 00000000fffffeff
|
||||
CapPrm: 0000000000000000
|
||||
CapEff: 0000000000000000
|
||||
CapBnd: ffffffffffffffff
|
||||
voluntary_ctxt_switches: 0
|
||||
nonvoluntary_ctxt_switches: 1
|
||||
|
||||
This shows you nearly the same information you would get if you viewed it with
|
||||
the ps command. In fact, ps uses the proc file system to obtain its
|
||||
information. The statm file contains more detailed information about the
|
||||
process memory usage. Its seven fields are explained in Table 1-2. The stat
|
||||
file contains details information about the process itself. Its fields are
|
||||
explained in Table 1-3.
|
||||
information. But you get a more detailed view of the process by reading the
|
||||
file /proc/PID/status. Its fields are described in Table 1-2.
|
||||
|
||||
The statm file contains more detailed information about the process
|
||||
memory usage. Its seven fields are explained in Table 1-3. The stat file
|
||||
contains detailed information about the process itself. Its fields are
|
||||
explained in Table 1-4.
|
||||
|
||||
Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
|
||||
Table 1-2: Contents of the status files (as of 2.6.30-rc7)
|
||||
..............................................................................
 Field                       Content
 Name                        filename of the executable
 State                       state (R is running, S is sleeping, D is sleeping
                             in an uninterruptible wait, Z is zombie,
                             T is traced or stopped)
 Tgid                        thread group ID
 Pid                         process id
 PPid                        process id of the parent process
 TracerPid                   PID of process tracing this process (0 if not)
 Uid                         Real, effective, saved set, and file system UIDs
 Gid                         Real, effective, saved set, and file system GIDs
 FDSize                      number of file descriptor slots currently allocated
 Groups                      supplementary group list
 VmPeak                      peak virtual memory size
 VmSize                      total program size
 VmLck                       locked memory size
 VmHWM                       peak resident set size ("high water mark")
 VmRSS                       resident set size
 VmData                      size of data segment
 VmStk                       size of stack segment
 VmExe                       size of text segment
 VmLib                       size of shared library code
 VmPTE                       size of page table entries
 Threads                     number of threads
 SigQ                        number of signals queued/max. number for queue
 SigPnd                      bitmap of pending signals for the thread
 ShdPnd                      bitmap of shared pending signals for the process
 SigBlk                      bitmap of blocked signals
 SigIgn                      bitmap of ignored signals
 SigCgt                      bitmap of caught signals
 CapInh                      bitmap of inheritable capabilities
 CapPrm                      bitmap of permitted capabilities
 CapEff                      bitmap of effective capabilities
 CapBnd                      bitmap of capabilities bounding set
 Cpus_allowed                mask of CPUs on which this process may run
 Cpus_allowed_list           Same as previous, but in "list format"
 Mems_allowed                mask of memory nodes allowed to this process
 Mems_allowed_list           Same as previous, but in "list format"
 voluntary_ctxt_switches     number of voluntary context switches
 nonvoluntary_ctxt_switches  number of non voluntary context switches
..............................................................................
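Because every line of /proc/PID/status has the form "Name: value", the file is easy to read programmatically. The sketch below is one possible way to do it in Python; the helper names parse_status and size_in_kb are made up for this example and are not part of any kernel or library interface.

```python
def parse_status(text: str) -> dict:
    """Turn the text of /proc/PID/status into a {field: value} dictionary."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue                  # skip anything that is not "Name: value"
        fields[name.strip()] = value.strip()
    return fields


def size_in_kb(fields: dict, key: str) -> int:
    """Return the number in front of the unit for a Vm* field, e.g. "476 kB"."""
    return int(fields[key].split()[0])
```

On a Linux system the input would typically come from open("/proc/self/status").read(); multi-valued fields such as Uid and Gid are left as strings so that the caller can split them as needed.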
|
||||
|
||||
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
|
||||
..............................................................................
|
||||
Field Content
|
||||
size total program size (pages) (same as VmSize in status)
|
||||
@ -188,7 +246,7 @@ Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
|
||||
..............................................................................
|
||||
|
||||
|
||||
Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
|
||||
..............................................................................
|
||||
Field Content
|
||||
pid process id
|
||||
@ -222,10 +280,10 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
start_stack address of the start of the stack
|
||||
esp current value of ESP
|
||||
eip current value of EIP
|
||||
pending bitmap of pending signals (obsolete)
|
||||
blocked bitmap of blocked signals (obsolete)
|
||||
sigign bitmap of ignored signals (obsolete)
|
||||
sigcatch bitmap of catched signals (obsolete)
|
||||
pending bitmap of pending signals
|
||||
blocked bitmap of blocked signals
|
||||
sigign bitmap of ignored signals
|
||||
sigcatch bitmap of caught signals
|
||||
wchan address where process went to sleep
|
||||
0 (place holder)
|
||||
0 (place holder)
|
||||
@ -234,19 +292,99 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
|
||||
rt_priority realtime priority
|
||||
policy scheduling policy (man sched_setscheduler)
|
||||
blkio_ticks time spent waiting for block IO
|
||||
gtime guest time of the task in jiffies
|
||||
cgtime guest time of the task children in jiffies
|
||||
..............................................................................
|
||||
|
||||
The /proc/PID/maps file contains the currently mapped memory regions and
|
||||
their access permissions.
|
||||
|
||||
The format is:
|
||||
|
||||
address perms offset dev inode pathname
|
||||
|
||||
08048000-08049000 r-xp 00000000 03:00 8312 /opt/test
|
||||
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
|
||||
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
|
||||
a7cb1000-a7cb2000 ---p 00000000 00:00 0
|
||||
a7cb2000-a7eb2000 rw-p 00000000 00:00 0
|
||||
a7eb2000-a7eb3000 ---p 00000000 00:00 0
|
||||
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
|
||||
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
|
||||
a8008000-a800a000 r--p 00133000 03:00 4222 /lib/libc.so.6
|
||||
a800a000-a800b000 rw-p 00135000 03:00 4222 /lib/libc.so.6
|
||||
a800b000-a800e000 rw-p 00000000 00:00 0
|
||||
a800e000-a8022000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
|
||||
a8022000-a8023000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
|
||||
a8023000-a8024000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
|
||||
a8024000-a8027000 rw-p 00000000 00:00 0
|
||||
a8027000-a8043000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
|
||||
a8043000-a8044000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
|
||||
a8044000-a8045000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
|
||||
aff35000-aff4a000 rw-p 00000000 00:00 0 [stack]
|
||||
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
|
||||
|
||||
where "address" is the address space in the process that it occupies, "perms"
|
||||
is a set of permissions:
|
||||
|
||||
r = read
|
||||
w = write
|
||||
x = execute
|
||||
s = shared
|
||||
p = private (copy on write)
|
||||
|
||||
"offset" is the offset into the mapping, "dev" is the device (major:minor), and
|
||||
"inode" is the inode on that device. 0 indicates that no inode is associated
|
||||
with the memory region, as the case would be with BSS (uninitialized data).
|
||||
The "pathname" shows the name of the file associated with this mapping. If the mapping
|
||||
is not associated with a file:
|
||||
|
||||
[heap] = the heap of the program
|
||||
[stack] = the stack of the main process
|
||||
[vdso] = the "virtual dynamic shared object",
|
||||
the kernel system call handler
|
||||
|
||||
or if empty, the mapping is anonymous.
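The line format described above can be split into its six columns mechanically. The following Python sketch shows one way to do so; the Mapping type and the parse_maps_line helper are names invented for this example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int      # first address of the region
    end: int        # first address past the region
    perms: str      # e.g. "r-xp"
    offset: int     # offset into the mapped file
    dev: str        # device as "major:minor"
    inode: int      # 0 when no inode is associated with the region
    pathname: str   # "" for an anonymous mapping


def parse_maps_line(line: str) -> Mapping:
    """Parse one line of /proc/PID/maps."""
    # Split at most five times so that a pathname containing spaces
    # stays in one piece.
    parts = line.split(None, 5)
    address, perms, offset, dev, inode = parts[:5]
    pathname = parts[5].strip() if len(parts) > 5 else ""
    start, end = (int(x, 16) for x in address.split("-"))
    return Mapping(start, end, perms, int(offset, 16), dev, int(inode), pathname)
```

The addresses and the offset are hexadecimal in the file, while the inode is decimal, which is why the two are converted differently.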
|
||||
|
||||
|
||||
The /proc/PID/smaps is an extension based on maps, showing the memory
|
||||
consumption for each of the process's mappings. For each mapping there
|
||||
is a series of lines such as the following:
|
||||
|
||||
08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
|
||||
Size: 1084 kB
|
||||
Rss: 892 kB
|
||||
Pss: 374 kB
|
||||
Shared_Clean: 892 kB
|
||||
Shared_Dirty: 0 kB
|
||||
Private_Clean: 0 kB
|
||||
Private_Dirty: 0 kB
|
||||
Referenced: 892 kB
|
||||
Swap: 0 kB
|
||||
KernelPageSize: 4 kB
|
||||
MMUPageSize: 4 kB
|
||||
|
||||
The first of these lines shows the same information as is displayed for the
|
||||
mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
|
||||
the amount of the mapping that is currently resident in RAM, the "proportional
|
||||
set size" (divide each shared page by the number of processes sharing it), the
|
||||
number of clean and dirty shared pages in the mapping, and the number of clean
|
||||
and dirty private pages in the mapping. The "Referenced" indicates the amount
|
||||
of memory currently marked as referenced or accessed.
|
||||
|
||||
This file is only present if the CONFIG_MMU kernel configuration option is
|
||||
enabled.
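A common use of smaps is to add up one field over all mappings, for example Pss to estimate a process's share of physical memory. The Python sketch below does this for a block of smaps text; the function name sum_smaps_field is hypothetical, and the sample text simply repeats the /bin/bash mapping shown above.

```python
SMAPS_SAMPLE = """\
08048000-080bc000 r-xp 00000000 03:02 13130      /bin/bash
Size:               1084 kB
Rss:                 892 kB
Pss:                 374 kB
Shared_Clean:        892 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
Referenced:          892 kB
Swap:                  0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
"""


def sum_smaps_field(text: str, field: str) -> int:
    """Add up one smaps field (in kB) over every mapping in the text."""
    prefix = field + ":"
    total = 0
    for line in text.splitlines():
        # Matching on "Field:" at the start of the line keeps "Size"
        # from also matching "KernelPageSize".
        if line.startswith(prefix):
            total += int(line.split()[1])
    return total
```

For a live process the text would come from /proc/PID/smaps, which requires a kernel built with CONFIG_MMU as noted above.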
|
||||
|
||||
1.2 Kernel data
|
||||
---------------
|
||||
|
||||
Similar to the process entries, the kernel data files give information about
|
||||
the running kernel. The files used to obtain this information are contained in
|
||||
/proc and are listed in Table 1-4. Not all of these will be present in your
|
||||
/proc and are listed in Table 1-5. Not all of these will be present in your
|
||||
system. It depends on the kernel configuration and the loaded modules, which
|
||||
files are there, and which are missing.
|
||||
|
||||
Table 1-4: Kernel info in /proc
|
||||
Table 1-5: Kernel info in /proc
|
||||
..............................................................................
|
||||
File Content
|
||||
apm Advanced power management info
|
||||
@ -283,6 +421,7 @@ Table 1-4: Kernel info in /proc
|
||||
rtc Real time clock
|
||||
scsi SCSI info (see text)
|
||||
slabinfo Slab pool info
|
||||
softirqs softirq usage
|
||||
stat Overall statistics
|
||||
swaps Swap space utilization
|
||||
sys See chapter 2
|
||||
@ -597,6 +736,25 @@ on the kind of area :
|
||||
0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ...
|
||||
pages=10 vmalloc N0=10
|
||||
|
||||
..............................................................................
|
||||
|
||||
softirqs:
|
||||
|
||||
Provides counts of softirq handlers serviced since boot time, for each cpu.
|
||||
|
||||
> cat /proc/softirqs
|
||||
CPU0 CPU1 CPU2 CPU3
|
||||
HI: 0 0 0 0
|
||||
TIMER: 27166 27120 27097 27034
|
||||
NET_TX: 0 0 0 17
|
||||
NET_RX: 42 0 0 39
|
||||
BLOCK: 0 0 107 1121
|
||||
TASKLET: 0 0 0 290
|
||||
SCHED: 27035 26983 26971 26746
|
||||
HRTIMER: 0 0 0 0
|
||||
RCU: 1678 1769 2178 2250
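The output is a small table with one column per CPU and one row per softirq. As an illustration, the Python sketch below converts that text into nested dictionaries; parse_softirqs is a name chosen for this example, and the sample reuses the numbers printed above.

```python
SOFTIRQS_SAMPLE = """\
                CPU0       CPU1       CPU2       CPU3
      HI:          0          0          0          0
   TIMER:      27166      27120      27097      27034
  NET_TX:          0          0          0         17
  NET_RX:         42          0          0         39
   BLOCK:          0          0        107       1121
 TASKLET:          0          0          0        290
   SCHED:      27035      26983      26971      26746
 HRTIMER:          0          0          0          0
     RCU:       1678       1769       2178       2250
"""


def parse_softirqs(text: str) -> dict:
    """Return {softirq name: {cpu name: count}} from /proc/softirqs text."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()               # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = [int(value) for value in rest.split()]
        table[name.strip()] = dict(zip(cpus, counts))
    return table
```

Summing a row, for example sum(table["BLOCK"].values()), gives the system-wide count for that softirq.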
|
||||
|
||||
|
||||
1.3 IDE devices in /proc/ide
|
||||
----------------------------
|
||||
|
||||
@ -614,10 +772,10 @@ IDE devices:
|
||||
|
||||
More detailed information can be found in the controller specific
|
||||
subdirectories. These are named ide0, ide1 and so on. Each of these
|
||||
directories contains the files shown in table 1-5.
|
||||
directories contains the files shown in table 1-6.
|
||||
|
||||
|
||||
Table 1-5: IDE controller info in /proc/ide/ide?
|
||||
Table 1-6: IDE controller info in /proc/ide/ide?
|
||||
..............................................................................
|
||||
File Content
|
||||
channel IDE channel (0 or 1)
|
||||
@ -627,11 +785,11 @@ Table 1-5: IDE controller info in /proc/ide/ide?
|
||||
..............................................................................
|
||||
|
||||
Each device connected to a controller has a separate subdirectory in the
|
||||
controllers directory. The files listed in table 1-6 are contained in these
|
||||
controllers directory. The files listed in table 1-7 are contained in these
|
||||
directories.
|
||||
|
||||
|
||||
Table 1-6: IDE device information
|
||||
Table 1-7: IDE device information
|
||||
..............................................................................
|
||||
File Content
|
||||
cache The cache
|
||||
@ -673,12 +831,12 @@ the drive parameters:
|
||||
1.4 Networking info in /proc/net
|
||||
--------------------------------
|
||||
|
||||
The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the
|
||||
The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the
|
||||
additional values you get for IP version 6 if you configure the kernel to
|
||||
support this. Table 1-7 lists the files and their meaning.
|
||||
support this. Table 1-9 lists the files and their meaning.
|
||||
|
||||
|
||||
Table 1-6: IPv6 info in /proc/net
|
||||
Table 1-8: IPv6 info in /proc/net
|
||||
..............................................................................
|
||||
File Content
|
||||
udp6 UDP sockets (IPv6)
|
||||
@ -693,7 +851,7 @@ Table 1-6: IPv6 info in /proc/net
|
||||
..............................................................................
|
||||
|
||||
|
||||
Table 1-7: Network info in /proc/net
|
||||
Table 1-9: Network info in /proc/net
|
||||
..............................................................................
|
||||
File Content
|
||||
arp Kernel ARP table
|
||||
@ -817,10 +975,10 @@ The directory /proc/parport contains information about the parallel ports of
|
||||
your system. It has one subdirectory for each port, named after the port
|
||||
number (0,1,2,...).
|
||||
|
||||
These directories contain the four files shown in Table 1-8.
|
||||
These directories contain the four files shown in Table 1-10.
|
||||
|
||||
|
||||
Table 1-8: Files in /proc/parport
|
||||
Table 1-10: Files in /proc/parport
|
||||
..............................................................................
|
||||
File Content
|
||||
autoprobe Any IEEE-1284 device ID information that has been acquired.
|
||||
@ -838,10 +996,10 @@ Table 1-8: Files in /proc/parport
|
||||
|
||||
Information about the available and actually used tty's can be found in the
|
||||
directory /proc/tty. You'll find entries for drivers and line disciplines in
|
||||
this directory, as shown in Table 1-9.
|
||||
this directory, as shown in Table 1-11.
|
||||
|
||||
|
||||
Table 1-9: Files in /proc/tty
|
||||
Table 1-11: Files in /proc/tty
|
||||
..............................................................................
|
||||
File Content
|
||||
drivers list of drivers and their usage
|
||||
@ -883,6 +1041,7 @@ since the system first booted. For a quick look, simply cat the file:
|
||||
processes 2915
|
||||
procs_running 1
|
||||
procs_blocked 0
|
||||
softirq 183433 0 21755 12 39 1137 231 21459 2263
|
||||
|
||||
The very first "cpu" line aggregates the numbers in all of the other "cpuN"
|
||||
lines. These numbers identify the amount of time the CPU has spent performing
|
||||
@ -918,6 +1077,11 @@ CPUs.
|
||||
The "procs_blocked" line gives the number of processes currently blocked,
|
||||
waiting for I/O to complete.
|
||||
|
||||
The "softirq" line gives counts of softirqs serviced since boot time, for each
|
||||
of the possible system softirqs. The first column is the total of all
|
||||
softirqs serviced; each subsequent column is the total for that particular
|
||||
softirq.
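To make the column layout concrete, the Python sketch below extracts the "softirq" line from /proc/stat text and separates the leading total from the per-softirq counts. The function name parse_stat_softirq is an assumption for this example; the columns are deliberately left unnamed because their number depends on the kernel.

```python
def parse_stat_softirq(stat_text: str):
    """Return (total, [count per softirq]) from /proc/stat text, or None."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "softirq":
            # First number is the total, the rest are per-softirq counts.
            total, *per_softirq = (int(value) for value in fields[1:])
            return total, per_softirq
    return None                           # no "softirq" line present
```

The per-softirq breakdown by CPU is available separately in /proc/softirqs.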
|
||||
|
||||
|
||||
1.9 Ext4 file system parameters
|
||||
------------------------------
|
||||
@ -926,9 +1090,9 @@ Information about mounted ext4 file systems can be found in
|
||||
/proc/fs/ext4. Each mounted filesystem will have a directory in
|
||||
/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
|
||||
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
|
||||
in Table 1-10, below.
|
||||
in Table 1-12, below.
|
||||
|
||||
Table 1-10: Files in /proc/fs/ext4/<devname>
|
||||
Table 1-12: Files in /proc/fs/ext4/<devname>
|
||||
..............................................................................
|
||||
File Content
|
||||
mb_groups details of multiblock allocator buddy cache of free blocks
|
||||
@ -1003,11 +1167,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS
|
||||
3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
|
||||
------------------------------------------------------
|
||||
|
||||
This file can be used to adjust the score used to select which processes
|
||||
should be killed in an out-of-memory situation. Giving it a high score will
|
||||
increase the likelihood of this process being killed by the oom-killer. Valid
|
||||
values are in the range -16 to +15, plus the special value -17, which disables
|
||||
oom-killing altogether for this process.
|
||||
This file can be used to adjust the score used to select which processes should
|
||||
be killed in an out-of-memory situation. The oom_adj value is a characteristic
|
||||
of the task's mm, so all threads that share an mm with pid will have the same
|
||||
oom_adj value. A high value will increase the likelihood of this process being
|
||||
killed by the oom-killer. Valid values are in the range -16 to +15 as
|
||||
explained below and a special value of -17, which disables oom-killing
|
||||
altogether for threads sharing pid's mm.
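The value ranges described above can be checked before writing to the file. The Python sketch below is a minimal validator under that description; the constant and function names are chosen for illustration, and the sketch does not write to /proc itself.

```python
OOM_DISABLE = -17      # special value: never pick this mm for oom-killing
OOM_ADJUST_MIN = -16   # lowest ordinary adjustment
OOM_ADJUST_MAX = 15    # highest ordinary adjustment


def classify_oom_adj(value: int) -> str:
    """Classify a candidate oom_adj value, raising ValueError if invalid."""
    if value == OOM_DISABLE:
        return "disabled"
    if OOM_ADJUST_MIN <= value <= OOM_ADJUST_MAX:
        return "adjusted"
    raise ValueError("oom_adj must be -17 or in the range -16..15, got %d" % value)
```

A caller would normally validate first and then write the decimal value to /proc/<pid>/oom_adj, keeping in mind that the setting applies to every thread sharing that mm.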
|
||||
|
||||
The process to be killed in an out-of-memory situation is selected among all others
|
||||
based on its badness score. This value equals the original memory size of the process
|
||||
@ -1021,6 +1187,9 @@ the parent's score if they do not share the same memory. Thus forking servers
|
||||
are the prime candidates to be killed. Having only one 'hungry' child will make
|
||||
parent less preferable than the child.
|
||||
|
||||
/proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
|
||||
oom-killing already.
|
||||
|
||||
/proc/<pid>/oom_score shows process' current badness score.
|
||||
|
||||
The following heuristics are then applied:
|
||||
|
@ -132,6 +132,11 @@ rodir -- FAT has the ATTR_RO (read-only) attribute. On Windows,
|
||||
If you want to use ATTR_RO as read-only flag even for
|
||||
the directory, set this option.
|
||||
|
||||
errors=panic|continue|remount-ro
|
||||
-- specify FAT behavior on critical errors: panic, continue
|
||||
without doing anything or remount the partition in
|
||||
read-only mode (default behavior).
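
The following sketch shows one way a script might build the errors= option
before mounting. The helper name fat_errors_option is hypothetical, and the
device and mount point in the usage comment are placeholders.

```shell
#!/bin/bash
# Illustration only: fat_errors_option is a made-up helper name.
# Prints an "errors=<mode>" mount option; with no argument it prints the
# documented default (remount-ro). Unknown modes are rejected.
fat_errors_option() {
    local mode=${1:-remount-ro}
    case $mode in
        panic|continue|remount-ro)
            printf 'errors=%s\n' "$mode"
            ;;
        *)
            echo "unknown errors= mode: $mode" >&2
            return 1
            ;;
    esac
}

# Usage (placeholder device and mount point):
#   mount -t vfat -o "$(fat_errors_option panic)" /dev/sdb1 /mnt
```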
|
||||
|
||||
<bool>: 0,1,yes,no,true,false
|
||||
|
||||
TODO
|
||||
|
@ -77,7 +77,8 @@
|
||||
seconds for the whole load operation.
|
||||
|
||||
- request_firmware_nowait() is also provided for convenience in
|
||||
non-user contexts.
|
||||
user contexts to request firmware asynchronously, but can't be called
|
||||
in atomic contexts.
|
||||
|
||||
|
||||
about in-kernel persistence:
|
||||
|
246
Documentation/gcov.txt
Normal file
@ -0,0 +1,246 @@
|
||||
Using gcov with the Linux kernel
|
||||
================================
|
||||
|
||||
1. Introduction
|
||||
2. Preparation
|
||||
3. Customization
|
||||
4. Files
|
||||
5. Modules
|
||||
6. Separated build and test machines
|
||||
7. Troubleshooting
|
||||
Appendix A: sample script: gather_on_build.sh
|
||||
Appendix B: sample script: gather_on_test.sh
|
||||
|
||||
|
||||
1. Introduction
|
||||
===============
|
||||
|
||||
gcov profiling kernel support enables the use of GCC's coverage testing
|
||||
tool gcov [1] with the Linux kernel. Coverage data of a running kernel
|
||||
is exported in gcov-compatible format via the "gcov" debugfs directory.
|
||||
To get coverage data for a specific file, change to the kernel build
|
||||
directory and use gcov with the -o option as follows (requires root):
|
||||
|
||||
# cd /tmp/linux-out
|
||||
# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
|
||||
|
||||
This will create source code files annotated with execution counts
|
||||
in the current directory. In addition, graphical gcov front-ends such
|
||||
as lcov [2] can be used to automate the process of collecting data
|
||||
for the entire kernel and provide coverage overviews in HTML format.
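
In the gcov example above, the -o argument is the debugfs gcov directory
followed by the absolute path of the object directory. A small sketch of a
helper that builds this path is shown below; gcov_objdir and GCOV_ROOT are
hypothetical names used only for illustration.

```shell
#!/bin/bash
# Illustration only: gcov_objdir and GCOV_ROOT are assumed names.
GCOV_ROOT=${GCOV_ROOT:-/sys/kernel/debug/gcov}

# gcov_objdir BUILD_DIR SUBDIR
# Prints the directory to pass to "gcov -o" for objects built in
# BUILD_DIR/SUBDIR.
gcov_objdir() {
    local build_dir=${1#/} subdir=${2#/}
    printf '%s/%s/%s\n' "$GCOV_ROOT" "${build_dir%/}" "${subdir%/}"
}

# Usage (requires root and a mounted debugfs):
#   cd /tmp/linux-out
#   gcov -o "$(gcov_objdir /tmp/linux-out kernel)" spinlock.c
```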
|
||||
|
||||
Possible uses:
|
||||
|
||||
* debugging (has this line been reached at all?)
|
||||
* test improvement (how do I change my test to cover these lines?)
|
||||
* minimizing kernel configurations (do I need this option if the
|
||||
associated code is never run?)
|
||||
|
||||
--
|
||||
|
||||
[1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html
|
||||
[2] http://ltp.sourceforge.net/coverage/lcov.php
|
||||
|
||||
|
||||
2. Preparation
|
||||
==============
|
||||
|
||||
Configure the kernel with:
|
||||
|
||||
CONFIG_DEBUG_FS=y
|
||||
CONFIG_GCOV_KERNEL=y
|
||||
|
||||
and to get coverage data for the entire kernel:
|
||||
|
||||
CONFIG_GCOV_PROFILE_ALL=y
|
||||
|
||||
Note that kernels compiled with profiling flags will be significantly
|
||||
larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported
|
||||
on all architectures.
|
||||
|
||||
Profiling data will only become accessible once debugfs has been
|
||||
mounted:
|
||||
|
||||
mount -t debugfs none /sys/kernel/debug
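
A script that collects coverage data may want to mount debugfs only when it
is not mounted yet. The sketch below checks the mount table first; the
function names and the MOUNTS_FILE variable are assumptions made for this
example.

```shell
#!/bin/bash
# Illustration only: function names and MOUNTS_FILE are assumptions.
MOUNTS_FILE=${MOUNTS_FILE:-/proc/mounts}

# debugfs_mounted MOUNTPOINT
# Succeeds if the mount table lists a debugfs instance at MOUNTPOINT.
# Mount table fields: device, mount point, filesystem type, ...
debugfs_mounted() {
    awk -v mp="$1" '
        $2 == mp && $3 == "debugfs" { found = 1 }
        END { exit !found }
    ' "$MOUNTS_FILE"
}

# ensure_debugfs MOUNTPOINT
# Mounts debugfs at MOUNTPOINT unless it is already there (needs root).
ensure_debugfs() {
    debugfs_mounted "$1" || mount -t debugfs none "$1"
}

# Usage:
#   ensure_debugfs /sys/kernel/debug
```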
|
||||
|
||||
|
||||
3. Customization
|
||||
================
|
||||
|
||||
To enable profiling for specific files or directories, add a line
|
||||
similar to the following to the respective kernel Makefile:
|
||||
|
||||
For a single file (e.g. main.o):
|
||||
GCOV_PROFILE_main.o := y
|
||||
|
||||
For all files in one directory:
|
||||
GCOV_PROFILE := y
|
||||
|
||||
To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL
|
||||
is specified, use:
|
||||
|
||||
GCOV_PROFILE_main.o := n
|
||||
and:
|
||||
GCOV_PROFILE := n
|
||||
|
||||
Only files which are linked to the main kernel image or are compiled as
|
||||
kernel modules are supported by this mechanism.
|
||||
|
||||
|
||||
4. Files
|
||||
========
|
||||
|
||||
The gcov kernel support creates the following files in debugfs:
|
||||
|
||||
/sys/kernel/debug/gcov
|
||||
Parent directory for all gcov-related files.
|
||||
|
||||
/sys/kernel/debug/gcov/reset
|
||||
Global reset file: resets all coverage data to zero when
|
||||
written to.
|
||||
|
||||
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda
|
||||
The actual gcov data file as understood by the gcov
|
||||
tool. Resets file coverage data to zero when written to.
|
||||
|
||||
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno
|
||||
Symbolic link to a static data file required by the gcov
|
||||
tool. This file is generated by gcc when compiling with
|
||||
option -ftest-coverage.
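
As a sketch of how the reset files might be used before a test run, the
helpers below write to the global reset file or to a single data file.
The function names and the GCOV_ROOT variable are hypothetical; the value
written is arbitrary, since any write triggers the reset.

```shell
#!/bin/bash
# Illustration only: helper names and GCOV_ROOT are assumptions.
GCOV_ROOT=${GCOV_ROOT:-/sys/kernel/debug/gcov}

# Reset coverage data for the whole kernel.
gcov_reset_all() {
    if [ ! -d "$GCOV_ROOT" ]; then
        echo "$GCOV_ROOT not found (is debugfs mounted?)" >&2
        return 1
    fi
    printf '0\n' > "$GCOV_ROOT/reset"
}

# gcov_reset_file /path/to/compile/dir/file.gcda
# Reset coverage data for one object file only.
gcov_reset_file() {
    printf '0\n' > "$GCOV_ROOT/${1#/}"
}

# Usage (requires root):
#   gcov_reset_all
#   gcov_reset_file /tmp/linux-out/kernel/spinlock.gcda
```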
|
||||
|
||||
|
||||
5. Modules
|
||||
==========
|
||||
|
||||
Kernel modules may contain cleanup code which is only run during
|
||||
module unload time. The gcov mechanism provides a means to collect
|
||||
coverage data for such code by keeping a copy of the data associated
|
||||
with the unloaded module. This data remains available through debugfs.
|
||||
Once the module is loaded again, the associated coverage counters are
|
||||
initialized with the data from its previous instantiation.
|
||||
|
||||
This behavior can be deactivated by specifying the gcov_persist kernel
|
||||
parameter:
|
||||
|
||||
gcov_persist=0
|
||||
|
||||
At run-time, a user can also choose to discard data for an unloaded
|
||||
module by writing to its data file or the global reset file.
|
||||
|
||||
|
||||
6. Separated build and test machines
|
||||
====================================
|
||||
|
||||
The gcov kernel profiling infrastructure is designed to work out-of-the
|
||||
box for setups where kernels are built and run on the same machine. In
|
||||
cases where the kernel runs on a separate machine, special preparations
|
||||
must be made, depending on where the gcov tool is used:
|
||||
|
||||
a) gcov is run on the TEST machine
|
||||
|
||||
The gcov tool version on the test machine must be compatible with the
|
||||
gcc version used for kernel build. Also the following files need to be
|
||||
copied from build to test machine:
|
||||
|
||||
from the source tree:
|
||||
- all C source files + headers
|
||||
|
||||
from the build tree:
|
||||
- all C source files + headers
|
||||
- all .gcda and .gcno files
|
||||
- all links to directories
|
||||
|
||||
It is important to note that these files need to be placed into the
|
||||
exact same file system location on the test machine as on the build
|
||||
machine. If any of the path components is a symbolic link, the actual
|
||||
directory needs to be used instead (due to make's CURDIR handling).
|
||||
|
||||
b) gcov is run on the BUILD machine
|
||||
|
||||
The following files need to be copied after each test case from test
|
||||
to build machine:
|
||||
|
||||
from the gcov directory in debugfs:
|
||||
- all .gcda files
|
||||
- all links to .gcno files
|
||||
|
||||
These files can be copied to any location on the build machine. gcov
|
||||
must then be called with the -o option pointing to that directory.
|
||||
|
||||
Example directory setup on the build machine:
|
||||
|
||||
/tmp/linux: kernel source tree
|
||||
/tmp/out: kernel build directory as specified by make O=
|
||||
/tmp/coverage: location of the files copied from the test machine
|
||||
|
||||
[user@build] cd /tmp/out
|
||||
[user@build] gcov -o /tmp/coverage/tmp/out/init main.c
|
||||
|
||||
|
||||
7. Troubleshooting
|
||||
==================
|
||||
|
||||
Problem: Compilation aborts during linker step.
|
||||
Cause: Profiling flags are specified for source files which are not
|
||||
linked to the main kernel or which are linked by a custom
|
||||
linker procedure.
|
||||
Solution: Exclude affected source files from profiling by specifying
|
||||
GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the
|
||||
corresponding Makefile.
|
||||
|
||||
|
||||
Appendix A: gather_on_build.sh
|
||||
==============================
|
||||
|
||||
Sample script to gather coverage meta files on the build machine
|
||||
(see 6a):
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
KSRC=$1
|
||||
KOBJ=$2
|
||||
DEST=$3
|
||||
|
||||
if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then
|
||||
echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
|
||||
KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
|
||||
|
||||
find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \
|
||||
-perm /u+r,g+r | tar cfz $DEST -P -T -
|
||||
|
||||
if [ $? -eq 0 ] ; then
|
||||
echo "$DEST successfully created, copy to test system and unpack with:"
|
||||
echo " tar xfz $DEST -P"
|
||||
else
|
||||
echo "Could not create file $DEST"
|
||||
fi
|
||||
|
||||
|
||||
Appendix B: gather_on_test.sh
|
||||
=============================
|
||||
|
||||
Sample script to gather coverage data files on the test machine
|
||||
(see 6b):
|
||||
|
||||
#!/bin/bash
|
||||
|
||||
DEST=$1
|
||||
GCDA=/sys/kernel/debug/gcov
|
||||
|
||||
if [ -z "$DEST" ] ; then
|
||||
echo "Usage: $0 <output.tar.gz>" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
find $GCDA -name '*.gcno' -o -name '*.gcda' | tar cfz $DEST -T -
|
||||
|
||||
if [ $? -eq 0 ] ; then
|
||||
echo "$DEST successfully created, copy to build system and unpack with:"
|
||||
echo " tar xfz $DEST"
|
||||
else
|
||||
echo "Could not create file $DEST"
|
||||
fi
|
@ -2,14 +2,18 @@ Kernel driver f71882fg
|
||||
======================
|
||||
|
||||
Supported chips:
|
||||
* Fintek F71882FG and F71883FG
|
||||
Prefix: 'f71882fg'
|
||||
* Fintek F71858FG
|
||||
Prefix: 'f71858fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F71862FG and F71863FG
|
||||
Prefix: 'f71862fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F71882FG and F71883FG
|
||||
Prefix: 'f71882fg'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
Datasheet: Available from the Fintek website
|
||||
* Fintek F8000
|
||||
Prefix: 'f8000'
|
||||
Addresses scanned: none, address read from Super I/O config space
|
||||
@ -66,13 +70,13 @@ printed when loading the driver.
|
||||
|
||||
Three different fan control modes are supported; the mode number is written
|
||||
to the pwm#_enable file. Note that not all modes are supported on all
|
||||
chips, and some modes may only be available in RPM / PWM mode on the F8000.
|
||||
chips, and some modes may only be available in RPM / PWM mode.
|
||||
Writing an unsupported mode will result in an invalid parameter error.
|
||||
|
||||
* 1: Manual mode
|
||||
You ask for a specific PWM duty cycle / DC voltage or a specific % of
|
||||
fan#_full_speed by writing to the pwm# file. This mode is only
|
||||
available on the F8000 if the fan channel is in RPM mode.
|
||||
available on the F71858FG / F8000 if the fan channel is in RPM mode.
|
||||
|
||||
* 2: Normal auto mode
|
||||
You can define a number of temperature/fan speed trip points, which % the
|
||||
|
@ -7,7 +7,7 @@ henceforth as AEM.
|
||||
Supported systems:
|
||||
* Any recent IBM System X server with AEM support.
|
||||
This includes the x3350, x3550, x3650, x3655, x3755, x3850 M2,
|
||||
x3950 M2, and certain HS2x/LS2x/QS2x blades. The IPMI host interface
|
||||
x3950 M2, and certain HC10/HS2x/LS2x/QS2x blades. The IPMI host interface
|
||||
driver ("ipmi-si") needs to be loaded for this driver to do anything.
|
||||
Prefix: 'ibmaem'
|
||||
Datasheet: Not available
|
||||
|
@ -70,6 +70,7 @@ are interpreted as 0! For more on how written strings are interpreted see the
|
||||
[0-*] denotes any positive number starting from 0
|
||||
[1-*] denotes any positive number starting from 1
|
||||
RO read only value
|
||||
WO write only value
|
||||
RW read/write value
|
||||
|
||||
Read/write values may be read-only for some chips, depending on the
|
||||
@ -295,6 +296,24 @@ temp[1-*]_label Suggested temperature channel label.
|
||||
user-space.
|
||||
RO
|
||||
|
||||
temp[1-*]_lowest
|
||||
Historical minimum temperature
|
||||
Unit: millidegree Celsius
|
||||
RO
|
||||
|
||||
temp[1-*]_highest
|
||||
Historical maximum temperature
|
||||
Unit: millidegree Celsius
|
||||
RO
|
||||
|
||||
temp[1-*]_reset_history
|
||||
Reset temp_lowest and temp_highest
|
||||
WO
|
||||
|
||||
temp_reset_history
|
||||
Reset temp_lowest and temp_highest for all sensors
|
||||
WO
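
The history attributes above report millidegrees Celsius. The following
sketch reads them and prints degrees; the function names are hypothetical,
and the hwmon directory in the usage comment is a placeholder that differs
between systems.

```shell
#!/bin/bash
# Illustration only: function names are assumptions.

# millideg_to_celsius VALUE
# Converts an integer millidegree value to a decimal degree string.
millideg_to_celsius() {
    local m=$1 sign=
    if [ "$m" -lt 0 ]; then
        sign=-
        m=$(( -m ))
    fi
    printf '%s%d.%03d\n' "$sign" $(( m / 1000 )) $(( m % 1000 ))
}

# show_temp_history DIR N
# Prints the historical minimum and maximum of temperature channel N.
show_temp_history() {
    local dir=$1 n=$2 lo hi
    lo=$(cat "$dir/temp${n}_lowest") || return 1
    hi=$(cat "$dir/temp${n}_highest") || return 1
    printf 'temp%s: lowest %s C, highest %s C\n' "$n" \
        "$(millideg_to_celsius "$lo")" "$(millideg_to_celsius "$hi")"
}

# Usage (placeholder directory):
#   show_temp_history /sys/class/hwmon/hwmon0/device 1
```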
|
||||
|
||||
Some chips measure temperature using external thermistors and an ADC, and
|
||||
report the temperature measurement as a voltage. Converting this voltage
|
||||
back to a temperature (or the other way around for limits) requires
|
||||
|
42
Documentation/hwmon/tmp401
Normal file
@ -0,0 +1,42 @@
|
||||
Kernel driver tmp401
|
||||
====================
|
||||
|
||||
Supported chips:
|
||||
* Texas Instruments TMP401
|
||||
Prefix: 'tmp401'
|
||||
Addresses scanned: I2C 0x4c
|
||||
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html
|
||||
* Texas Instruments TMP411
|
||||
Prefix: 'tmp411'
|
||||
Addresses scanned: I2C 0x4c
|
||||
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html
|
||||
|
||||
Authors:
|
||||
Hans de Goede <hdegoede@redhat.com>
|
||||
Andre Prendel <andre.prendel@gmx.de>
|
||||
|
||||
Description
|
||||
-----------
|
||||
|
||||
This driver implements support for Texas Instruments TMP401 and
|
||||
TMP411 chips. These chips implement one remote and one local
|
||||
temperature sensor. Temperature is measured in degrees
|
||||
Celsius. Resolution of the remote sensor is 0.0625 degree. Local
|
||||
sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not
|
||||
supported by the driver so far, so using the default resolution of 0.5
|
||||
degree).
|
||||
|
||||
The driver provides the common sysfs-interface for temperatures (see
|
||||
/Documentation/hwmon/sysfs-interface under Temperatures).
|
||||
|
||||
The TMP411 chip is compatible with TMP401. It provides some additional
|
||||
features.
|
||||
|
||||
* Minimum and maximum temperature measured since power-on or chip reset
|
||||
|
||||
Exported via sysfs attributes tempX_lowest and tempX_highest.
|
||||
|
||||
* Reset of historical minimum/maximum temperature measurements
|
||||
|
||||
Exported via sysfs attribute temp_reset_history. Writing 1 to this
|
||||
file triggers a reset.
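
A minimal sketch of triggering the reset from a script follows. The
function name reset_temp_history is hypothetical, and the sysfs directory
in the usage comment is a placeholder; since only the TMP411 is described
as providing this attribute, the sketch checks for it first.

```shell
#!/bin/bash
# Illustration only: reset_temp_history is an assumed helper name.

# reset_temp_history DIR
# Writes 1 to DIR/temp_reset_history if the attribute exists.
reset_temp_history() {
    local attr=$1/temp_reset_history
    if [ ! -e "$attr" ]; then
        echo "$attr not present (chip without history support?)" >&2
        return 1
    fi
    printf '1\n' > "$attr"
}

# Usage (placeholder directory, requires root):
#   reset_temp_history /sys/class/hwmon/hwmon0/device
```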
|
@ -12,6 +12,10 @@ Supported chips:
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
Datasheet:
|
||||
http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf
|
||||
* Winbond W83627DHG-P
|
||||
Prefix: 'w83627dhg'
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
Datasheet: not available
|
||||
* Winbond W83667HG
|
||||
Prefix: 'w83667hg'
|
||||
Addresses scanned: ISA address retrieved from Super I/O registers
|
||||
@ -28,8 +32,8 @@ Description
|
||||
-----------
|
||||
|
||||
This driver implements support for the Winbond W83627EHF, W83627EHG,
|
||||
W83627DHG and W83667HG super I/O chips. We will refer to them collectively
|
||||
as Winbond chips.
|
||||
W83627DHG, W83627DHG-P and W83667HG super I/O chips. We will refer to them
|
||||
collectively as Winbond chips.
|
||||
|
||||
The chips implement three temperature sensors, five fan rotation
|
||||
speed sensors, ten analog voltage sensors (only nine for the 627DHG), one
|
||||
@ -135,3 +139,6 @@ done in the driver for all register addresses.
|
||||
The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and
|
||||
the ICH8 southbridge gets that data via PECI from the DHG, so that the
|
||||
southbridge drives the fans. And the DHG supports SST, a one-wire serial bus.
|
||||
|
||||
The DHG-P has an additional automatic fan speed control mode named Smart Fan
|
||||
(TM) III+. This mode is not yet supported by the driver.
|
||||
|
@ -19,6 +19,9 @@ Supported adapters:
|
||||
* VIA Technologies, Inc. VX800/VX820
|
||||
Datasheet: available on http://linux.via.com.tw
|
||||
|
||||
* VIA Technologies, Inc. VX855/VX875
|
||||
Datasheet: Availability unknown
|
||||
|
||||
Authors:
|
||||
Kyösti Mälkki <kmalkki@cc.hut.fi>,
|
||||
Mark D. Studebaker <mdsxyz123@yahoo.com>,
|
||||
@ -53,6 +56,7 @@ Your lspci -n listing must show one of these :
|
||||
device 1106:3287 (VT8251)
|
||||
device 1106:8324 (CX700)
|
||||
device 1106:8353 (VX800/VX820)
|
||||
device 1106:8409 (VX855/VX875)
|
||||
|
||||
If none of these show up, you should look in the BIOS for settings like
|
||||
enable ACPI / SMBus or even USB.
|
||||
|
@ -165,3 +165,47 @@ was done there. Two significant differences are:
|
||||
Once again, method 3 should be avoided wherever possible. Explicit device
|
||||
instantiation (methods 1 and 2) is much preferred for it is safer and
|
||||
faster.
|
||||
|
||||
|
||||
Method 4: Instantiate from user-space
|
||||
-------------------------------------
|
||||
|
||||
In general, the kernel should know which I2C devices are connected and
|
||||
what addresses they live at. However, in certain cases, it does not, so a
|
||||
sysfs interface was added to let the user provide the information. This
|
||||
interface is made of 2 attribute files which are created in every I2C bus
|
||||
directory: new_device and delete_device. Both files are write only and you
|
||||
must write the right parameters to them in order to properly instantiate,
|
||||
respectively delete, an I2C device.
|
||||
|
||||
File new_device takes 2 parameters: the name of the I2C device (a string)
|
||||
and the address of the I2C device (a number, typically expressed in
|
||||
hexadecimal starting with 0x, but can also be expressed in decimal.)
|
||||
|
||||
File delete_device takes a single parameter: the address of the I2C
|
||||
device. As no two devices can live at the same address on a given I2C
|
||||
segment, the address is sufficient to uniquely identify the device to be
|
||||
deleted.
|
||||
|
||||
Example:
|
||||
# echo eeprom 0x50 > /sys/class/i2c-adapter/i2c-3/new_device
|
||||
|
||||
While this interface should only be used when in-kernel device declaration
|
||||
can't be done, there is a variety of cases where it can be helpful:
|
||||
* The I2C driver usually detects devices (method 3 above) but the bus
|
||||
segment your device lives on doesn't have the proper class bit set and
|
||||
thus detection doesn't trigger.
|
||||
* The I2C driver usually detects devices, but your device lives at an
|
||||
unexpected address.
|
||||
* The I2C driver usually detects devices, but your device is not detected,
|
||||
either because the detection routine is too strict, or because your
|
||||
device is not officially supported yet but you know it is compatible.
|
||||
* You are developing a driver on a test board, where you soldered the I2C
|
||||
device yourself.
|
||||
|
||||
This interface is a replacement for the force_* module parameters some I2C
|
||||
drivers implement. Being implemented in i2c-core rather than in each
|
||||
device driver individually, it is much more efficient, and also has the
|
||||
advantage that you do not have to reload the driver to change a setting.
|
||||
You can also instantiate the device before the driver is loaded or even
|
||||
available, and you don't need to know what driver the device needs.
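
The two attribute files described above can be wrapped in small helpers.
This is only a sketch: the function names and the I2C_SYSFS variable are
hypothetical, and the default path is taken from the example above.

```shell
#!/bin/bash
# Illustration only: helper names and I2C_SYSFS are assumptions.
I2C_SYSFS=${I2C_SYSFS:-/sys/class/i2c-adapter}

# i2c_new_device BUS NAME ADDR
# Instantiates device NAME at address ADDR on bus i2c-BUS.
i2c_new_device() {
    local bus=$1 name=$2 addr=$3
    printf '%s %s\n' "$name" "$addr" > "$I2C_SYSFS/i2c-$bus/new_device"
}

# i2c_delete_device BUS ADDR
# Deletes the device at address ADDR on bus i2c-BUS.
i2c_delete_device() {
    local bus=$1 addr=$2
    printf '%s\n' "$addr" > "$I2C_SYSFS/i2c-$bus/delete_device"
}

# Usage (requires root):
#   i2c_new_device 3 eeprom 0x50
#   i2c_delete_device 3 0x50
```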
|
||||
|
@ -126,19 +126,9 @@ different) configuration information, as do drivers handling chip variants
|
||||
that can't be distinguished by protocol probing, or which need some board
|
||||
specific information to operate correctly.
|
||||
|
||||
Accordingly, the I2C stack now has two models for associating I2C devices
|
||||
with their drivers: the original "legacy" model, and a newer one that's
|
||||
fully compatible with the Linux 2.6 driver model. These models do not mix,
|
||||
since the "legacy" model requires drivers to create "i2c_client" device
|
||||
objects after SMBus style probing, while the Linux driver model expects
|
||||
drivers to be given such device objects in their probe() routines.
|
||||
|
||||
The legacy model is deprecated now and will soon be removed, so we no
|
||||
longer document it here.
|
||||
|
||||
|
||||
Standard Driver Model Binding ("New Style")
|
||||
-------------------------------------------
|
||||
Device/Driver Binding
|
||||
---------------------
|
||||
|
||||
System infrastructure, typically board-specific initialization code or
|
||||
boot firmware, reports what I2C devices exist. For example, there may be
|
||||
@ -201,7 +191,7 @@ a given I2C bus. This is for example the case of hardware monitoring
|
||||
devices on a PC's SMBus. In that case, you may want to let your driver
|
||||
detect supported devices automatically. This is how the legacy model
|
||||
was working, and is now available as an extension to the standard
|
||||
driver model (so that we can finally get rid of the legacy model.)
|
||||
driver model.
|
||||
|
||||
You simply have to define a detect callback which will attempt to
|
||||
identify supported devices (returning 0 for supported ones and -ENODEV
|
||||
|
@ -278,7 +278,7 @@ struct input_event {
|
||||
};
|
||||
|
||||
'time' is the timestamp, it returns the time at which the event happened.
|
||||
Type is for example EV_REL for relative moment, REL_KEY for a keypress or
|
||||
Type is for example EV_REL for relative movement, EV_KEY for a keypress or
|
||||
release. More types are defined in include/linux/input.h.
|
||||
|
||||
'code' is event code, for example REL_X or KEY_BACKSPACE, again a complete
|
||||
|
@ -67,7 +67,12 @@ data with it.
|
||||
struct rotary_encoder_platform_data is declared in
|
||||
include/linux/rotary-encoder.h and needs to be filled with the number of
|
||||
steps the encoder has and can carry information about externally inverted
|
||||
signals (because of used invertig buffer or other reasons).
|
||||
signals (because of an inverting buffer or other reasons). The encoder
|
||||
can be set up to deliver input information as either an absolute or relative
|
||||
axis. For relative axes the input event returns +/-1 for each step. For
|
||||
absolute axes the position of the encoder can either roll over between zero
|
||||
and the number of steps or will clamp at the maximum and zero depending on
|
||||
the configuration.
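
The difference between the two absolute-axis behaviours can be illustrated
with a little arithmetic. This is a sketch, not the driver's code: the
function name is hypothetical, and the exact limits (wrapping modulo the
number of steps, clamping to the range 0..steps) are assumptions made for
the example.

```shell
#!/bin/bash
# Illustration only: not taken from the driver; limits are assumptions.

# rotary_next_pos POS DELTA STEPS ROLLOVER
# Prints the position after moving DELTA (+1 or -1) from POS.
# ROLLOVER=1 wraps around, ROLLOVER=0 clamps at zero and at STEPS.
rotary_next_pos() {
    local pos=$1 delta=$2 steps=$3 rollover=$4
    pos=$(( pos + delta ))
    if [ "$rollover" = 1 ]; then
        # keep the result non-negative before taking the remainder
        pos=$(( (pos % steps + steps) % steps ))
    else
        if [ "$pos" -lt 0 ]; then pos=0; fi
        if [ "$pos" -gt "$steps" ]; then pos=$steps; fi
    fi
    printf '%d\n' "$pos"
}

# Usage:
#   rotary_next_pos 23 1 24 1    # wraps around to zero
#   rotary_next_pos 24 1 24 0    # stays at the maximum
```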
|
||||
|
||||
Because GPIO to IRQ mapping is platform specific, this information must
|
||||
be given separately to the driver. See the example below.
|
||||
@ -85,6 +90,8 @@ be given in seperately to the driver. See the example below.
|
||||
static struct rotary_encoder_platform_data my_rotary_encoder_info = {
|
||||
.steps = 24,
|
||||
.axis = ABS_X,
|
||||
.relative_axis = false,
|
||||
.rollover = false,
|
||||
.gpio_a = GPIO_ROTARY_A,
|
||||
.gpio_b = GPIO_ROTARY_B,
|
||||
.inverted_a = 0,
|
||||
|
@ -149,6 +149,8 @@ Code Seq# Include File Comments
|
||||
'p' 40-7F linux/nvram.h
|
||||
'p' 80-9F user-space parport
|
||||
<mailto:tim@cyberelk.net>
|
||||
'p' a1-a4 linux/pps.h LinuxPPS
|
||||
<mailto:giometti@linux.it>
|
||||
'q' 00-1F linux/serio.h
|
||||
'q' 80-FF Internet PhoneJACK, Internet LineJACK
|
||||
<http://www.quicknet.net>
|
||||
|
@ -14,39 +14,37 @@ README
|
||||
- general info on what you need and what to do for Linux ISDN.
|
||||
README.FAQ
|
||||
- general info for FAQ.
|
||||
README.audio
|
||||
- info for running audio over ISDN.
|
||||
README.fax
|
||||
- info for using Fax over ISDN.
|
||||
README.gigaset
|
||||
- info on the drivers for Siemens Gigaset ISDN adapters.
|
||||
README.icn
|
||||
- info on the ICN-ISDN-card and its driver.
|
||||
README.HiSax
|
||||
- info on the HiSax driver which replaces the old teles.
|
||||
README.hfc-pci
|
||||
- info on hfc-pci based cards.
|
||||
README.pcbit
|
||||
- info on the PCBIT-D ISDN adapter and driver.
|
||||
README.syncppp
|
||||
- info on running Sync PPP over ISDN.
|
||||
syncPPP.FAQ
|
||||
- frequently asked questions about running PPP over ISDN.
|
||||
README.avmb1
|
||||
- info on driver for AVM-B1 ISDN card.
|
||||
README.act2000
|
||||
- info on driver for IBM ACT-2000 card.
|
||||
README.eicon
|
||||
- info on driver for Eicon active cards.
|
||||
README.audio
|
||||
- info for running audio over ISDN.
|
||||
README.avmb1
|
||||
- info on driver for AVM-B1 ISDN card.
|
||||
README.concap
|
||||
- info on "CONCAP" encapsulation protocol interface used for X.25.
|
||||
README.diversion
|
||||
- info on module for isdn diversion services.
|
||||
README.fax
|
||||
- info for using Fax over ISDN.
|
||||
README.gigaset
|
||||
- info on the drivers for Siemens Gigaset ISDN adapters
|
||||
README.hfc-pci
|
||||
- info on hfc-pci based cards.
|
||||
README.hysdn
|
||||
- info on driver for Hypercope active HYSDN cards
|
||||
README.icn
|
||||
- info on the ICN-ISDN-card and its driver.
|
||||
README.mISDN
|
||||
- info on the Modular ISDN subsystem (mISDN)
|
||||
README.pcbit
|
||||
- info on the PCBIT-D ISDN adapter and driver.
|
||||
README.sc
|
||||
- info on driver for Spellcaster cards.
|
||||
README.syncppp
|
||||
- info on running Sync PPP over ISDN.
|
||||
README.x25
|
||||
- info for running X.25 over ISDN.
|
||||
README.hysdn
|
||||
- info on driver for Hypercope active HYSDN cards
|
||||
README.mISDN
|
||||
- info on the Modular ISDN subsystem (mISDN).
|
||||
syncPPP.FAQ
|
||||
- frequently asked questions about running PPP over ISDN.
|
||||
|
@ -45,7 +45,7 @@ From then on, Kernel CAPI may call the registered callback functions for the
|
||||
device.
|
||||
|
||||
If the device becomes unusable for any reason (shutdown, disconnect ...), the
|
||||
driver has to call capi_ctr_reseted(). This will prevent further calls to the
|
||||
driver has to call capi_ctr_down(). This will prevent further calls to the
|
||||
callback functions by Kernel CAPI.
|
||||
|
||||
|
||||
@ -114,20 +114,36 @@ char *driver_name
|
||||
int (*load_firmware)(struct capi_ctr *ctrlr, capiloaddata *ldata)
|
||||
(optional) pointer to a callback function for sending firmware and
|
||||
configuration data to the device
|
||||
Return value: 0 on success, error code on error
|
||||
Called in process context.
|
||||
|
||||
void (*reset_ctr)(struct capi_ctr *ctrlr)
|
||||
pointer to a callback function for performing a reset on the device,
|
||||
releasing all registered applications
|
||||
(optional) pointer to a callback function for performing a reset on
|
||||
the device, releasing all registered applications
|
||||
Called in process context.
|
||||
|
||||
void (*register_appl)(struct capi_ctr *ctrlr, u16 applid,
|
||||
capi_register_params *rparam)
|
||||
void (*release_appl)(struct capi_ctr *ctrlr, u16 applid)
|
||||
pointers to callback functions for registration and deregistration of
|
||||
applications with the device
|
||||
Calls to these functions are serialized by Kernel CAPI so that only
|
||||
one call to any of them is active at any time.
|
||||
|
||||
u16 (*send_message)(struct capi_ctr *ctrlr, struct sk_buff *skb)
|
||||
pointer to a callback function for sending a CAPI message to the
|
||||
device
|
||||
Return value: CAPI error code
|
||||
If the method returns 0 (CAPI_NOERROR) the driver has taken ownership
|
||||
of the skb and the caller may no longer access it. If it returns a
|
||||
non-zero (error) value then ownership of the skb returns to the caller
|
||||
who may reuse or free it.
|
||||
The return value should only be used to signal problems with respect
|
||||
to accepting or queueing the message. Errors occurring during the
|
||||
actual processing of the message should be signaled with an
|
||||
appropriate reply message.
|
||||
Calls to this function are not serialized by Kernel CAPI, i.e. it must
|
||||
be prepared to be re-entered.
|
||||
|
||||
char *(*procinfo)(struct capi_ctr *ctrlr)
|
||||
pointer to a callback function returning the entry for the device in
|
||||
@ -138,6 +154,8 @@ read_proc_t *ctr_read_proc
|
||||
system entry, /proc/capi/controllers/<n>; will be called with a
|
||||
pointer to the device's capi_ctr structure as the last (data) argument
|
||||
|
||||
Note: Callback functions are never called in interrupt context.
|
||||
|
||||
- to be filled in before calling capi_ctr_ready():
|
||||
|
||||
u8 manu[CAPI_MANUFACTURER_LEN]
|
||||
@ -153,6 +171,45 @@ u8 serial[CAPI_SERIAL_LEN]
|
||||
value to return for CAPI_GET_SERIAL
|
||||
|
||||
|
||||
4.3 The _cmsg Structure
|
||||
|
||||
(declared in <linux/isdn/capiutil.h>)
|
||||
|
||||
The _cmsg structure stores the contents of a CAPI 2.0 message in an easily
|
||||
accessible form. It contains members for all possible CAPI 2.0 parameters, of
|
||||
which only those appearing in the message type currently being processed are
|
||||
actually used. Unused members should be set to zero.
|
||||
|
||||
Members are named after the CAPI 2.0 standard names of the parameters they
|
||||
represent. See <linux/isdn/capiutil.h> for the exact spelling. Member data
|
||||
types are:
|
||||
|
||||
u8 for CAPI parameters of type 'byte'
|
||||
|
||||
u16 for CAPI parameters of type 'word'
|
||||
|
||||
u32 for CAPI parameters of type 'dword'
|
||||
|
||||
_cstruct for CAPI parameters of type 'struct' not containing any
|
||||
variably-sized (struct) subparameters (e.g. 'Called Party Number')
|
||||
The member is a pointer to a buffer containing the parameter in
|
||||
CAPI encoding (length + content). It may also be NULL, which will
|
||||
be taken to represent an empty (zero length) parameter.
|
||||
|
||||
_cmstruct for CAPI parameters of type 'struct' containing 'struct'
|
||||
subparameters ('Additional Info' and 'B Protocol')
|
||||
The representation is a single byte containing one of the values:
|
||||
CAPI_DEFAULT: the parameter is empty
|
||||
CAPI_COMPOSE: the values of the subparameters are stored
|
||||
individually in the corresponding _cmsg structure members
|
||||
|
||||
Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert
|
||||
messages between their transport encoding described in the CAPI 2.0 standard
|
||||
and their _cmsg structure representation. Note that capi_cmsg2message() does
|
||||
not know or check the size of its destination buffer. The caller must make
|
||||
sure it is big enough to accommodate the resulting CAPI message.
|
||||
|
||||
|
||||
5. Lower Layer Interface Functions
|
||||
|
||||
(declared in <linux/isdn/capilli.h>)
|
||||
@ -166,7 +223,7 @@ int detach_capi_ctr(struct capi_ctr *ctrlr)
|
||||
register/unregister a device (controller) with Kernel CAPI
|
||||
|
||||
void capi_ctr_ready(struct capi_ctr *ctrlr)
|
||||
void capi_ctr_reseted(struct capi_ctr *ctrlr)
|
||||
void capi_ctr_down(struct capi_ctr *ctrlr)
|
||||
signal controller ready/not ready
|
||||
|
||||
void capi_ctr_suspend_output(struct capi_ctr *ctrlr)
|
||||
@ -211,3 +268,32 @@ CAPIMSG_CONTROL(m) CAPIMSG_SETCONTROL(m, contr) Controller/PLCI/NCCI
|
||||
(u32)
|
||||
CAPIMSG_DATALEN(m) CAPIMSG_SETDATALEN(m, len) Data Length (u16)
|
||||
|
||||
|
||||
Library functions for working with _cmsg structures
|
||||
(from <linux/isdn/capiutil.h>):
|
||||
|
||||
unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg)
|
||||
Assembles a CAPI 2.0 message from the parameters in *cmsg, storing the
|
||||
result in *msg.
|
||||
|
||||
unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg)
|
||||
Disassembles the CAPI 2.0 message in *msg, storing the parameters in
|
||||
*cmsg.
|
||||
|
||||
unsigned capi_cmsg_header(_cmsg *cmsg, u16 ApplId, u8 Command, u8 Subcommand,
|
||||
u16 Messagenumber, u32 Controller)
|
||||
Fills the header part and address field of the _cmsg structure *cmsg
|
||||
with the given values, zeroing the remainder of the structure so only
|
||||
parameters with non-default values need to be changed before sending
|
||||
the message.
|
||||
|
||||
void capi_cmsg_answer(_cmsg *cmsg)
|
||||
Sets the low bit of the Subcommand field in *cmsg, thereby converting
|
||||
_REQ to _CONF and _IND to _RESP.
|
||||
|
||||
char *capi_cmd2str(u8 Command, u8 Subcommand)
|
||||
Returns the CAPI 2.0 message name corresponding to the given command
|
||||
and subcommand values, as a static ASCII string. The return value may
|
||||
be NULL if the command/subcommand is not one of those defined in the
|
||||
CAPI 2.0 standard.
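
The effect of capi_cmsg_answer() on the Subcommand field can be sketched in
plain C as follows. This is not the kernel implementation; the function name
and the SUB_* constants are illustrative, with the values taken from my
understanding of the CAPI 2.0 subcommand encoding (REQ 0x80, CONF 0x81,
IND 0x82, RESP 0x83).

```c
#include <stdint.h>

/* Assumed CAPI 2.0 subcommand values (illustrative names). */
enum {
	SUB_REQ  = 0x80,
	SUB_CONF = 0x81,
	SUB_IND  = 0x82,
	SUB_RESP = 0x83
};

/* Setting the low bit maps a request or indication to its answer. */
static uint8_t answer_subcommand(uint8_t sub)
{
	return (uint8_t)(sub | 0x01);
}
```

Because only the low bit is set, applying the conversion to a message that is
already a _CONF or _RESP leaves it unchanged.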
|
||||
|
||||
|
@ -149,10 +149,8 @@ GigaSet 307x Device Driver
|
||||
configuration files and chat scripts in the gigaset-VERSION/ppp directory
|
||||
in the driver packages from http://sourceforge.net/projects/gigaset307x/.
|
||||
Please note that the USB drivers are not able to change the state of the
|
||||
control lines (the M105 driver can be configured to use some undocumented
|
||||
control requests, if you really need the control lines, though). This means
|
||||
you must use "Stupid Mode" if you are using wvdial or you should use the
|
||||
nocrtscts option of pppd.
|
||||
control lines. This means you must use "Stupid Mode" if you are using
|
||||
wvdial or you should use the nocrtscts option of pppd.
|
||||
You must also ensure that the ppp_async module is loaded with the parameter
|
||||
flag_time=0. You can do this e.g. by adding a line like
|
||||
|
||||
@ -190,20 +188,19 @@ GigaSet 307x Device Driver
|
||||
You can also use /sys/class/tty/ttyGxy/cidmode for changing the CID mode
|
||||
setting (ttyGxy is ttyGU0 or ttyGB0).
|
||||
|
||||
2.6. M105 Undocumented USB Requests
|
||||
------------------------------
|
||||
|
||||
The Gigaset M105 USB data box understands a couple of useful, but
|
||||
undocumented USB commands. These requests are not used in normal
|
||||
operation (for wireless access to the base), but are needed for access
|
||||
to the M105's own configuration mode (registration to the base, baudrate
|
||||
and line format settings, device status queries) via the gigacontr
|
||||
utility. Their use is controlled by the kernel configuration option
|
||||
"Support for undocumented USB requests" (CONFIG_GIGASET_UNDOCREQ). If you
|
||||
encounter error code -ENOTTY when trying to use some features of the
|
||||
M105, try setting that option to "y" via 'make {x,menu}config' and
|
||||
recompiling the driver.
|
||||
2.6. Unregistered Wireless Devices (M101/M105)
|
||||
-----------------------------------------
|
||||
The main purpose of the ser_gigaset and usb_gigaset drivers is to allow
|
||||
the M101 and M105 wireless devices to be used as ISDN devices for ISDN
|
||||
connections through a Gigaset base. Therefore they assume that the device
|
||||
is registered to a DECT base.
|
||||
|
||||
If the M101/M105 device is not registered to a base, initialization of
|
||||
the device fails, and a corresponding error message is logged by the
|
||||
driver. In that situation, a restricted set of functions is available
|
||||
which includes, in particular, those necessary for registering the device
|
||||
to a base or for switching it between Fixed Part and Portable Part
|
||||
modes.
|
||||
|
||||
3. Troubleshooting
|
||||
---------------
|
||||
@ -234,11 +231,12 @@ GigaSet 307x Device Driver
|
||||
Select Unimodem mode for all DECT data adapters. (see section 2.4.)
|
||||
|
||||
Problem:
|
||||
You want to configure your USB DECT data adapter (M105) but gigacontr
|
||||
reports an error: "/dev/ttyGU0: Inappropriate ioctl for device".
|
||||
Messages like this:
|
||||
usb_gigaset 3-2:1.0: Could not initialize the device.
|
||||
appear in your syslog.
|
||||
Solution:
|
||||
Recompile the usb_gigaset driver with the kernel configuration option
|
||||
CONFIG_GIGASET_UNDOCREQ set to 'y'. (see section 2.6.)
|
||||
Check whether your M10x wireless device is correctly registered to the
|
||||
Gigaset base. (see section 2.6.)
|
||||
|
||||
3.2. Telling the driver to provide more information
|
||||
----------------------------------------------
|
||||
|
@ -75,7 +75,7 @@ Checklist for Linux kernel patch submitters
build it and then confirm that it works.

14: If the patch affects things such as disk I/O performance,
with the 'CONFIG_LBD' option both enabled and disabled,
with the 'CONFIG_LBDAF' option both enabled and disabled,
try running your tests.

15: Exercise all code paths with all lockdep features enabled.
|
||||
|
@ -48,6 +48,7 @@ parameter is applicable:
|
||||
EFI EFI Partitioning (GPT) is enabled
|
||||
EIDE EIDE/ATAPI support is enabled.
|
||||
FB The frame buffer device is enabled.
|
||||
GCOV GCOV profiling is enabled.
|
||||
HW Appropriate hardware is enabled.
|
||||
IA-64 IA-64 architecture is enabled.
|
||||
IMA Integrity measurement architecture is enabled.
|
||||
@ -228,14 +229,6 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
to assume that this machine's pmtimer latches its value
|
||||
and always returns good values.
|
||||
|
||||
acpi.power_nocheck= [HW,ACPI]
|
||||
Format: 1/0 enable/disable the check of power state.
|
||||
On some bogus BIOS the _PSC object/_STA object of
|
||||
power resource can't return the correct device power
|
||||
state. In such a case it is unnecessary to check its
|
||||
power state again in power transition.
|
||||
1 : disable the power state check
|
||||
|
||||
acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
|
||||
Format: { level | edge | high | low }
|
||||
|
||||
@ -491,6 +484,13 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Also note the kernel might malfunction if you disable
|
||||
some critical bits.
|
||||
|
||||
cmo_free_hint= [PPC] Format: { yes | no }
|
||||
Specify whether pages are marked as being inactive
|
||||
when they are freed. This is used in CMO environments
|
||||
to determine OS memory pressure for page stealing by
|
||||
a hypervisor.
|
||||
Default: yes
|
||||
|
||||
code_bytes [X86] How many bytes of object code to print
|
||||
in an oops report.
|
||||
Range: 0 - 8192
|
||||
@ -539,6 +539,10 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
console=brl,ttyS0
|
||||
For now, only VisioBraille is supported.
|
||||
|
||||
consoleblank= [KNL] The console blank (screen saver) timeout in
|
||||
seconds. Defaults to 10*60 = 10mins. A value of 0
|
||||
disables the blank timer.
|
||||
|
||||
coredump_filter=
|
||||
[KNL] Change the default value for
|
||||
/proc/<pid>/coredump_filter.
|
||||
@ -785,6 +789,12 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Format: off | on
|
||||
default: on
|
||||
|
||||
gcov_persist= [GCOV] When non-zero (default), profiling data for
|
||||
kernel modules is saved and remains accessible via
|
||||
debugfs, even when the module is unloaded/reloaded.
|
||||
When zero, profiling data is discarded and associated
|
||||
debugfs files are removed at module unload time.
|
||||
|
||||
gdth= [HW,SCSI]
|
||||
See header of drivers/scsi/gdth.c.
|
||||
|
||||
@ -988,6 +998,7 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
nomerge
|
||||
forcesac
|
||||
soft
|
||||
pt [x86, IA64]
|
||||
|
||||
io7= [HW] IO7 for Marvel based alpha systems
|
||||
See comment before marvel_specify_io7 in
|
||||
@ -1351,6 +1362,27 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
min_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory below this
|
||||
physical address is ignored.
|
||||
|
||||
mini2440= [ARM,HW,KNL]
|
||||
Format:[0..2][b][c][t]
|
||||
Default: "0tb"
|
||||
MINI2440 configuration specification:
|
||||
0 - The attached screen is the 3.5" TFT
|
||||
1 - The attached screen is the 7" TFT
|
||||
2 - The VGA Shield is attached (1024x768)
|
||||
Leaving out the screen size parameter will not load
|
||||
the TFT driver, and the framebuffer will be left
|
||||
unconfigured.
|
||||
b - Enable backlight. The TFT backlight pin will be
|
||||
linked to the kernel VESA blanking code and a GPIO
|
||||
LED. This parameter is not necessary when using the
|
||||
VGA shield.
|
||||
c - Enable the s3c camera interface.
|
||||
t - Reserved for enabling touchscreen support. The
|
||||
touchscreen support is not enabled in the mainstream
|
||||
kernel as of 2.6.30, a preliminary port can be found
|
||||
in the "bleeding edge" mini2440 support kernel at
|
||||
http://repo.or.cz/w/linux-2.6/mini2440.git
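
To make the mini2440= format concrete, here is a small user-space sketch of how
such a string could be decoded. It is not the board code's actual parser; the
struct and function names are invented for the example, and it assumes the
characters may appear in any order (as the default "0tb" suggests).

```c
#include <stdbool.h>

/* Hypothetical decoded form of a mini2440= option string. */
struct mini2440_opts {
	int screen;		/* 0..2, or -1 if no screen digit was given */
	bool backlight;		/* 'b' */
	bool camera;		/* 'c' */
	bool touchscreen;	/* 't' */
};

/* Returns 0 on success, -1 if an unknown character is found. */
static int parse_mini2440(const char *s, struct mini2440_opts *o)
{
	o->screen = -1;
	o->backlight = o->camera = o->touchscreen = false;

	for (; *s; s++) {
		switch (*s) {
		case '0': case '1': case '2':
			o->screen = *s - '0';
			break;
		case 'b':
			o->backlight = true;
			break;
		case 'c':
			o->camera = true;
			break;
		case 't':
			o->touchscreen = true;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

With this sketch, a string without a digit leaves screen at -1, which mirrors
the documented case where the TFT driver is not loaded.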
|
||||
|
||||
mminit_loglevel=
|
||||
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
|
||||
parameter allows control of the logging verbosity for
|
||||
@ -1392,6 +1424,16 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
mtdparts= [MTD]
|
||||
See drivers/mtd/cmdlinepart.c.
|
||||
|
||||
onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration
|
||||
|
||||
Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock]
|
||||
|
||||
boundary - index of last SLC block on Flex-OneNAND.
|
||||
The remaining blocks are configured as MLC blocks.
|
||||
lock - Configure if Flex-OneNAND boundary should be locked.
|
||||
Once locked, the boundary cannot be changed.
|
||||
1 indicates lock status, 0 indicates unlock status.
|
||||
|
||||
mtdset= [ARM]
|
||||
ARM/S3C2412 JIVE boot control
|
||||
|
||||
@ -1758,6 +1800,9 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
root domains (aka PCI segments, in ACPI-speak).
|
||||
nommconf [X86] Disable use of MMCONFIG for PCI
|
||||
Configuration
|
||||
check_enable_amd_mmconf [X86] check for and enable
|
||||
properly configured MMIO access to PCI
|
||||
config space on AMD family 10h CPU
|
||||
nomsi [MSI] If the PCI_MSI kernel config parameter is
|
||||
enabled, this kernel boot option can be used to
|
||||
disable the use of MSI interrupts system-wide.
|
||||
@ -1847,6 +1892,12 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
PAGE_SIZE is used as alignment.
|
||||
PCI-PCI bridge can be specified, if resource
|
||||
windows need to be expanded.
|
||||
ecrc= Enable/disable PCIe ECRC (transaction layer
|
||||
end-to-end CRC checking).
|
||||
bios: Use BIOS/firmware settings. This is the
|
||||
default.
|
||||
off: Turn ECRC off
|
||||
on: Turn ECRC on.
|
||||
|
||||
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
|
||||
Management.
|
||||
@ -1864,6 +1915,12 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
Format: { 0 | 1 }
|
||||
See arch/parisc/kernel/pdc_chassis.c
|
||||
|
||||
percpu_alloc= [X86] Select which percpu first chunk allocator to use.
|
||||
Allowed values are one of "lpage", "embed" and "4k".
|
||||
See comments in arch/x86/kernel/setup_percpu.c for
|
||||
details on each allocator. This parameter is primarily
|
||||
for debugging and performance comparison.
|
||||
|
||||
pf. [PARIDE]
|
||||
See Documentation/blockdev/paride.txt.
|
||||
|
||||
@ -2416,7 +2473,8 @@ and is between 256 and 4096 characters. It is defined in the file
|
||||
|
||||
tp720= [HW,PS2]
|
||||
|
||||
trace_buf_size=nn[KMG] [ftrace] will set tracing buffer size.
|
||||
trace_buf_size=nn[KMG]
|
||||
[FTRACE] will set tracing buffer size.
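
The nn[KMG] notation used by parameters like this one means a number with an
optional binary size suffix. A minimal user-space sketch of that convention
follows; parse_size() is a made-up helper for illustration, not the kernel's
own parsing routine, and it accepts only the upper-case suffixes shown in the
format string.

```c
#include <stdlib.h>

/*
 * Parse "nn", "nnK", "nnM" or "nnG" into a byte count.
 * *ok is set to 1 on success and 0 on malformed input.
 */
static unsigned long long parse_size(const char *s, int *ok)
{
	char *end;
	unsigned long long v = strtoull(s, &end, 0);

	*ok = (end != s);	/* at least one digit must be present */

	switch (*end) {
	case 'G':
		v <<= 10;
		/* fall through */
	case 'M':
		v <<= 10;
		/* fall through */
	case 'K':
		v <<= 10;
		end++;
		break;
	default:
		break;
	}

	if (*end != '\0')	/* trailing garbage */
		*ok = 0;
	return v;
}
```

Each suffix multiplies by a further factor of 1024, so "4M" is 4 * 1024 * 1024
bytes.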
|
||||
|
||||
trix= [HW,OSS] MediaTrix AudioTrix Pro
|
||||
Format:
|
||||
|
773
Documentation/kmemcheck.txt
Normal file
@ -0,0 +1,773 @@
|
||||
GETTING STARTED WITH KMEMCHECK
|
||||
==============================
|
||||
|
||||
Vegard Nossum <vegardno@ifi.uio.no>
|
||||
|
||||
|
||||
Contents
|
||||
========
|
||||
0. Introduction
|
||||
1. Downloading
|
||||
2. Configuring and compiling
|
||||
3. How to use
|
||||
3.1. Booting
|
||||
3.2. Run-time enable/disable
|
||||
3.3. Debugging
|
||||
3.4. Annotating false positives
|
||||
4. Reporting errors
|
||||
5. Technical description
|
||||
|
||||
|
||||
0. Introduction
|
||||
===============
|
||||
|
||||
kmemcheck is a debugging feature for the Linux Kernel. More specifically, it
|
||||
is a dynamic checker that detects and warns about some uses of uninitialized
|
||||
memory.
|
||||
|
||||
Userspace programmers might be familiar with Valgrind's memcheck. The main
|
||||
difference between memcheck and kmemcheck is that memcheck works for userspace
|
||||
programs only, and kmemcheck works for the kernel only. The implementations
|
||||
are of course vastly different. Because of this, kmemcheck is not as accurate
|
||||
as memcheck, but it turns out to be good enough in practice to discover real
|
||||
programmer errors that the compiler is not able to find through static
|
||||
analysis.
|
||||
|
||||
Enabling kmemcheck on a kernel will probably slow it down to the extent that
|
||||
the machine will not be usable for normal workloads such as an
|
||||
interactive desktop. kmemcheck will also cause the kernel to use about twice
|
||||
as much memory as normal. For this reason, kmemcheck is strictly a debugging
|
||||
feature.
|
||||
|
||||
|
||||
1. Downloading
|
||||
==============
|
||||
|
||||
kmemcheck can only be downloaded using git. If you want to write patches
|
||||
against the current code, you should use the kmemcheck development branch of
|
||||
the tip tree. It is also possible to use the linux-next tree, which also
|
||||
includes the latest version of kmemcheck.
|
||||
|
||||
Assuming that you've already cloned the linux-2.6.git repository, all you
|
||||
have to do is add the -tip tree as a remote, like this:
|
||||
|
||||
$ git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
|
||||
|
||||
To actually download the tree, fetch the remote:
|
||||
|
||||
$ git fetch tip
|
||||
|
||||
And to check out a new local branch with the kmemcheck code:
|
||||
|
||||
$ git checkout -b kmemcheck tip/kmemcheck
|
||||
|
||||
General instructions for the -tip tree can be found here:
|
||||
http://people.redhat.com/mingo/tip.git/readme.txt
|
||||
|
||||
|
||||
2. Configuring and compiling
|
||||
============================
|
||||
|
||||
kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of
|
||||
configuration variables must have specific settings in order for the kmemcheck
|
||||
menu to even appear in "menuconfig". These are:
|
||||
|
||||
o CONFIG_CC_OPTIMIZE_FOR_SIZE=n
|
||||
|
||||
This option is located under "General setup" / "Optimize for size".
|
||||
|
||||
Without this, gcc will use certain optimizations that usually lead to
|
||||
false positive warnings from kmemcheck. An example of this is a 16-bit
|
||||
field in a struct, where gcc may load 32 bits, then discard the upper
|
||||
16 bits. kmemcheck sees only the 32-bit load, and may trigger a
|
||||
warning for the upper 16 bits (if they're uninitialized).
|
||||
|
||||
o CONFIG_SLAB=y or CONFIG_SLUB=y
|
||||
|
||||
This option is located under "General setup" / "Choose SLAB
|
||||
allocator".
|
||||
|
||||
o CONFIG_FUNCTION_TRACER=n
|
||||
|
||||
This option is located under "Kernel hacking" / "Tracers" / "Kernel
|
||||
Function Tracer"
|
||||
|
||||
When function tracing is compiled in, gcc emits a call to another
|
||||
function at the beginning of every function. This means that when the
|
||||
page fault handler is called, the ftrace framework will be called
|
||||
before kmemcheck has had a chance to handle the fault. If ftrace then
|
||||
modifies memory that was tracked by kmemcheck, the result is an
|
||||
endless recursive page fault.
|
||||
|
||||
o CONFIG_DEBUG_PAGEALLOC=n
|
||||
|
||||
This option is located under "Kernel hacking" / "Debug page memory
|
||||
allocations".
|
||||
|
||||
In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also
|
||||
located under "Kernel hacking". With this, you will be able to get line number
|
||||
information from the kmemcheck warnings, which is extremely valuable in
|
||||
debugging a problem. This option is not mandatory, however, because it slows
|
||||
down the compilation process and produces a much bigger kernel image.
|
||||
|
||||
Now the kmemcheck menu should be visible (under "Kernel hacking" / "kmemcheck:
|
||||
trap use of uninitialized memory"). Here follows a description of the
|
||||
kmemcheck configuration variables:
|
||||
|
||||
o CONFIG_KMEMCHECK
|
||||
|
||||
This must be enabled in order to use kmemcheck at all...
|
||||
|
||||
o CONFIG_KMEMCHECK_[DISABLED | ENABLED | ONESHOT]_BY_DEFAULT
|
||||
|
||||
This option controls the status of kmemcheck at boot-time. "Enabled"
|
||||
will enable kmemcheck right from the start, "disabled" will boot the
|
||||
kernel as normal (but with the kmemcheck code compiled in, so it can
|
||||
be enabled at run-time after the kernel has booted), and "one-shot" is
|
||||
a special mode which will turn kmemcheck off automatically after
|
||||
detecting the first use of uninitialized memory.
|
||||
|
||||
If you are using kmemcheck to actively debug a problem, then you
|
||||
probably want to choose "enabled" here.
|
||||
|
||||
The one-shot mode is mostly useful in automated test setups because it
|
||||
can prevent floods of warnings and increase the chances of the machine
|
||||
surviving in case something is really wrong. In other cases, the one-
|
||||
shot mode could actually be counter-productive because it would turn
|
||||
itself off at the very first error -- in the case of a false positive
|
||||
too -- and this would get in the way of debugging the specific
|
||||
problem you were interested in.
|
||||
|
||||
If you would like to use your kernel as normal, but with a chance to
|
||||
enable kmemcheck in case of some problem, it might be a good idea to
|
||||
choose "disabled" here. When kmemcheck is disabled, most of the run-
|
||||
time overhead is not incurred, and the kernel will be almost as fast
|
||||
as normal.
|
||||
|
||||
o CONFIG_KMEMCHECK_QUEUE_SIZE
|
||||
|
||||
Select the maximum number of error reports to store in an internal
|
||||
(fixed-size) buffer. Since errors can occur virtually anywhere and in
|
||||
any context, we need a temporary storage area which is guaranteed not
|
||||
to generate any other page faults when accessed. The queue will be
|
||||
emptied as soon as a tasklet may be scheduled. If the queue is full,
|
||||
new error reports will be lost.
|
||||
|
||||
The default value of 64 is probably fine. If some code produces more
|
||||
than 64 errors within an irqs-off section, then the code is likely to
|
||||
produce many, many more, too, and these additional reports seldom give
|
||||
any more information (the first report is usually the most valuable
|
||||
anyway).
|
||||
|
||||
This number might have to be adjusted if you are not using serial
|
||||
console or similar to capture the kernel log. If you are using the
|
||||
"dmesg" command to save the log, then getting a lot of kmemcheck
|
||||
warnings might overflow the kernel log itself, and the earlier reports
|
||||
will get lost in that way instead. Try setting this to 10 or so on
|
||||
such a setup.
|
||||
|
||||
o CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT
|
||||
|
||||
Select the number of shadow bytes to save along with each entry of the
|
||||
error-report queue. These bytes indicate what parts of an allocation
|
||||
are initialized, uninitialized, etc. and will be displayed when an
|
||||
error is detected to help the debugging of a particular problem.
|
||||
|
||||
The number entered here is actually the logarithm of the number of
|
||||
bytes that will be saved. So if you pick for example 5 here, kmemcheck
|
||||
will save 2^5 = 32 bytes.
|
||||
|
||||
The default value should be fine for debugging most problems. It also
|
||||
fits nicely within 80 columns.
|
||||
|
||||
o CONFIG_KMEMCHECK_PARTIAL_OK
|
||||
|
||||
This option (when enabled) works around certain GCC optimizations that
|
||||
produce 32-bit reads from 16-bit variables where the upper 16 bits are
|
||||
thrown away afterwards.
|
||||
|
||||
The default value (enabled) is recommended. This may of course hide
|
||||
some real errors, but disabling it would probably produce a lot of
|
||||
false positives.
|
||||
|
||||
o CONFIG_KMEMCHECK_BITOPS_OK
|
||||
|
||||
This option silences warnings that would be generated for bit-field
|
||||
accesses where not all the bits are initialized at the same time. This
|
||||
may also hide some real bugs.
|
||||
|
||||
This option is probably obsolete, or it should be replaced with
|
||||
the kmemcheck-/bitfield-annotations for the code in question. The
|
||||
default value is therefore fine.
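
The behaviour described for CONFIG_KMEMCHECK_QUEUE_SIZE above (a fixed-size
buffer where new reports are lost once it is full) can be pictured with the
following sketch. It is only a model written for this document, not
kmemcheck's real data structure: the names, the tiny QUEUE_SIZE and the use of
a bare address as the "report" are all assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 4	/* stand-in for CONFIG_KMEMCHECK_QUEUE_SIZE */

/* Fixed-size ring of pending reports; no allocation when pushing. */
struct report_queue {
	unsigned long addr[QUEUE_SIZE];
	size_t head;		/* index of the oldest entry */
	size_t count;		/* number of stored entries */
	unsigned long lost;	/* reports dropped because the queue was full */
};

/* Store a report; returns false (and counts a loss) if the queue is full. */
static bool queue_push(struct report_queue *q, unsigned long addr)
{
	if (q->count == QUEUE_SIZE) {
		q->lost++;
		return false;
	}
	q->addr[(q->head + q->count) % QUEUE_SIZE] = addr;
	q->count++;
	return true;
}

/* Remove the oldest report; returns false if the queue is empty. */
static bool queue_pop(struct report_queue *q, unsigned long *addr)
{
	if (q->count == 0)
		return false;
	*addr = q->addr[q->head];
	q->head = (q->head + 1) % QUEUE_SIZE;
	q->count--;
	return true;
}
```

Dropping the newest report rather than overwriting the oldest matches the
remark that the first report is usually the most valuable one.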
|
||||
|
||||
Now compile the kernel as usual.
|
||||
|
||||
|
||||
3. How to use
|
||||
=============
|
||||
|
||||
3.1. Booting
|
||||
============
|
||||
|
||||
First some information about the command-line options. There is only one
|
||||
option specific to kmemcheck, and this is called "kmemcheck". It can be used
|
||||
to override the default mode as chosen by the CONFIG_KMEMCHECK_*_BY_DEFAULT
|
||||
option. Its possible settings are:
|
||||
|
||||
o kmemcheck=0 (disabled)
|
||||
o kmemcheck=1 (enabled)
|
||||
o kmemcheck=2 (one-shot mode)
|
||||
|
||||
If SLUB debugging has been enabled in the kernel, it may take precedence over
|
||||
kmemcheck in such a way that the slab caches which are under SLUB debugging
|
||||
will not be tracked by kmemcheck. In order to ensure that this doesn't happen
|
||||
(even though it shouldn't by default), use SLUB's boot option "slub_debug",
|
||||
like this: slub_debug=-
|
||||
|
||||
In fact, this option may also be used for fine-grained control over SLUB vs.
|
||||
kmemcheck. For example, if the command line includes "kmemcheck=1
|
||||
slub_debug=,dentry", then SLUB debugging will be used only for the "dentry"
|
||||
slab cache, while kmemcheck tracks all the other caches. This is advanced
|
||||
usage, however, and is not generally recommended.
|
||||
|
||||
|
||||
3.2. Run-time enable/disable
|
||||
============================
|
||||
|
||||
When the kernel has booted, it is possible to enable or disable kmemcheck at
|
||||
run-time. WARNING: This feature is still experimental and may cause false
|
||||
positive warnings to appear. Therefore, try not to use this. If you find that
|
||||
it doesn't work properly (e.g. you see an unreasonable number of warnings), I
|
||||
will be happy to take bug reports.
|
||||
|
||||
Use the file /proc/sys/kernel/kmemcheck for this purpose, e.g.:
|
||||
|
||||
$ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck
|
||||
|
||||
The numbers are the same as for the kmemcheck= command-line option.
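
The same switch can be flipped from a program. The sketch below is a
hypothetical helper, not part of kmemcheck: it takes the file path as a
parameter (on a kmemcheck kernel that would be /proc/sys/kernel/kmemcheck, and
writing it needs root) and accepts only the three documented values. The MODE_*
names are made up for the example.

```c
#include <stdio.h>

/* Illustrative names for the documented values 0, 1 and 2. */
enum {
	MODE_DISABLED = 0,
	MODE_ENABLED  = 1,
	MODE_ONESHOT  = 2
};

/* Write mode to the control file at path; returns 0 on success, -1 on error. */
static int write_kmemcheck_mode(const char *path, int mode)
{
	FILE *f;
	int ok;

	if (mode < MODE_DISABLED || mode > MODE_ONESHOT)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", mode) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```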
|
||||
|
||||
|
||||
3.3. Debugging
|
||||
==============
|
||||
|
||||
A typical report will look something like this:
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
80000000000000000000000000000000000000000088ffff0000000000000000
 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
         ^
|
||||
|
||||
Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84
|
||||
RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000
|
||||
R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e
|
||||
R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8
|
||||
FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000
|
||||
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
|
||||
CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0
|
||||
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
|
||||
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
[<ffffffffffffffff>] 0xffffffffffffffff
|
||||
|
||||
The single most valuable piece of information in this report is the RIP (or
EIP on 32-bit) value. It helps us pinpoint exactly which instruction caused
the warning.
|
||||
|
||||
If your kernel was compiled with CONFIG_DEBUG_INFO=y, then all we have to do
|
||||
is give this address to the addr2line program, like this:
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104ede8
|
||||
arch/x86/include/asm/string_64.h:12
|
||||
include/asm-generic/siginfo.h:287
|
||||
kernel/signal.c:380
|
||||
kernel/signal.c:410
|
||||
|
||||
The "-e vmlinux" tells addr2line which file to look in. IMPORTANT: This must
|
||||
be the vmlinux of the kernel that produced the warning in the first place! If
|
||||
not, the line number information will almost certainly be wrong.
|
||||
|
||||
The "-i" tells addr2line to also print the line numbers of inlined functions.
|
||||
In this case, the flag was very important, because otherwise, it would only
|
||||
have printed the first line, which is just a call to memcpy(), which could be
|
||||
called from a thousand places in the kernel, and is therefore not very useful.
|
||||
These inlined functions would not show up in the stack trace above, simply
|
||||
because the kernel doesn't load the extra debugging information. This
|
||||
technique can of course be used with ordinary kernel oopses as well.
|
||||
|
||||
In this case, it's the caller of memcpy() that is interesting, and it can be
|
||||
found in include/asm-generic/siginfo.h, line 287:
|
||||
|
||||
281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
|
||||
282 {
|
||||
283 if (from->si_code < 0)
|
||||
284 memcpy(to, from, sizeof(*to));
|
||||
285 else
|
||||
286 /* _sigchld is currently the largest know union member */
|
||||
287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
|
||||
288 }
|
||||
|
||||
Since this was a read (kmemcheck usually warns about reads only, though it can
|
||||
warn about writes to unallocated or freed memory as well), it was probably the
|
||||
"from" argument which contained some uninitialized bytes. Following the chain
|
||||
of calls, we move upwards to see where "from" was allocated or initialized,
|
||||
kernel/signal.c, line 380:
|
||||
|
||||
359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info)
|
||||
360 {
|
||||
...
|
||||
367 list_for_each_entry(q, &list->list, list) {
|
||||
368 if (q->info.si_signo == sig) {
|
||||
369 if (first)
|
||||
370 goto still_pending;
|
||||
371 first = q;
|
||||
...
|
||||
377 if (first) {
|
||||
378 still_pending:
|
||||
379 list_del_init(&first->list);
|
||||
380 copy_siginfo(info, &first->info);
|
||||
381 __sigqueue_free(first);
|
||||
...
|
||||
392 }
|
||||
393 }
|
||||
|
||||
Here, it is &first->info that is being passed on to copy_siginfo(). The
|
||||
variable "first" was found on a list -- passed in as the second argument to
|
||||
collect_signal(). We continue our journey through the stack, to figure out
|
||||
where the item on "list" was allocated or initialized. We move to line 410:
|
||||
|
||||
395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
|
||||
396 siginfo_t *info)
|
||||
397 {
|
||||
...
|
||||
410 collect_signal(sig, pending, info);
|
||||
...
|
||||
414 }
|
||||
|
||||
Now we need to follow the "pending" pointer, since that is being passed on to
|
||||
collect_signal() as "list". At this point, we've run out of lines from the
|
||||
"addr2line" output. Not to worry, we just paste the next addresses from the
|
||||
kmemcheck stack dump, i.e.:
|
||||
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \
|
||||
ffffffff8100b87d ffffffff8100c7b5
|
||||
kernel/signal.c:446
|
||||
kernel/signal.c:1806
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
Remember that since these addresses were found on the stack and not as the
|
||||
RIP value, they actually point to the _next_ instruction (they are return
|
||||
addresses). This becomes obvious when we look at the code for line 446:
|
||||
|
||||
422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
|
||||
423 {
|
||||
...
|
||||
431 signr = __dequeue_signal(&tsk->signal->shared_pending,
|
||||
432 mask, info);
|
||||
433 /*
|
||||
434 * itimer signal ?
|
||||
435 *
|
||||
436 * itimers are process shared and we restart periodic
|
||||
437 * itimers in the signal delivery path to prevent DoS
|
||||
438 * attacks in the high resolution timer case. This is
|
||||
439 * compliant with the old way of self restarting
|
||||
440 * itimers, as the SIGALRM is a legacy signal and only
|
||||
441 * queued once. Changing the restart behaviour to
|
||||
442 * restart the timer in the signal dequeue path is
|
||||
443 * reducing the timer noise on heavy loaded !highres
|
||||
444 * systems too.
|
||||
445 */
|
||||
446 if (unlikely(signr == SIGALRM)) {
|
||||
...
|
||||
489 }
|
||||
|
||||
So instead of looking at 446, we should be looking at 431, which is the line
|
||||
that executes just before 446. Here we see that what we are looking for is
|
||||
&tsk->signal->shared_pending.
|
||||
|
||||
Our next task is now to figure out which function puts items on this
|
||||
"shared_pending" list. A crude but efficient tool is git grep:
|
||||
|
||||
$ git grep -n 'shared_pending' kernel/
|
||||
...
|
||||
kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
|
||||
There were more results, but none of them were related to list operations,
|
||||
and these were the only assignments. We inspect the line numbers more closely
|
||||
and find that this is indeed where items are being added to the list:
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
...
|
||||
890 }
|
||||
|
||||
and:
|
||||
|
||||
1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
|
||||
1310 {
|
||||
....
|
||||
1339 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
1340 list_add_tail(&q->list, &pending->list);
|
||||
....
|
||||
1347 }
|
||||
|
||||
In the first case, the list element we are looking for, "q", is being returned
|
||||
from the function __sigqueue_alloc(), which looks like an allocation function.
|
||||
Let's take a look at it:
|
||||
|
||||
187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags,
|
||||
188 int override_rlimit)
|
||||
189 {
|
||||
190 struct sigqueue *q = NULL;
|
||||
191 struct user_struct *user;
|
||||
192
|
||||
193 /*
|
||||
194 * We won't get problems with the target's UID changing under us
|
||||
195 * because changing it requires RCU be used, and if t != current, the
|
||||
196 * caller must be holding the RCU readlock (by way of a spinlock) and
|
||||
197 * we use RCU protection here
|
||||
198 */
|
||||
199 user = get_uid(__task_cred(t)->user);
|
||||
200 atomic_inc(&user->sigpending);
|
||||
201 if (override_rlimit ||
|
||||
202 atomic_read(&user->sigpending) <=
|
||||
203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur)
|
||||
204 q = kmem_cache_alloc(sigqueue_cachep, flags);
|
||||
205 if (unlikely(q == NULL)) {
|
||||
206 atomic_dec(&user->sigpending);
|
||||
207 free_uid(user);
|
||||
208 } else {
|
||||
209 INIT_LIST_HEAD(&q->list);
|
||||
210 q->flags = 0;
|
||||
211 q->user = user;
|
||||
212 }
|
||||
213
|
||||
214 return q;
|
||||
215 }
|
||||
|
||||
We see that this function initializes q->list, q->flags, and q->user. It seems
|
||||
that now is the time to look at the definition of "struct sigqueue", e.g.:
|
||||
|
||||
14 struct sigqueue {
|
||||
15 struct list_head list;
|
||||
16 int flags;
|
||||
17 siginfo_t info;
|
||||
18 struct user_struct *user;
|
||||
19 };
|
||||
|
||||
And, you might remember, it was a memcpy() on &first->info that caused the
|
||||
warning, so this makes perfect sense. It also seems reasonable to assume that
|
||||
it is the caller of __sigqueue_alloc() that has the responsibility of filling
|
||||
out (initializing) this member.
|
||||
|
||||
But just which fields of the struct were uninitialized? Let's look at
|
||||
kmemcheck's report again:
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
^
|
||||
|
||||
These first two lines are the memory dump of the memory object itself, and the
|
||||
shadow bytemap, respectively. The memory object itself is in this case
|
||||
&first->info. Just beware that the start of this dump is NOT the start of the
|
||||
object itself! The position of the caret (^) corresponds with the address of
|
||||
the read (ffff88003e4a2024).
|
||||
|
||||
The shadow bytemap dump legend is as follows:
|
||||
|
||||
i - initialized
|
||||
u - uninitialized
|
||||
a - unallocated (memory has been allocated by the slab layer, but has not
|
||||
yet been handed off to anybody)
|
||||
f - freed (memory has been allocated by the slab layer, but has been freed
|
||||
by the previous owner)
|
||||
|
||||
In order to figure out where (relative to the start of the object) the
|
||||
uninitialized memory was located, we have to look at the disassembly. For
|
||||
that, we'll need the RIP address again:
|
||||
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
|
||||
$ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8:
|
||||
ffffffff8104edc8: mov %r8,0x8(%r8)
|
||||
ffffffff8104edcc: test %r10d,%r10d
|
||||
ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168>
|
||||
ffffffff8104edd5: mov %rax,%rdx
|
||||
ffffffff8104edd8: mov $0xc,%ecx
|
||||
ffffffff8104eddd: mov %r13,%rdi
|
||||
ffffffff8104ede0: mov $0x30,%eax
|
||||
ffffffff8104ede5: mov %rdx,%rsi
|
||||
ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edea: test $0x2,%al
|
||||
ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0>
|
||||
ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf0: test $0x1,%al
|
||||
ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5>
|
||||
ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf5: mov %r8,%rdi
|
||||
ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free>
|
||||
|
||||
As expected, it's the "rep movsl" instruction from the memcpy() that causes
|
||||
the warning. We know that REP MOVSL uses the register RCX to count
|
||||
the number of remaining iterations. By taking a look at the register dump
|
||||
again (from the kmemcheck report), we can figure out how many bytes were left
|
||||
to copy:
|
||||
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
|
||||
By looking at the disassembly, we also see that %ecx is being loaded with the
|
||||
value $0xc just before (ffffffff8104edd8), so we are very lucky. Keep in mind
|
||||
that this is the number of iterations, not bytes. And since this is a "long"
|
||||
operation, we need to multiply by 4 to get the number of bytes. So this means
|
||||
that the uninitialized value was encountered at 4 * (0xc - 0x9) = 12 bytes
|
||||
from the start of the object.
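
The arithmetic above can be written down as a small helper. This is only a
sketch: the function name is made up for illustration, and it assumes the
initial count (0xc) and the remaining count from the register dump (0x9)
have already been read off by hand.

```c
#include <stddef.h>

/*
 * Byte offset into a "rep movsl" copy at which the access was caught.
 *
 * initial_count: value loaded into %ecx before the instruction (0xc here)
 * remaining:     value of RCX in the register dump (0x9 here)
 *
 * The caller must ensure remaining <= initial_count.
 */
static size_t rep_movsl_fault_offset(size_t initial_count, size_t remaining)
{
	/* movsl moves one 4-byte "long" per iteration */
	const size_t bytes_per_iteration = 4;

	return (initial_count - remaining) * bytes_per_iteration;
}
```

Calling rep_movsl_fault_offset(0xc, 0x9) gives the 12 bytes computed above.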
|
||||
|
||||
We can now try to figure out which field of the "struct siginfo" was not
|
||||
initialized. This is the beginning of the struct:
|
||||
|
||||
40 typedef struct siginfo {
|
||||
41 int si_signo;
|
||||
42 int si_errno;
|
||||
43 int si_code;
|
||||
44
|
||||
45 union {
|
||||
..
|
||||
92 } _sifields;
|
||||
93 } siginfo_t;
|
||||
|
||||
On 64-bit, the int is 4 bytes long, so it must be the union member that has
|
||||
not been initialized. We can verify this using gdb:
|
||||
|
||||
$ gdb vmlinux
|
||||
...
|
||||
(gdb) p &((struct siginfo *) 0)->_sifields
|
||||
$1 = (union {...} *) 0x10
|
||||
|
||||
Actually, it seems that the union member is located at offset 0x10 -- which
|
||||
means that gcc has inserted 4 bytes of padding between the members si_code
|
||||
and _sifields. We can now get a fuller picture of the memory dump:
|
||||
|
||||
_----------------------------=> si_code
|
||||
/ _--------------------=> (padding)
|
||||
| / _------------=> _sifields(._kill._pid)
|
||||
| | / _----=> _sifields(._kill._uid)
|
||||
| | | /
|
||||
-------|-------|-------|-------|
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
|
||||
This allows us to realize another important fact: si_code contains the value
|
||||
0x80. Remember that x86 is little endian, so the first 4 bytes "80000000" are
|
||||
really the number 0x00000080. With a bit of research, we find that this is
|
||||
actually the constant SI_KERNEL defined in include/asm-generic/siginfo.h:
|
||||
|
||||
144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */
|
||||
|
||||
This macro is used in exactly one place in the x86 kernel: In send_signal()
|
||||
in kernel/signal.c:
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
856 switch ((unsigned long) info) {
|
||||
...
|
||||
865 case (unsigned long) SEND_SIG_PRIV:
|
||||
866 q->info.si_signo = sig;
|
||||
867 q->info.si_errno = 0;
|
||||
868 q->info.si_code = SI_KERNEL;
|
||||
869 q->info.si_pid = 0;
|
||||
870 q->info.si_uid = 0;
|
||||
871 break;
|
||||
...
|
||||
890 }
|
||||
|
||||
Not only does this match with the .si_code member, it also matches the place
|
||||
we found earlier when looking for where siginfo_t objects are enqueued on the
|
||||
"shared_pending" list.
|
||||
|
||||
So to sum up: It seems that it is the padding introduced by the compiler
|
||||
between two struct fields that is uninitialized, and this gets reported when
|
||||
we do a memcpy() on the struct. This means that we have identified a false
|
||||
positive warning.
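
The padding can also be made visible in user space with offsetof(). The
struct below is a hypothetical stand-in for the start of siginfo_t, not the
kernel type; it only assumes that the union contains a pointer-sized member,
which is what forces the 8-byte alignment on 64-bit.

```c
#include <stddef.h>

/* Hypothetical stand-in for the start of siginfo_t; NOT the kernel type. */
struct demo_siginfo {
	int si_signo;
	int si_errno;
	int si_code;
	union {
		struct {
			int pid;
			unsigned int uid;
		} kill;
		void *ptr;	/* pointer member forces pointer alignment */
	} sifields;
};

/* Number of padding bytes the compiler inserted between si_code and the union. */
static size_t demo_padding_before_union(void)
{
	size_t end_of_si_code = offsetof(struct demo_siginfo, si_code) + sizeof(int);

	return offsetof(struct demo_siginfo, sifields) - end_of_si_code;
}
```

With 4-byte ints and 8-byte pointers this should return 4, matching the four
"u" bytes that follow si_code in the shadow dump. Those bytes are never
written by field assignments, which is why a memcpy() of the whole struct
reads them as uninitialized.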
|
||||
|
||||
Normally, kmemcheck will not report uninitialized accesses in memcpy() calls
|
||||
when both the source and destination addresses are tracked. (Instead, we copy
|
||||
the shadow bytemap as well). In this case, the destination address clearly
|
||||
was not tracked. We can dig a little deeper into the stack trace from above:
|
||||
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
And we clearly see that the destination siginfo object is located on the
|
||||
stack:
|
||||
|
||||
782 static void do_signal(struct pt_regs *regs)
|
||||
783 {
|
||||
784 struct k_sigaction ka;
|
||||
785 siginfo_t info;
|
||||
...
|
||||
804 signr = get_signal_to_deliver(&info, &ka, regs, NULL);
|
||||
...
|
||||
854 }
|
||||
|
||||
And this &info is what eventually gets passed to copy_siginfo() as the
|
||||
destination argument.
|
||||
|
||||
Now, even though we didn't find an actual error here, the example is still a
|
||||
good one, because it shows how one would go about finding out what the report
|
||||
was all about.
|
||||
|
||||
|
||||
3.4. Annotating false positives
|
||||
===============================
|
||||
|
||||
There are a few different ways to make annotations in the source code that
|
||||
will keep kmemcheck from checking and reporting certain allocations. Here
|
||||
they are:
|
||||
|
||||
o __GFP_NOTRACK_FALSE_POSITIVE
|
||||
|
||||
This flag can be passed to kmalloc() or kmem_cache_alloc() (therefore
|
||||
also to other functions that end up calling one of these) to indicate
|
||||
that the allocation should not be tracked because it would lead to
|
||||
a false positive report. This is a "big hammer" way of silencing
|
||||
kmemcheck; after all, even if the false positive pertains to
|
||||
a particular field in a struct, for example, we will now lose the
|
||||
ability to find (real) errors in other parts of the same struct.
|
||||
|
||||
Example:
|
||||
|
||||
/* No warnings will ever trigger on accessing any part of x */
|
||||
x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE);
|
||||
|
||||
o kmemcheck_bitfield_begin(name)/kmemcheck_bitfield_end(name) and
|
||||
kmemcheck_annotate_bitfield(ptr, name)
|
||||
|
||||
The first two of these three macros can be used inside struct
|
||||
definitions to signal, respectively, the beginning and end of a
|
||||
bitfield. Additionally, this will assign the bitfield a name, which
|
||||
is given as an argument to the macros.
|
||||
|
||||
Having used these markers, one can later use
|
||||
kmemcheck_annotate_bitfield() at the point of allocation, to indicate
|
||||
which parts of the allocation are part of a bitfield.
|
||||
|
||||
Example:
|
||||
|
||||
struct foo {
|
||||
int x;
|
||||
|
||||
kmemcheck_bitfield_begin(flags);
|
||||
int flag_a:1;
|
||||
int flag_b:1;
|
||||
kmemcheck_bitfield_end(flags);
|
||||
|
||||
int y;
|
||||
};
|
||||
|
||||
struct foo *x = kmalloc(sizeof *x, GFP_KERNEL);
|
||||
|
||||
/* No warnings will trigger on accessing the bitfield of x */
|
||||
kmemcheck_annotate_bitfield(x, flags);
|
||||
|
||||
Note that kmemcheck_annotate_bitfield() can be used even before the
|
||||
return value of kmalloc() is checked -- in other words, passing NULL
|
||||
as the first argument is legal (and will do nothing).
|
||||
|
||||
|
||||
4. Reporting errors
|
||||
===================
|
||||
|
||||
As we have seen, kmemcheck will produce false positive reports. Therefore, it
|
||||
is not very wise to blindly post kmemcheck warnings to mailing lists and
|
||||
maintainers. Instead, I encourage maintainers and developers to find errors
|
||||
in their own code. If you get a warning, you can try to work around it, try
|
||||
to figure out if it's a real error or not, or simply ignore it. Most
|
||||
developers know their own code and will quickly and efficiently determine the
|
||||
root cause of a kmemcheck report. This is therefore also the most efficient
|
||||
way to work with kmemcheck.
|
||||
|
||||
That said, we (the kmemcheck maintainers) will always be on the lookout for
|
||||
false positives that we can annotate and silence. So whatever you find,
|
||||
please drop us a note privately! Kernel configs and steps to reproduce (if
|
||||
available) are of course a great help too.
|
||||
|
||||
Happy hacking!
|
||||
|
||||
|
||||
5. Technical description
|
||||
========================
|
||||
|
||||
kmemcheck works by marking memory pages non-present. This means that whenever
|
||||
somebody attempts to access the page, a page fault is generated. The page
|
||||
fault handler notices that the page was in fact only hidden, and so it calls
|
||||
on the kmemcheck code to make further investigations.
|
||||
|
||||
When the investigations are completed, kmemcheck "shows" the page by marking
|
||||
it present (as it would be under normal circumstances). This way, the
|
||||
interrupted code can continue as usual.
|
||||
|
||||
But after the instruction has been executed, we should hide the page again, so
|
||||
that we can catch the next access too! Now kmemcheck makes use of a debugging
|
||||
feature of the processor, namely single-stepping. When the processor has
|
||||
finished the one instruction that generated the memory access, a debug
|
||||
exception is raised. From here, we simply hide the page again and continue
|
||||
execution, this time with the single-stepping feature turned off.
|
||||
|
||||
kmemcheck requires some assistance from the memory allocator in order to work.
|
||||
The memory allocator needs to
|
||||
|
||||
1. Tell kmemcheck about newly allocated pages and pages that are about to
|
||||
be freed. This allows kmemcheck to set up and tear down the shadow memory
|
||||
for the pages in question. The shadow memory stores the status of each
|
||||
byte in the allocation proper, e.g. whether it is initialized or
|
||||
uninitialized.
|
||||
|
||||
2. Tell kmemcheck which parts of memory should be marked uninitialized.
|
||||
There are actually a few more states, such as "not yet allocated" and
|
||||
"recently freed".
|
||||
|
||||
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
|
||||
memory that can take page faults because of kmemcheck.
|
||||
|
||||
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
|
||||
request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags.
|
||||
This does not prevent the page faults from occurring, however, but marks the
|
||||
object in question as being initialized so that no warnings will ever be
|
||||
produced for this object.
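
The shadow-memory bookkeeping described above can be illustrated with a toy
model in user-space C. This is a simplified sketch, not kmemcheck's
implementation: all names are hypothetical, the "allocator" manages one
fixed-size object, and the check is done by an explicit call instead of a
page fault. The state characters match the legend used in the reports.

```c
#include <stddef.h>
#include <string.h>

/* One shadow state per data byte, using the report legend characters. */
enum shadow_state {
	SHADOW_UNALLOCATED   = 'a',
	SHADOW_UNINITIALIZED = 'u',
	SHADOW_INITIALIZED   = 'i',
	SHADOW_FREED         = 'f',
};

#define DEMO_OBJECT_SIZE 16

struct demo_object {
	unsigned char data[DEMO_OBJECT_SIZE];
	unsigned char shadow[DEMO_OBJECT_SIZE];
};

/* Memory exists but has not been handed to anybody yet. */
static void demo_init(struct demo_object *obj)
{
	memset(obj->shadow, SHADOW_UNALLOCATED, sizeof(obj->shadow));
}

/* Allocation: every byte starts out uninitialized. */
static void demo_alloc(struct demo_object *obj)
{
	memset(obj->shadow, SHADOW_UNINITIALIZED, sizeof(obj->shadow));
}

/* Free: every byte is marked freed. */
static void demo_free(struct demo_object *obj)
{
	memset(obj->shadow, SHADOW_FREED, sizeof(obj->shadow));
}

/* A write stores the data and marks the bytes initialized.
 * The caller must keep off + len within DEMO_OBJECT_SIZE. */
static void demo_write(struct demo_object *obj, size_t off,
		       const void *src, size_t len)
{
	memcpy(obj->data + off, src, len);
	memset(obj->shadow + off, SHADOW_INITIALIZED, len);
}

/* Returns 1 if all bytes in the range are initialized, 0 if a warning is due. */
static int demo_read_ok(const struct demo_object *obj, size_t off, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (obj->shadow[off + i] != SHADOW_INITIALIZED)
			return 0;
	return 1;
}
```

In this model, marking a whole object as initialized at allocation time
(the effect of the NOTRACK flags described above) would simply mean using
SHADOW_INITIALIZED in demo_alloc().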
|
||||
|
||||
Currently, the SLAB and SLUB allocators are supported by kmemcheck.
|
@ -507,9 +507,9 @@ http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)
|
||||
Appendix A: The kprobes debugfs interface
|
||||
|
||||
With recent kernels (> 2.6.20) the list of registered kprobes is visible
|
||||
under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug).
|
||||
under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at /sys/kernel/debug).
|
||||
|
||||
/debug/kprobes/list: Lists all registered probes on the system
|
||||
/sys/kernel/debug/kprobes/list: Lists all registered probes on the system
|
||||
|
||||
c015d71a k vfs_read+0x0
|
||||
c011a316 j do_fork+0x0
|
||||
@ -525,7 +525,7 @@ virtual addresses that correspond to modules that've been unloaded),
|
||||
such probes are marked with [GONE]. If the probe is temporarily disabled,
|
||||
such probes are marked with [DISABLED].
|
||||
|
||||
/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
|
||||
/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
|
||||
|
||||
Provides a knob to globally and forcibly turn registered kprobes ON or OFF.
|
||||
By default, all kprobes are enabled. By echoing "0" to this file, all
|
||||
|
@ -920,7 +920,7 @@ The available commands are:
|
||||
echo '<LED number> off' >/proc/acpi/ibm/led
|
||||
echo '<LED number> blink' >/proc/acpi/ibm/led
|
||||
|
||||
The <LED number> range is 0 to 7. The set of LEDs that can be
|
||||
The <LED number> range is 0 to 15. The set of LEDs that can be
|
||||
controlled varies from model to model. Here is the common ThinkPad
|
||||
mapping:
|
||||
|
||||
@ -932,6 +932,11 @@ mapping:
|
||||
5 - UltraBase battery slot
|
||||
6 - (unknown)
|
||||
7 - standby
|
||||
8 - dock status 1
|
||||
9 - dock status 2
|
||||
10, 11 - (unknown)
|
||||
12 - thinkvantage
|
||||
13, 14, 15 - (unknown)
|
||||
|
||||
All of the above can be turned on and off and can be made to blink.
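
A program that drives the procfs interface has to build the same
'<LED number> <action>' strings as the echo commands above. The helper below
is a minimal sketch with hypothetical names; it assumes the three actions
are spelled "on", "off" and "blink" and applies the 0 to 15 range from the
text. It only formats the string; writing it to /proc/acpi/ibm/led is left
to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Highest LED number accepted, per the "0 to 15" range in the text. */
#define DEMO_LED_MAX 15

/*
 * Format a command such as "7 blink" into buf.
 * Returns the string length, or -1 on an invalid LED number, an unknown
 * action, or a buffer that is too small.
 */
static int demo_format_led_command(char *buf, size_t size, int led,
				   const char *action)
{
	int len;

	if (led < 0 || led > DEMO_LED_MAX)
		return -1;
	if (strcmp(action, "on") != 0 && strcmp(action, "off") != 0 &&
	    strcmp(action, "blink") != 0)
		return -1;

	len = snprintf(buf, size, "%d %s", led, action);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* output error or truncated */
	return len;
}
```

Which LED numbers actually do something still varies from model to model,
so a successful format does not imply that the LED exists.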
|
||||
|
||||
@ -940,10 +945,12 @@ sysfs notes:
|
||||
The ThinkPad LED sysfs interface is described in detail by the LED class
|
||||
documentation, in Documentation/leds-class.txt.
|
||||
|
||||
The leds are named (in LED ID order, from 0 to 7):
|
||||
The LEDs are named (in LED ID order, from 0 to 12):
|
||||
"tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt",
|
||||
"tpacpi::dock_active", "tpacpi::bay_active", "tpacpi::dock_batt",
|
||||
"tpacpi::unknown_led", "tpacpi::standby".
|
||||
"tpacpi::unknown_led", "tpacpi::standby", "tpacpi::dock_status1",
|
||||
"tpacpi::dock_status2", "tpacpi::unknown_led2", "tpacpi::unknown_led3",
|
||||
"tpacpi::thinkvantage".
|
||||
|
||||
Due to limitations in the sysfs LED class, if the status of the LED
|
||||
indicators cannot be read due to an error, thinkpad-acpi will report it as
|
||||
@ -958,6 +965,12 @@ ThinkPad indicator LED should blink in hardware accelerated mode, use the
|
||||
"timer" trigger, and leave the delay_on and delay_off parameters set to
|
||||
zero (to request hardware acceleration autodetection).
|
||||
|
||||
LEDs that are known not to exist in a given ThinkPad model are not
|
||||
made available through the sysfs interface. If you have a dock and you
|
||||
notice there are LEDs listed for your ThinkPad that do not exist (and
|
||||
are not in the dock), or if you notice that there are missing LEDs,
|
||||
a report to ibm-acpi-devel@lists.sourceforge.net is appreciated.
|
||||
|
||||
|
||||
ACPI sounds -- /proc/acpi/ibm/beep
|
||||
----------------------------------
|
||||
@ -1156,17 +1169,19 @@ may not be distinct. Later Lenovo models that implement the ACPI
|
||||
display backlight brightness control methods have 16 levels, ranging
|
||||
from 0 to 15.
|
||||
|
||||
There are two interfaces to the firmware for direct brightness control,
|
||||
EC and UCMS (or CMOS). To select which one should be used, use the
|
||||
brightness_mode module parameter: brightness_mode=1 selects EC mode,
|
||||
brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC
|
||||
mode with NVRAM backing (so that brightness changes are remembered
|
||||
across shutdown/reboot).
|
||||
For IBM ThinkPads, there are two interfaces to the firmware for direct
|
||||
brightness control, EC and UCMS (or CMOS). To select which one should be
|
||||
used, use the brightness_mode module parameter: brightness_mode=1 selects
|
||||
EC mode, brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC
|
||||
mode with NVRAM backing (so that brightness changes are remembered across
|
||||
shutdown/reboot).
|
||||
|
||||
The driver tries to select which interface to use from a table of
|
||||
defaults for each ThinkPad model. If it makes a wrong choice, please
|
||||
report this as a bug, so that we can fix it.
|
||||
|
||||
Lenovo ThinkPads only support brightness_mode=2 (UCMS).
|
||||
|
||||
When display backlight brightness controls are available through the
|
||||
standard ACPI interface, it is best to use it instead of this direct
|
||||
ThinkPad-specific interface. The driver will disable its native
|
||||
@ -1254,7 +1269,7 @@ Fan control and monitoring: fan speed, fan enable/disable
|
||||
|
||||
procfs: /proc/acpi/ibm/fan
|
||||
sysfs device attributes: (hwmon "thinkpad") fan1_input, pwm1,
|
||||
pwm1_enable
|
||||
pwm1_enable, fan2_input
|
||||
sysfs hwmon driver attributes: fan_watchdog
|
||||
|
||||
NOTE NOTE NOTE: fan control operations are disabled by default for
|
||||
@ -1267,6 +1282,9 @@ from the hardware registers of the embedded controller. This is known
|
||||
to work on later R, T, X and Z series ThinkPads but may show a bogus
|
||||
value on other models.
|
||||
|
||||
Some Lenovo ThinkPads support a secondary fan. This fan cannot be
|
||||
controlled separately; it shares the main fan control.
|
||||
|
||||
Fan levels:
|
||||
|
||||
Most ThinkPad fans work in "levels" at the firmware interface. Level 0
|
||||
@ -1397,6 +1415,11 @@ hwmon device attribute fan1_input:
|
||||
which can take up to two minutes. May return rubbish on older
|
||||
ThinkPads.
|
||||
|
||||
hwmon device attribute fan2_input:
|
||||
Fan tachometer reading, in RPM, for the secondary fan.
|
||||
Available only on some ThinkPads. If the secondary fan is
|
||||
not installed, will always read 0.
|
||||
|
||||
hwmon driver attribute fan_watchdog:
|
||||
Fan safety watchdog timer interval, in seconds. Minimum is
|
||||
1 second, maximum is 120 seconds. 0 disables the watchdog.
|
||||
@ -1555,3 +1578,7 @@ Sysfs interface changelog:
|
||||
0x020300: hotkey enable/disable support removed, attributes
|
||||
hotkey_bios_enabled and hotkey_enable deprecated and
|
||||
marked for removal.
|
||||
|
||||
0x020400: Marker for 16 LEDs support. Also, LEDs that are known
|
||||
to not exist in a given model are not registered with
|
||||
the LED sysfs class anymore.
|
||||
|
50
Documentation/leds-lp3944.txt
Normal file
@ -0,0 +1,50 @@
|
||||
Kernel driver lp3944
|
||||
====================
|
||||
|
||||
* National Semiconductor LP3944 Fun-light Chip
|
||||
Prefix: 'lp3944'
|
||||
Addresses scanned: None (see the Notes section below)
|
||||
Datasheet: Publicly available at the National Semiconductor website
|
||||
http://www.national.com/pf/LP/LP3944.html
|
||||
|
||||
Authors:
|
||||
Antonio Ospite <ospite@studenti.unina.it>
|
||||
|
||||
|
||||
Description
|
||||
-----------
|
||||
The LP3944 is a helper chip that can drive up to 8 leds, with two programmable
|
||||
DIM modes; it could even be used as a gpio expander but this driver assumes it
|
||||
is used as a led controller.
|
||||
|
||||
The DIM modes are used to set _blink_ patterns for leds; the pattern is
|
||||
specified by supplying two parameters:
|
||||
- period: from 0s to 1.6s
|
||||
- duty cycle: percentage of the period the led is on, from 0 to 100
|
||||
|
||||
Setting a led in DIM0 or DIM1 mode makes it blink according to the pattern.
|
||||
See the datasheet for details.
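
As an illustration of the two parameters, a blink pattern can be described
and range-checked as below. This is a sketch only: the struct and function
names are hypothetical and are not the driver's API, and the period is
expressed in milliseconds (0 to 1600) for convenience.

```c
/* Hypothetical description of a DIM blink pattern; not the driver's API. */
struct demo_blink_pattern {
	unsigned int period_ms;		/* 0 .. 1600 (0s to 1.6s) */
	unsigned int duty_percent;	/* 0 .. 100 */
};

/* Returns 1 if both parameters are inside the documented ranges. */
static int demo_blink_pattern_valid(const struct demo_blink_pattern *p)
{
	return p->period_ms <= 1600 && p->duty_percent <= 100;
}

/* Time the led is on during one period, in milliseconds. */
static unsigned int demo_on_time_ms(const struct demo_blink_pattern *p)
{
	return p->period_ms * p->duty_percent / 100;
}
```

For example, a 1000 ms period with a 25% duty cycle keeps the led on for
250 ms out of every second.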
|
||||
|
||||
LP3944 can be found on Motorola A910 smartphone, where it drives the rgb
|
||||
leds, the camera flash light and the lcds power.
|
||||
|
||||
|
||||
Notes
|
||||
-----
|
||||
The chip is used mainly in embedded contexts, so this driver expects it to be
|
||||
registered using the i2c_board_info mechanism.
|
||||
|
||||
To register the chip at address 0x60 on adapter 0, set the platform data
|
||||
according to include/linux/leds-lp3944.h, set the i2c board info:
|
||||
|
||||
static struct i2c_board_info __initdata a910_i2c_board_info[] = {
|
||||
{
|
||||
I2C_BOARD_INFO("lp3944", 0x60),
|
||||
.platform_data = &a910_lp3944_leds,
|
||||
},
|
||||
};
|
||||
|
||||
and register it in the platform init function
|
||||
|
||||
i2c_register_board_info(0, a910_i2c_board_info,
|
||||
ARRAY_SIZE(a910_i2c_board_info));
|
@ -36,10 +36,15 @@ This file contains
|
||||
6.2 local loopback of sent frames
|
||||
6.3 CAN controller hardware filters
|
||||
6.4 The virtual CAN driver (vcan)
|
||||
6.5 currently supported CAN hardware
|
||||
6.6 todo
|
||||
6.5 The CAN network device driver interface
|
||||
6.5.1 Netlink interface to set/get devices properties
|
||||
6.5.2 Setting the CAN bit-timing
|
||||
6.5.3 Starting and stopping the CAN network device
|
||||
6.6 supported CAN hardware
|
||||
|
||||
7 Credits
|
||||
7 Socket CAN resources
|
||||
|
||||
8 Credits
|
||||
|
||||
============================================================================
|
||||
|
||||
@ -234,6 +239,8 @@ solution for a couple of reasons:
|
||||
the user application using the common CAN filter mechanisms. Inside
|
||||
this filter definition the (interested) type of errors may be
|
||||
selected. The reception of error frames is disabled by default.
|
||||
The format of the CAN error frame is briefly described in the Linux
|
||||
header file "include/linux/can/error.h".
|
||||
|
||||
4. How to use Socket CAN
|
||||
------------------------
|
||||
@ -605,61 +612,213 @@ solution for a couple of reasons:
|
||||
removal of vcan network devices can be managed with the ip(8) tool:
|
||||
|
||||
- Create a virtual CAN network interface:
|
||||
ip link add type vcan
|
||||
$ ip link add type vcan
|
||||
|
||||
- Create a virtual CAN network interface with a specific name 'vcan42':
|
||||
ip link add dev vcan42 type vcan
|
||||
$ ip link add dev vcan42 type vcan
|
||||
|
||||
- Remove a (virtual CAN) network interface 'vcan42':
|
||||
ip link del vcan42
|
||||
$ ip link del vcan42
|
||||
|
||||
The tool 'vcan' from the SocketCAN SVN repository on BerliOS is obsolete.
|
||||
6.5 The CAN network device driver interface
|
||||
|
||||
Virtual CAN network device creation in older Kernels:
|
||||
In Linux Kernel versions < 2.6.24 the vcan driver creates 4 vcan
|
||||
netdevices at module load time by default. This value can be changed
|
||||
with the module parameter 'numdev'. E.g. 'modprobe vcan numdev=8'
|
||||
The CAN network device driver interface provides a generic interface
|
||||
to setup, configure and monitor CAN network devices. The user can then
|
||||
configure the CAN device, like setting the bit-timing parameters, via
|
||||
the netlink interface using the program "ip" from the "IPROUTE2"
|
||||
utility suite. The following chapter describes briefly how to use it.
|
||||
Furthermore, the interface uses a common data structure and exports a
|
||||
set of common functions, which all real CAN network device drivers
|
||||
should use. Please have a look at the SJA1000 or MSCAN driver to
|
||||
understand how to use them. The name of the module is can-dev.ko.
|
||||
|
||||
6.5 currently supported CAN hardware
|
||||
6.5.1 Netlink interface to set/get devices properties
|
||||
|
||||
On the project website http://developer.berlios.de/projects/socketcan
|
||||
there are different drivers available:
|
||||
The CAN device must be configured via netlink interface. The supported
|
||||
netlink message types are defined and briefly described in
|
||||
"include/linux/can/netlink.h". CAN link support for the program "ip"
|
||||
of the IPROUTE2 utility suite is available and it can be used as shown
|
||||
below:
|
||||
|
||||
vcan: Virtual CAN interface driver (if no real hardware is available)
|
||||
sja1000: Philips SJA1000 CAN controller (recommended)
|
||||
i82527: Intel i82527 CAN controller
|
||||
mscan: Motorola/Freescale CAN controller (e.g. inside SOC MPC5200)
|
||||
ccan: CCAN controller core (e.g. inside SOC h7202)
|
||||
slcan: For a bunch of CAN adaptors that are attached via a
|
||||
serial line ASCII protocol (for serial / USB adaptors)
|
||||
- Setting CAN device properties:
|
||||
|
||||
Additionally the different CAN adaptors (ISA/PCI/PCMCIA/USB/Parport)
|
||||
from PEAK Systemtechnik support the CAN netdevice driver model
|
||||
since Linux driver v6.0: http://www.peak-system.com/linux/index.htm
|
||||
$ ip link set can0 type can help
|
||||
Usage: ip link set DEVICE type can
|
||||
[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
|
||||
[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
|
||||
phase-seg2 PHASE-SEG2 [ sjw SJW ] ]
|
||||
|
||||
Please check the Mailing Lists on the berlios OSS project website.
|
||||
[ loopback { on | off } ]
|
||||
[ listen-only { on | off } ]
|
||||
[ triple-sampling { on | off } ]
|
||||
|
||||
6.6 todo
|
||||
[ restart-ms TIME-MS ]
|
||||
[ restart ]
|
||||
|
||||
The configuration interface for CAN network drivers is still an open
|
||||
issue that has not been finalized in the socketcan project. Also the
|
||||
idea of having a library module (candev.ko) that holds functions
|
||||
that are needed by all CAN netdevices is not ready to ship.
|
||||
Your contribution is welcome.
|
||||
Where: BITRATE := { 1..1000000 }
|
||||
SAMPLE-POINT := { 0.000..0.999 }
|
||||
TQ := { NUMBER }
|
||||
PROP-SEG := { 1..8 }
|
||||
PHASE-SEG1 := { 1..8 }
|
||||
PHASE-SEG2 := { 1..8 }
|
||||
SJW := { 1..4 }
|
||||
RESTART-MS := { 0 | NUMBER }
|
||||
|
||||
7. Credits
|
||||
- Display CAN device details and statistics:
|
||||
|
||||
$ ip -details -statistics link show can0
|
||||
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP qlen 10
|
||||
link/can
|
||||
can <TRIPLE-SAMPLING> state ERROR-ACTIVE restart-ms 100
|
||||
bitrate 125000 sample_point 0.875
|
||||
tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
|
||||
sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
clock 8000000
|
||||
re-started bus-errors arbit-lost error-warn error-pass bus-off
|
||||
41 17457 0 41 42 41
|
||||
RX: bytes packets errors dropped overrun mcast
|
||||
140859 17608 17457 0 0 0
|
||||
TX: bytes packets errors dropped carrier collsns
|
||||
861 112 0 41 0 0
|
||||
|
||||
More info on the above output:
|
||||
|
||||
"<TRIPLE-SAMPLING>"
|
||||
Shows the list of selected CAN controller modes: LOOPBACK,
|
||||
LISTEN-ONLY, or TRIPLE-SAMPLING.
|
||||
|
||||
"state ERROR-ACTIVE"
|
||||
The current state of the CAN controller: "ERROR-ACTIVE",
|
||||
"ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF" or "STOPPED"
|
||||
|
||||
"restart-ms 100"
|
||||
Automatic restart delay time. If set to a non-zero value, a
|
||||
restart of the CAN controller will be triggered automatically
|
||||
in case of a bus-off condition after the specified delay time
|
||||
in milliseconds. By default it's off.
|
||||
|
||||
"bitrate 125000 sample_point 0.875"
|
||||
Shows the real bit-rate in bits/sec and the sample-point in the
|
||||
range 0.000..0.999. If the calculation of bit-timing parameters
|
||||
is enabled in the kernel (CONFIG_CAN_CALC_BITTIMING=y), the
|
||||
bit-timing can be defined by setting the "bitrate" argument.
|
||||
Optionally the "sample-point" can be specified. By default it's
|
||||
0.000 assuming CIA-recommended sample-points.
|
||||
|
||||
"tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1"
|
||||
Shows the time quanta in ns, propagation segment, phase buffer
|
||||
segment 1 and 2 and the synchronisation jump width in units of
|
||||
tq. They allow defining the CAN bit-timing in a hardware
|
||||
independent format as proposed by the Bosch CAN 2.0 spec (see
|
||||
chapter 8 of http://www.semiconductors.bosch.de/pdf/can2spec.pdf).
|
||||
|
||||
"sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
clock 8000000"
|
||||
Shows the bit-timing constants of the CAN controller, here the
|
||||
"sja1000". The minimum and maximum values of the time segment 1
|
||||
and 2, the synchronisation jump width in units of tq, the
|
||||
bitrate pre-scaler and the CAN system clock frequency in Hz.
|
||||
These constants could be used for user-defined (non-standard)
|
||||
bit-timing calculation algorithms in user-space.
|
||||
|
||||
"re-started bus-errors arbit-lost error-warn error-pass bus-off"
|
||||
Shows the number of restarts, bus and arbitration lost errors,
|
||||
and the state changes to the error-warning, error-passive and
|
||||
bus-off state. RX overrun errors are listed in the "overrun"
|
||||
field of the standard network statistics.
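
The relation between the segment lengths and the reported sample point can
be checked with a few lines of C. This is a sketch with hypothetical names
(it does not use the kernel's data structures) and it relies on the
standard CAN bit layout, in which a fixed synchronisation segment of 1 tq
precedes the propagation and phase buffer segments and the bit is sampled
at the end of phase buffer segment 1.

```c
/* Segment lengths in units of tq; hypothetical type for illustration. */
struct demo_can_segments {
	unsigned int prop_seg;
	unsigned int phase_seg1;
	unsigned int phase_seg2;
};

/* Number of time quanta per bit, including the 1 tq sync segment. */
static unsigned int demo_tq_per_bit(const struct demo_can_segments *seg)
{
	return 1 + seg->prop_seg + seg->phase_seg1 + seg->phase_seg2;
}

/* Sample point in tenths of a percent, e.g. 875 means 0.875. */
static unsigned int demo_sample_point_permille(const struct demo_can_segments *seg)
{
	unsigned int before_sample = 1 + seg->prop_seg + seg->phase_seg1;

	return 1000 * before_sample / demo_tq_per_bit(seg);
}
```

With prop-seg 6, phase-seg1 7 and phase-seg2 2 this gives 16 tq per bit and
a sample point of 875, i.e. the 0.875 shown in the output above.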
|
||||
|
||||
6.5.2 Setting the CAN bit-timing
|
||||
|
||||
The CAN bit-timing parameters can always be defined in a hardware
|
||||
independent format as proposed in the Bosch CAN 2.0 specification
|
||||
by specifying the arguments "tq", "prop-seg", "phase-seg1", "phase-seg2"
|
||||
and "sjw":
|
||||
|
||||
$ ip link set canX type can tq 125 prop-seg 6 \
|
||||
phase-seg1 7 phase-seg2 2 sjw 1
|
||||
|
||||
If the kernel option CONFIG_CAN_CALC_BITTIMING is enabled, CIA
|
||||
recommended CAN bit-timing parameters will be calculated if the bit-
|
||||
rate is specified with the argument "bitrate":
|
||||
|
||||
$ ip link set canX type can bitrate 125000
|
||||
|
||||
Note that this works fine for the most common CAN controllers with
|
||||
standard bit-rates but may *fail* for exotic bit-rates or CAN system
|
||||
clock frequencies. Disabling CONFIG_CAN_CALC_BITTIMING saves some
|
||||
space and allows user-space tools to solely determine and set the
|
||||
bit-timing parameters. The CAN controller specific bit-timing
|
||||
constants can be used for that purpose. They are listed by the
|
||||
following command:
|
||||
|
||||
$ ip -details link show can0
|
||||
...
|
||||
sja1000: clock 8000000 tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
|
||||
|
||||
6.5.3 Starting and stopping the CAN network device
|
||||
|
||||
A CAN network device is started or stopped as usual with the command
|
||||
"ifconfig canX up/down" or "ip link set canX up/down". Be aware that
|
||||
you *must* define proper bit-timing parameters for real CAN devices
|
||||
before you can start it to avoid error-prone default settings:
|
||||
|
||||
$ ip link set canX up type can bitrate 125000
|
||||
|
||||
A device may enter the "bus-off" state if too many errors occurred on
|
||||
the CAN bus. Then no more messages are received or sent. An automatic
|
||||
bus-off recovery can be enabled by setting the "restart-ms" to a
|
||||
non-zero value, e.g.:
|
||||
|
||||
$ ip link set canX type can restart-ms 100
|
||||
|
||||
Alternatively, the application may detect the "bus-off" condition
|
||||
by monitoring CAN error frames and do a restart when appropriate with
|
||||
the command:
|
||||
|
||||
$ ip link set canX type can restart
|
||||
|
||||
Note that a restart will also create a CAN error frame (see also
|
||||
chapter 3.4).
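
An application-side check for the "bus-off" condition might look like the sketch below. It relies on the error frame definitions in linux/can/error.h and on the CAN_RAW_ERR_FILTER socket option; the two function names are hypothetical, and how the application then triggers the restart is left out.

```c
#include <linux/can.h>
#include <linux/can/error.h>
#include <linux/can/raw.h>
#include <sys/socket.h>

/* Ask a CAN_RAW socket to deliver bus-off and restart error frames. */
int enable_busoff_error_frames(int s)
{
    can_err_mask_t mask = CAN_ERR_BUSOFF | CAN_ERR_RESTARTED;

    return setsockopt(s, SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
                      &mask, sizeof(mask));
}

/* Non-zero if the received frame is an error frame reporting bus-off. */
int is_bus_off(const struct can_frame *f)
{
    return (f->can_id & CAN_ERR_FLAG) && (f->can_id & CAN_ERR_BUSOFF);
}
```

The CAN_ERR_FLAG test matters because the same bit position can occur in the identifier of an ordinary data frame.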
|
||||
|
||||
6.6 Supported CAN hardware
|
||||
|
||||
Please check the "Kconfig" file in "drivers/net/can" to get a current
|
||||
list of the supported CAN hardware. On the Socket CAN project website
|
||||
(see chapter 7) there might be further drivers available, also for
|
||||
older kernel versions.
|
||||
|
||||
7. Socket CAN resources
|
||||
-----------------------
|
||||
|
||||
You can find further resources for Socket CAN like user space tools,
|
||||
support for old kernel versions, more drivers, mailing lists, etc.
|
||||
at the BerliOS OSS project website for Socket CAN:
|
||||
|
||||
http://developer.berlios.de/projects/socketcan
|
||||
|
||||
If you have questions, bug fixes, etc., don't hesitate to post them to
|
||||
the Socketcan-Users mailing list. But please search the archives first.
|
||||
|
||||
8. Credits
|
||||
----------
|
||||
|
||||
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm)
|
||||
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm, SJA1000 driver)
|
||||
Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan)
|
||||
Jan Kizka (RT-SocketCAN core, Socket-API reconciliation)
|
||||
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews)
|
||||
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews,
|
||||
CAN device driver interface, MSCAN driver)
|
||||
Robert Schwebel (design reviews, PTXdist integration)
|
||||
Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers)
|
||||
Benedikt Spranger (reviews)
|
||||
Thomas Gleixner (LKML reviews, coding style, posting hints)
|
||||
Andrey Volkov (kernel subtree structure, ioctls, mscan driver)
|
||||
Andrey Volkov (kernel subtree structure, ioctls, MSCAN driver)
|
||||
Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003)
|
||||
Klaus Hitschler (PEAK driver integration)
|
||||
Uwe Koppe (CAN netdevices with PF_PACKET approach)
|
||||
Michael Schulze (driver layer loopback requirement, RT CAN drivers review)
|
||||
Pavel Pisa (Bit-timing calculation)
|
||||
Sascha Hauer (SJA1000 platform driver)
|
||||
Sebastian Haas (SJA1000 EMS PCI driver)
|
||||
Markus Plessing (SJA1000 EMS PCI driver)
|
||||
Per Dalen (SJA1000 Kvaser PCI driver)
|
||||
Sam Ravnborg (reviews, coding style, kbuild help)
|
||||
|
76
Documentation/networking/ieee802154.txt
Normal file
@ -0,0 +1,76 @@
|
||||
|
||||
Linux IEEE 802.15.4 implementation
|
||||
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
The Linux-ZigBee project goal is to provide a complete implementation
|
||||
of IEEE 802.15.4 / ZigBee / 6LoWPAN protocols. IEEE 802.15.4 is a stack
|
||||
of protocols for organizing Low-Rate Wireless Personal Area Networks.
|
||||
|
||||
Currently only the IEEE 802.15.4 layer is implemented. We have chosen
|
||||
to use plain Berkeley socket API, the generic Linux networking stack
|
||||
to transfer IEEE 802.15.4 messages and a special protocol over genetlink
|
||||
for configuration/management.
|
||||
|
||||
|
||||
Socket API
|
||||
==========
|
||||
|
||||
int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
|
||||
.....
|
||||
|
||||
The address family, socket addresses etc. are defined in the
|
||||
include/net/ieee802154/af_ieee802154.h header or in the special header
|
||||
in our userspace package (see either linux-zigbee sourceforge download page
|
||||
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
|
||||
|
||||
One can use SOCK_RAW for passing raw data towards device xmit function. YMMV.
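
A small wrapper around the socket() call above could look like this. It is a sketch only: the function name is hypothetical, and the fallback value 36 for PF_IEEE802154 is taken from linux/socket.h for C libraries that do not define the constant yet.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef PF_IEEE802154
#define PF_IEEE802154 36 /* AF_IEEE802154 in linux/socket.h */
#endif

/*
 * Open an IEEE 802.15.4 socket. Only the two socket types mentioned
 * in the text are accepted; anything else fails with EINVAL.
 * Returns the descriptor, or -1 with errno set.
 */
int ieee802154_open(int type)
{
    if (type != SOCK_DGRAM && type != SOCK_RAW) {
        errno = EINVAL;
        return -1;
    }
    return socket(PF_IEEE802154, type, 0);
}
```

On a kernel without IEEE 802.15.4 support the socket() call itself is expected to fail, so callers should check the return value before using the descriptor.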
|
||||
|
||||
|
||||
MLME - MAC Level Management
|
||||
============================
|
||||
|
||||
Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands.
|
||||
See the include/net/ieee802154/nl802154.h header. Our userspace tools package
|
||||
(see above) provides a CLI configuration utility for radio interfaces and a simple
|
||||
coordinator for IEEE 802.15.4 networks as example users of the MLME protocol.
|
||||
|
||||
|
||||
Kernel side
|
||||
=============
|
||||
|
||||
Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
|
||||
1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
|
||||
exports MLME and data API.
|
||||
2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
|
||||
possibly with some kinds of acceleration like automatic CRC computation and
|
||||
comparison, automagic ACK handling, address matching, etc.
|
||||
|
||||
These types of devices require different approaches to be hooked into the Linux kernel.
|
||||
|
||||
|
||||
HardMAC
|
||||
=======
|
||||
|
||||
See the header include/net/ieee802154/netdevice.h. You have to implement Linux
|
||||
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
|
||||
code via plain sk_buffs. The control block of sk_buffs will contain additional
|
||||
info as described in the struct ieee802154_mac_cb.
|
||||
|
||||
To hook the MLME interface you have to populate the ml_priv field of your
|
||||
net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are
|
||||
required.
|
||||
|
||||
We provide an example of a simple HardMAC driver at drivers/ieee802154/fakehard.c
|
||||
|
||||
|
||||
SoftMAC
|
||||
=======
|
||||
|
||||
We are going to provide an intermediate layer implementing IEEE 802.15.4 MAC
|
||||
in software. This is currently WIP.
|
||||
|
||||
See header include/net/ieee802154/mac802154.h and several drivers in
|
||||
drivers/ieee802154/
|
@ -168,7 +168,16 @@ tcp_dsack - BOOLEAN
|
||||
Allows TCP to send "duplicate" SACKs.
|
||||
|
||||
tcp_ecn - BOOLEAN
|
||||
Enable Explicit Congestion Notification in TCP.
|
||||
Enable Explicit Congestion Notification (ECN) in TCP. ECN is only
|
||||
used when both ends of the TCP flow support it. It is useful to
|
||||
avoid losses due to congestion (when the bottleneck router supports
|
||||
ECN).
|
||||
Possible values are:
|
||||
0 disable ECN
|
||||
1 ECN enabled
|
||||
2 Only server-side ECN enabled. If the other end does
|
||||
not support ECN, behavior is like with ECN disabled.
|
||||
Default: 2
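
As an illustration, a program could read the setting from /proc/sys/net/ipv4/tcp_ecn and map it to the three documented meanings. The helper names below are hypothetical and the sketch does no more than that.

```c
#include <stdio.h>
#include <string.h>

/* Map a tcp_ecn value to the meaning documented above. */
const char *tcp_ecn_describe(int value)
{
    switch (value) {
    case 0:
        return "ECN disabled";
    case 1:
        return "ECN enabled";
    case 2:
        return "server-side ECN only";
    default:
        return "unknown";
    }
}

/* Read an integer sysctl file; returns -1 if it cannot be read. */
int tcp_ecn_read(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would typically use tcp_ecn_describe(tcp_ecn_read("/proc/sys/net/ipv4/tcp_ecn")).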
|
||||
|
||||
tcp_fack - BOOLEAN
|
||||
Enable FACK congestion avoidance and fast retransmission.
|
||||
@ -1048,6 +1057,13 @@ disable_ipv6 - BOOLEAN
|
||||
address.
|
||||
Default: FALSE (enable IPv6 operation)
|
||||
|
||||
When this value is changed from 1 to 0 (IPv6 is being enabled),
|
||||
it will dynamically create a link-local address on the given
|
||||
interface and start Duplicate Address Detection, if necessary.
|
||||
|
||||
When this value is changed from 0 to 1 (IPv6 is being disabled),
|
||||
it will dynamically delete all addresses on the given interface.
|
||||
|
||||
accept_dad - INTEGER
|
||||
Whether to accept DAD (Duplicate Address Detection).
|
||||
0: Disable DAD
|
||||
|
@ -33,3 +33,40 @@ disable
|
||||
|
||||
A reboot is required to enable IPv6.
|
||||
|
||||
autoconf
|
||||
|
||||
Specifies whether to enable IPv6 address autoconfiguration
|
||||
on all interfaces. This might be used when one does not wish
|
||||
for addresses to be automatically generated from prefixes
|
||||
received in Router Advertisements.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 address autoconfiguration is disabled on all interfaces.
|
||||
|
||||
Only the IPv6 loopback address (::1) and link-local addresses
|
||||
will be added to interfaces.
|
||||
|
||||
1
|
||||
IPv6 address autoconfiguration is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
disable_ipv6
|
||||
|
||||
Specifies whether to disable IPv6 on all interfaces.
|
||||
This might be used when no IPv6 addresses are desired.
|
||||
|
||||
The possible values and their effects are:
|
||||
|
||||
0
|
||||
IPv6 is enabled on all interfaces.
|
||||
|
||||
This is the default value.
|
||||
|
||||
1
|
||||
IPv6 is disabled on all interfaces.
|
||||
|
||||
No IPv6 addresses will be added to interfaces.
|
||||
|
||||
|
@ -12,38 +12,22 @@ following format:
|
||||
The radiotap format is discussed in
|
||||
./Documentation/networking/radiotap-headers.txt.
|
||||
|
||||
Despite 13 radiotap argument types are currently defined, most only make sense
|
||||
Despite many radiotap parameters being currently defined, most only make sense
|
||||
to appear on received packets. The following information is parsed from the
|
||||
radiotap headers and used to control injection:
|
||||
|
||||
* IEEE80211_RADIOTAP_RATE
|
||||
|
||||
rate in 500kbps units, automatic if invalid or not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_ANTENNA
|
||||
|
||||
antenna to use, automatic if not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_DBM_TX_POWER
|
||||
|
||||
transmit power in dBm, automatic if not present
|
||||
|
||||
|
||||
* IEEE80211_RADIOTAP_FLAGS
|
||||
|
||||
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
|
||||
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
|
||||
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
|
||||
current fragmentation threshold. Note that
|
||||
this flag is only reliable when software
|
||||
fragmentation is enabled)
|
||||
current fragmentation threshold.
|
||||
|
||||
|
||||
The injection code can also skip all other currently defined radiotap fields
|
||||
facilitating replay of captured radiotap headers directly.
|
||||
|
||||
Here is an example valid radiotap header defining these three parameters
|
||||
Here is an example valid radiotap header defining some parameters
|
||||
|
||||
0x00, 0x00, // <-- radiotap version
|
||||
0x0b, 0x00, // <- radiotap header length
|
||||
@ -72,8 +56,8 @@ interface), along the following lines:
|
||||
...
|
||||
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
|
||||
|
||||
You can also find sources for a complete inject test applet here:
|
||||
You can also find a link to a complete inject application here:
|
||||
|
||||
http://penumbra.warmcat.com/_twk/tiki-index.php?page=packetspammer
|
||||
http://wireless.kernel.org/en/users/Documentation/packetspammer
|
||||
|
||||
Andy Green <andy@warmcat.com>
|
||||
|
@ -38,9 +38,6 @@ ifinfomsg::if_flags & IFF_LOWER_UP:
|
||||
ifinfomsg::if_flags & IFF_DORMANT:
|
||||
Driver has signaled netif_dormant_on()
|
||||
|
||||
These interface flags can also be queried without netlink using the
|
||||
SIOCGIFFLAGS ioctl.
|
||||
|
||||
TLV IFLA_OPERSTATE
|
||||
|
||||
contains RFC2863 state of the interface in numeric representation:
|
||||
|
@ -4,16 +4,18 @@
|
||||
|
||||
This file documents the CONFIG_PACKET_MMAP option available with the PACKET
|
||||
socket interface on 2.4 and 2.6 kernels. This type of socket is used to
|
||||
capture network traffic with utilities like tcpdump or any other that uses
|
||||
the libpcap library.
|
||||
|
||||
You can find the latest version of this document at
|
||||
capture network traffic with utilities like tcpdump or any other that needs
|
||||
raw access to a network interface.
|
||||
|
||||
You can find the latest version of this document at:
|
||||
http://pusa.uv.es/~ulisses/packet_mmap/
|
||||
|
||||
Please send me your comments to
|
||||
Howto can be found at:
|
||||
http://wiki.gnu-log.net (packet_mmap)
|
||||
|
||||
Please send your comments to
|
||||
Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
|
||||
Johann Baudy <johann.baudy@gnu-log.net>
|
||||
|
||||
-------------------------------------------------------------------------------
|
||||
+ Why use PACKET_MMAP
|
||||
@ -25,19 +27,24 @@ to capture each packet, it requires two if you want to get packet's
|
||||
timestamp (like libpcap always does).
|
||||
|
||||
On the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size
|
||||
configurable circular buffer mapped in user space. This way reading packets just
|
||||
needs to wait for them, most of the time there is no need to issue a single
|
||||
system call. By using a shared buffer between the kernel and the user
|
||||
also has the benefit of minimizing packet copies.
|
||||
configurable circular buffer mapped in user space that can be used to either
|
||||
send or receive packets. This way reading packets just needs to wait for them,
|
||||
most of the time there is no need to issue a single system call. Concerning
|
||||
transmission, multiple packets can be sent through one system call to get the
|
||||
highest bandwidth.
|
||||
Using a shared buffer between the kernel and the user also has the benefit
|
||||
of minimizing packet copies.
|
||||
|
||||
It's fine to use PACKET_MMAP to improve the performance of the capture process,
|
||||
but it isn't everything. At least, if you are capturing at high speeds (this
|
||||
is relative to the cpu speed), you should check if the device driver of your
|
||||
network interface card supports some sort of interrupt load mitigation or
|
||||
(even better) if it supports NAPI, also make sure it is enabled.
|
||||
It's fine to use PACKET_MMAP to improve the performance of the capture and
|
||||
transmission process, but it isn't everything. At least, if you are capturing
|
||||
at high speeds (this is relative to the cpu speed), you should check if the
|
||||
device driver of your network interface card supports some sort of interrupt
|
||||
load mitigation or (even better) if it supports NAPI, also make sure it is
|
||||
enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
|
||||
supported by devices of your network.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP
|
||||
+ How to use CONFIG_PACKET_MMAP to improve capture process
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
From the user standpoint, you should use the higher level libpcap library, which
|
||||
@ -57,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP
|
||||
support.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP directly
|
||||
+ How to use CONFIG_PACKET_MMAP directly to improve capture process
|
||||
--------------------------------------------------------------------------------
|
||||
|
||||
From the system calls stand point, the use of PACKET_MMAP involves
|
||||
@ -66,6 +73,7 @@ the following process:
|
||||
|
||||
[setup] socket() -------> creation of the capture socket
|
||||
setsockopt() ---> allocation of the circular buffer (ring)
|
||||
option: PACKET_RX_RING
|
||||
mmap() ---------> mapping of the allocated buffer to the
|
||||
user process
|
||||
|
||||
@ -96,6 +104,65 @@ Next I will describe PACKET_MMAP settings and it's constraints,
|
||||
also the mapping of the circular buffer in the user process and
|
||||
the use of this buffer.
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ How to use CONFIG_PACKET_MMAP directly to improve transmission process
|
||||
--------------------------------------------------------------------------------
|
||||
Transmission process is similar to capture as shown below.
|
||||
|
||||
[setup] socket() -------> creation of the transmission socket
|
||||
setsockopt() ---> allocation of the circular buffer (ring)
|
||||
option: PACKET_TX_RING
|
||||
bind() ---------> bind transmission socket with a network interface
|
||||
mmap() ---------> mapping of the allocated buffer to the
|
||||
user process
|
||||
|
||||
[transmission] poll() ---------> wait for free packets (optional)
|
||||
send() ---------> send all packets that are set as ready in
|
||||
the ring
|
||||
The flag MSG_DONTWAIT can be used to return
|
||||
before end of transfer.
|
||||
|
||||
[shutdown] close() --------> destruction of the transmission socket and
|
||||
deallocation of all associated resources.
|
||||
|
||||
Binding the socket to your network interface is mandatory (with zero copy) to
|
||||
know the header size of frames used in the circular buffer.
|
||||
|
||||
As with capture, each frame contains two parts:
|
||||
|
||||
--------------------
|
||||
| struct tpacket_hdr | Header. It contains the status of
|
||||
| | this frame
|
||||
|--------------------|
|
||||
| data buffer |
|
||||
. . Data that will be sent over the network interface.
|
||||
. .
|
||||
--------------------
|
||||
|
||||
bind() associates the socket to your network interface thanks to
|
||||
sll_ifindex parameter of struct sockaddr_ll.
|
||||
|
||||
Initialization example:
|
||||
|
||||
struct sockaddr_ll my_addr;
|
||||
struct ifreq s_ifr;
|
||||
...
|
||||
|
||||
strncpy (s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name));
|
||||
|
||||
/* get interface index of eth0 */
|
||||
ioctl(this->socket, SIOCGIFINDEX, &s_ifr);
|
||||
|
||||
/* fill sockaddr_ll struct to prepare binding */
|
||||
my_addr.sll_family = AF_PACKET;
|
||||
my_addr.sll_protocol = htons(ETH_P_ALL);
|
||||
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
|
||||
|
||||
/* bind socket to eth0 */
|
||||
bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));
|
||||
|
||||
A complete tutorial is available at: http://wiki.gnu-log.net/
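
The setup steps listed in this section (socket, PACKET_TX_RING, bind, mmap) can be put together roughly as in the following sketch. The function names and the ring geometry are illustrative assumptions, the call needs the privileges required for packet sockets, and error handling is reduced to the minimum.

```c
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a ring request; tp_frame_nr follows from the other three values. */
struct tpacket_req make_ring_req(unsigned int block_size,
                                 unsigned int block_nr,
                                 unsigned int frame_size)
{
    struct tpacket_req req;

    memset(&req, 0, sizeof(req));
    req.tp_block_size = block_size;
    req.tp_block_nr = block_nr;
    req.tp_frame_size = frame_size;
    req.tp_frame_nr = (block_size / frame_size) * block_nr;
    return req;
}

/* Returns the mapped ring and stores the socket in *fd_out, or NULL. */
void *tx_ring_setup(const char *ifname, const struct tpacket_req *req,
                    int *fd_out)
{
    struct sockaddr_ll addr;
    void *ring;
    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

    if (fd < 0)
        return NULL;

    /* Allocate the circular buffer in the kernel. */
    if (setsockopt(fd, SOL_PACKET, PACKET_TX_RING, req, sizeof(*req)) < 0)
        goto fail;

    /* Bind to the interface so the kernel knows the frame header size. */
    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ETH_P_ALL);
    addr.sll_ifindex = (int)if_nametoindex(ifname);
    if (addr.sll_ifindex == 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;

    /* Map the whole ring into this process. */
    ring = mmap(NULL, (size_t)req->tp_block_size * req->tp_block_nr,
                PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ring == MAP_FAILED)
        goto fail;

    *fd_out = fd;
    return ring;

fail:
    close(fd);
    return NULL;
}
```

For example, make_ring_req(4096, 4, 2048) describes four blocks of one 4096-byte page each, holding eight frames in total.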
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ PACKET_MMAP settings
|
||||
--------------------------------------------------------------------------------
|
||||
@ -103,7 +170,10 @@ the use of this buffer.
|
||||
|
||||
Setting up PACKET_MMAP from user-level code is done with a call like
|
||||
|
||||
- Capture process
|
||||
setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req))
|
||||
- Transmission process
|
||||
setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req))
|
||||
|
||||
The most significant argument in the previous call is the req parameter,
|
||||
this parameter must have the following structure:
|
||||
@ -117,11 +187,11 @@ this parameter must to have the following structure:
|
||||
};
|
||||
|
||||
This structure is defined in /usr/include/linux/if_packet.h and establishes a
|
||||
circular buffer (ring) of unswappable memory mapped in the capture process.
|
||||
circular buffer (ring) of unswappable memory.
|
||||
Being mapped in the capture process allows reading the captured frames and
|
||||
related meta-information like timestamps without requiring a system call.
|
||||
|
||||
Captured frames are grouped in blocks. Each block is a physically contiguous
|
||||
Frames are grouped in blocks. Each block is a physically contiguous
|
||||
region of memory and holds tp_block_size/tp_frame_size frames. The total number
|
||||
of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because
|
||||
|
||||
@ -336,6 +406,7 @@ struct tpacket_hdr). If this field is 0 means that the frame is ready
|
||||
to be used by the kernel. If not, there is a frame the user can read
|
||||
and the following flags apply:
|
||||
|
||||
+++ Capture process:
|
||||
from include/linux/if_packet.h
|
||||
|
||||
#define TP_STATUS_COPY 2
|
||||
@ -391,6 +462,37 @@ packets are in the ring:
|
||||
It doesn't incur a race condition to first check the status value and
|
||||
then poll for frames.
|
||||
|
||||
|
||||
++ Transmission process
|
||||
These defines are also used for transmission:
|
||||
|
||||
#define TP_STATUS_AVAILABLE 0 // Frame is available
|
||||
#define TP_STATUS_SEND_REQUEST 1 // Frame will be sent on next send()
|
||||
#define TP_STATUS_SENDING 2 // Frame is currently in transmission
|
||||
#define TP_STATUS_WRONG_FORMAT 4 // Frame format is not correct
|
||||
|
||||
First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a
|
||||
packet, the user fills a data buffer of an available frame, sets tp_len to
|
||||
current data buffer size and sets its status field to TP_STATUS_SEND_REQUEST.
|
||||
This can be done on multiple frames. Once the user is ready to transmit, it
|
||||
calls send(). Then all buffers with status equal to TP_STATUS_SEND_REQUEST are
|
||||
forwarded to the network device. The kernel updates each status of sent
|
||||
frames with TP_STATUS_SENDING until the end of transfer.
|
||||
At the end of each transfer, buffer status returns to TP_STATUS_AVAILABLE.
|
||||
|
||||
header->tp_len = in_i_size;
|
||||
header->tp_status = TP_STATUS_SEND_REQUEST;
|
||||
retval = send(this->socket, NULL, 0, 0);
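
The per-frame steps described above can be wrapped in a helper like the one sketched here. The function name is hypothetical, and the payload offset (the aligned size of struct tpacket_hdr) is an assumption that is not stated in this text, so it should be checked against the kernel version in use.

```c
#include <linux/if_packet.h>
#include <stddef.h>
#include <string.h>

/*
 * Queue one packet in a TX ring frame.
 * Returns 0 on success, -1 if the frame is still owned by the kernel
 * or the payload does not fit.
 */
int tx_ring_queue(void *frame, size_t frame_size,
                  const void *data, size_t len)
{
    struct tpacket_hdr *hdr = frame;
    /* Assumed payload offset: right after the aligned header. */
    size_t off = TPACKET_ALIGN(sizeof(struct tpacket_hdr));

    if (hdr->tp_status != TP_STATUS_AVAILABLE)
        return -1;
    if (frame_size < off || len > frame_size - off)
        return -1;

    memcpy((char *)frame + off, data, len);
    hdr->tp_len = (unsigned int)len;
    /* On a real ring a memory barrier before this store may be needed. */
    hdr->tp_status = TP_STATUS_SEND_REQUEST;
    return 0;
}
```

Several frames can be queued this way before a single send() call hands all of them to the network device.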
|
||||
|
||||
The user can also use poll() to check if a buffer is available:
|
||||
(status == TP_STATUS_SENDING)
|
||||
|
||||
struct pollfd pfd;
|
||||
pfd.fd = fd;
|
||||
pfd.revents = 0;
|
||||
pfd.events = POLLOUT;
|
||||
retval = poll(&pfd, 1, timeout);
|
||||
|
||||
--------------------------------------------------------------------------------
|
||||
+ THANKS
|
||||
--------------------------------------------------------------------------------
|
||||
|
File diff suppressed because it is too large
148
Documentation/powerpc/dts-bindings/4xx/emac.txt
Normal file
@ -0,0 +1,148 @@
|
||||
4xx/Axon EMAC ethernet nodes
|
||||
|
||||
The EMAC ethernet controller in IBM and AMCC 4xx chips, and also
|
||||
the Axon bridge. To operate, this needs to interact with the
|
||||
special McMAL DMA controller, and sometimes an RGMII or ZMII
|
||||
interface. In addition to the nodes and properties described
|
||||
below, the node for the OPB bus on which the EMAC sits must have a
|
||||
correct clock-frequency property.
|
||||
|
||||
i) The EMAC node itself
|
||||
|
||||
Required properties:
|
||||
- device_type : "network"
|
||||
|
||||
- compatible : compatible list, contains 2 entries, first is
|
||||
"ibm,emac-CHIP" where CHIP is the host ASIC (440gx,
|
||||
405gp, Axon) and second is either "ibm,emac" or
|
||||
"ibm,emac4". For Axon, thus, we have: "ibm,emac-axon",
|
||||
"ibm,emac4"
|
||||
- interrupts : <interrupt mapping for EMAC IRQ and WOL IRQ>
|
||||
- interrupt-parent : optional, if needed for interrupt mapping
|
||||
- reg : <registers mapping>
|
||||
- local-mac-address : 6 bytes, MAC address
|
||||
- mal-device : phandle of the associated McMAL node
|
||||
- mal-tx-channel : 1 cell, index of the tx channel on McMAL associated
|
||||
with this EMAC
|
||||
- mal-rx-channel : 1 cell, index of the rx channel on McMAL associated
|
||||
with this EMAC
|
||||
- cell-index : 1 cell, hardware index of the EMAC cell on a given
|
||||
ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on
|
||||
each Axon chip)
|
||||
- max-frame-size : 1 cell, maximum frame size supported in bytes
|
||||
- rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec
|
||||
operations.
|
||||
For Axon, 2048
|
||||
- tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec
|
||||
operations.
|
||||
For Axon, 2048.
|
||||
- fifo-entry-size : 1 cell, size of a fifo entry (used to calculate
|
||||
thresholds).
|
||||
For Axon, 0x00000010
|
||||
- mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds)
|
||||
in bytes.
|
||||
For Axon, 0x00000100 (I think ...)
|
||||
- phy-mode : string, mode of operations of the PHY interface.
|
||||
Supported values are: "mii", "rmii", "smii", "rgmii",
|
||||
"tbi", "gmii", "rtbi", "sgmii".
|
||||
For Axon on CAB, it is "rgmii"
|
||||
- mdio-device : 1 cell, required iff using shared MDIO registers
|
||||
(440EP). phandle of the EMAC to use to drive the
|
||||
MDIO lines for the PHY used by this EMAC.
|
||||
- zmii-device : 1 cell, required iff connected to a ZMII. phandle of
|
||||
the ZMII device node
|
||||
- zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII
|
||||
channel or 0xffffffff if ZMII is only used for MDIO.
|
||||
- rgmii-device : 1 cell, required iff connected to an RGMII. phandle
|
||||
of the RGMII device node.
|
||||
For Axon: phandle of plb5/plb4/opb/rgmii
|
||||
- rgmii-channel : 1 cell, required iff connected to an RGMII. Which
|
||||
RGMII channel is used by this EMAC.
|
||||
For Axon: present, whatever value is appropriate for each
|
||||
EMAC, that is the content of the current (bogus) "phy-port"
|
||||
property.
|
||||
|
||||
Optional properties:
|
||||
- phy-address : 1 cell, optional, MDIO address of the PHY. If absent,
|
||||
a search is performed.
|
||||
- phy-map : 1 cell, optional, bitmap of addresses to probe the PHY
|
||||
for, used if phy-address is absent. bit 0x00000001 is
|
||||
MDIO address 0.
|
||||
For Axon it can be absent, though my current driver
|
||||
doesn't handle phy-address yet so for now, keep
|
||||
0x00ffffff in it.
|
||||
- rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec
|
||||
operations (if absent the value is the same as
|
||||
rx-fifo-size). For Axon, either absent or 2048.
|
||||
- tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec
|
||||
operations (if absent the value is the same as
|
||||
tx-fifo-size). For Axon, either absent or 2048.
|
||||
- tah-device : 1 cell, optional. If connected to a TAH engine for
|
||||
offload, phandle of the TAH device node.
|
||||
- tah-channel : 1 cell, optional. If appropriate, channel used on the
|
||||
TAH engine.
|
||||
|
||||
Example:
|
||||
|
||||
EMAC0: ethernet@40000800 {
|
||||
device_type = "network";
|
||||
compatible = "ibm,emac-440gp", "ibm,emac";
|
||||
interrupt-parent = <&UIC1>;
|
||||
interrupts = <1c 4 1d 4>;
|
||||
reg = <40000800 70>;
|
||||
local-mac-address = [00 04 AC E3 1B 1E];
|
||||
mal-device = <&MAL0>;
|
||||
mal-tx-channel = <0 1>;
|
||||
mal-rx-channel = <0>;
|
||||
cell-index = <0>;
|
||||
max-frame-size = <5dc>;
|
||||
rx-fifo-size = <1000>;
|
||||
tx-fifo-size = <800>;
|
||||
phy-mode = "rmii";
|
||||
phy-map = <00000001>;
|
||||
zmii-device = <&ZMII0>;
|
||||
zmii-channel = <0>;
|
||||
};
|
||||
|
||||
ii) McMAL node
|
||||
|
||||
Required properties:
|
||||
- device_type : "dma-controller"
|
||||
- compatible : compatible list, containing 2 entries, first is
|
||||
"ibm,mcmal-CHIP" where CHIP is the host ASIC (like
|
||||
emac) and the second is either "ibm,mcmal" or
|
||||
"ibm,mcmal2".
|
||||
For Axon, "ibm,mcmal-axon","ibm,mcmal2"
|
||||
- interrupts : <interrupt mapping for the MAL interrupts sources:
|
||||
5 sources: tx_eob, rx_eob, serr, txde, rxde>.
|
||||
For Axon: This is _different_ from the current
|
||||
firmware. We use the "delayed" interrupts for txeob
|
||||
and rxeob. Thus we end up with mapping those 5 MPIC
|
||||
interrupts, all level positive sensitive: 10, 11, 32,
|
||||
33, 34 (in decimal)
|
||||
- dcr-reg : < DCR registers range >
|
||||
- dcr-parent : if needed for dcr-reg
|
||||
- num-tx-chans : 1 cell, number of Tx channels
|
||||
- num-rx-chans : 1 cell, number of Rx channels
|
||||
|
||||
iii) ZMII node
|
||||
|
||||
Required properties:
|
||||
- compatible : compatible list, containing 2 entries, first is
|
||||
"ibm,zmii-CHIP" where CHIP is the host ASIC (like
|
||||
EMAC) and the second is "ibm,zmii".
|
||||
For Axon, there is no ZMII node.
|
||||
- reg : <registers mapping>
|
||||
|
||||
iv) RGMII node
|
||||
|
||||
Required properties:
|
||||
- compatible : compatible list, containing 2 entries, first is
|
||||
"ibm,rgmii-CHIP" where CHIP is the host ASIC (like
|
||||
EMAC) and the second is "ibm,rgmii".
|
||||
For Axon, "ibm,rgmii-axon","ibm,rgmii"
|
||||
- reg : <registers mapping>
|
||||
- revision : as provided by the RGMII new version register if
|
||||
available.
|
||||
For Axon: 0x0000012a
|
||||
|
53
Documentation/powerpc/dts-bindings/can/sja1000.txt
Normal file
@ -0,0 +1,53 @@
|
||||
Memory mapped SJA1000 CAN controller from NXP (formerly Philips)
|
||||
|
||||
Required properties:
|
||||
|
||||
- compatible : should be "nxp,sja1000".
|
||||
|
||||
- reg : should specify the chip select, address offset and size required
|
||||
to map the registers of the SJA1000. The size is usually 0x80.
|
||||
|
||||
- interrupts: property with a value describing the interrupt source
|
||||
(number and sensitivity) required for the SJA1000.
|
||||
|
||||
Optional properties:
|
||||
|
||||
- nxp,external-clock-frequency : Frequency of the external oscillator
|
||||
clock in Hz. Note that the internal clock frequency used by the
|
||||
SJA1000 is half of that value. If not specified, a default value
|
||||
of 16000000 (16 MHz) is used.
|
||||
|
||||
- nxp,tx-output-mode : operation mode of the TX output control logic:
|
||||
<0x0> : bi-phase output mode
|
||||
<0x1> : normal output mode (default)
|
||||
<0x2> : test output mode
|
||||
<0x3> : clock output mode
|
||||
|
||||
- nxp,tx-output-config : TX output pin configuration:
|
||||
<0x01> : TX0 invert
|
||||
<0x02> : TX0 pull-down (default)
|
||||
<0x04> : TX0 pull-up
|
||||
<0x06> : TX0 push-pull
|
||||
<0x08> : TX1 invert
|
||||
<0x10> : TX1 pull-down
|
||||
<0x20> : TX1 pull-up
|
||||
<0x30> : TX1 push-pull
|
||||
|
||||
- nxp,clock-out-frequency : clock frequency in Hz on the CLKOUT pin.
|
||||
If not specified or if the specified value is 0, the CLKOUT pin
|
||||
will be disabled.
|
||||
|
||||
- nxp,no-comparator-bypass : Allows disabling the CAN input comparator.
|
||||
|
||||
For further information, please have a look at the SJA1000 data sheet.
|
||||
|
||||
Examples:
|
||||
|
||||
can@3,100 {
|
||||
compatible = "nxp,sja1000";
|
||||
reg = <3 0x100 0x80>;
|
||||
interrupts = <2 0>;
|
||||
interrupt-parent = <&mpic>;
|
||||
nxp,external-clock-frequency = <16000000>;
|
||||
};
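
How a driver might interpret two of these properties is sketched below. The helper names are hypothetical, and treating a value of 0 as "property not specified" is an assumption made only for this illustration.

```c
#include <stdint.h>

#define SJA1000_DEFAULT_EXT_CLOCK_HZ 16000000u

/* Bit values of nxp,tx-output-config as listed in the binding. */
enum {
    TX0_INVERT    = 0x01,
    TX0_PULL_DOWN = 0x02,
    TX0_PULL_UP   = 0x04,
    TX0_PUSH_PULL = 0x06,
    TX1_INVERT    = 0x08,
    TX1_PULL_DOWN = 0x10,
    TX1_PULL_UP   = 0x20,
    TX1_PUSH_PULL = 0x30
};

/* Internal CAN clock: half the external oscillator frequency. */
uint32_t sja1000_can_clock_hz(uint32_t ext_clock_hz)
{
    if (ext_clock_hz == 0) /* assumed to mean "not specified" */
        ext_clock_hz = SJA1000_DEFAULT_EXT_CLOCK_HZ;
    return ext_clock_hz / 2;
}

/* Push-pull on TX0 is pull-down and pull-up combined. */
int tx0_is_push_pull(uint32_t cfg)
{
    return (cfg & TX0_PUSH_PULL) == TX0_PUSH_PULL;
}
```

With the 16 MHz oscillator from the example node this yields an internal clock of 8 MHz.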
|
||||
|
64
Documentation/powerpc/dts-bindings/ecm.txt
Normal file
@ -0,0 +1,64 @@
|
||||
=====================================================================
|
||||
E500 LAW & Coherency Module Device Tree Binding
|
||||
Copyright (C) 2009 Freescale Semiconductor Inc.
|
||||
=====================================================================
|
||||
|
||||
Local Access Window (LAW) Node
|
||||
|
||||
The LAW node represents the region of CCSR space where local access
|
||||
windows are configured. For ECM based devices this is the first 4k
|
||||
of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some
|
||||
number of local access windows as specified by fsl,num-laws.
|
||||
|
||||
PROPERTIES
|
||||
|
||||
- compatible
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: Must include "fsl,ecm-law"
|
||||
|
||||
- reg
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
Definition: A standard property. The value specifies the
|
||||
physical address offset and length of the CCSR space
|
||||
registers.
|
||||
|
||||
- fsl,num-laws
|
||||
Usage: required
|
||||
Value type: <u32>
|
||||
Definition: The value specifies the number of local access
|
||||
windows for this device.
|
||||
|
||||
=====================================================================
|
||||
|
||||
E500 Coherency Module Node
|
||||
|
||||
The E500 Coherency Module node represents the region of CCSR space where ECM config
|
||||
and error reporting registers exist, this is the second 4k (0x1000)
|
||||
of CCSR space.
|
||||
|
||||
PROPERTIES
|
||||
|
||||
- compatible
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: Must include "fsl,CHIP-ecm", "fsl,ecm" where
|
||||
CHIP is the processor (mpc8572, mpc8544, etc.)
|
||||
|
||||
- reg
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
Definition: A standard property. The value specifies the
|
||||
physical address offset and length of the CCSR space
|
||||
registers.
|
||||
|
||||
- interrupts
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
|
||||
- interrupt-parent
|
||||
Usage: required
|
||||
Value type: <phandle>
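
Example of both nodes described in this file. This is a sketch only: the unit addresses follow from the first-4k/second-4k layout stated above, while the fsl,num-laws value, the interrupt specifier, the chosen CHIP (mpc8572) and the &mpic label are assumptions for illustration and must be taken from the actual SoC and board.

```dts
/* Local access window registers: first 4k of CCSR space. */
ecm-law@0 {
    compatible = "fsl,ecm-law";
    reg = <0x0 0x1000>;
    fsl,num-laws = <12>;            /* assumed window count */
};

/* ECM config and error reporting registers: second 4k of CCSR space. */
ecm@1000 {
    compatible = "fsl,mpc8572-ecm", "fsl,ecm";
    reg = <0x1000 0x1000>;
    interrupts = <17 2>;            /* assumed interrupt specifier */
    interrupt-parent = <&mpic>;     /* assumed controller label */
};
```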
|
||||
|
||||
=====================================================================
|
@ -17,6 +17,9 @@ Required properties:
|
||||
- model : precise model of the QE, can be "QE", "CPM", or "CPM2"
|
||||
- reg : offset and length of the device registers.
|
||||
- bus-frequency : the clock frequency for QUICC Engine.
|
||||
- fsl,qe-num-riscs: define how many RISC engines the QE has.
|
||||
- fsl,qe-num-snums: define how many serial numbers (SNUMs) the QE can use for the
|
||||
threads.
|
||||
|
||||
Recommended properties
|
||||
- brg-frequency : the internal clock source frequency for baud-rate
|
||||
|
@ -5,17 +5,18 @@ for MMC, SD, and SDIO types of memory cards.
|
||||
|
||||
Required properties:
|
||||
- compatible : should be
|
||||
"fsl,<chip>-esdhc", "fsl,mpc8379-esdhc" for MPC83xx processors.
|
||||
"fsl,<chip>-esdhc", "fsl,mpc8536-esdhc" for MPC85xx processors.
|
||||
"fsl,<chip>-esdhc", "fsl,esdhc"
|
||||
- reg : should contain eSDHC registers location and length.
|
||||
- interrupts : should contain eSDHC interrupt.
|
||||
- interrupt-parent : interrupt source phandle.
|
||||
- clock-frequency : specifies eSDHC base clock frequency.
|
||||
- sdhci,1-bit-only : (optional) specifies that a controller can
|
||||
only handle 1-bit data transfers.
|
||||
|
||||
Example:
|
||||
|
||||
sdhci@2e000 {
|
||||
compatible = "fsl,mpc8378-esdhc", "fsl,mpc8379-esdhc";
|
||||
compatible = "fsl,mpc8378-esdhc", "fsl,esdhc";
|
||||
reg = <0x2e000 0x1000>;
|
||||
interrupts = <42 0x8>;
|
||||
interrupt-parent = <&ipic>;
|
||||
|
64
Documentation/powerpc/dts-bindings/fsl/mcm.txt
Normal file
@ -0,0 +1,64 @@
|
||||
=====================================================================
|
||||
MPX LAW & Coherency Module Device Tree Binding
|
||||
Copyright (C) 2009 Freescale Semiconductor Inc.
|
||||
=====================================================================
|
||||
|
||||
Local Access Window (LAW) Node
|
||||
|
||||
The LAW node represents the region of CCSR space where local access
|
||||
windows are configured. For MCM based devices this is the first 4k
|
||||
of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some
|
||||
number of local access windows as specified by fsl,num-laws.
|
||||
|
||||
PROPERTIES
|
||||
|
||||
- compatible
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: Must include "fsl,mcm-law"
|
||||
|
||||
- reg
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
Definition: A standard property. The value specifies the
|
||||
physical address offset and length of the CCSR space
|
||||
registers.
|
||||
|
||||
- fsl,num-laws
|
||||
Usage: required
|
||||
Value type: <u32>
|
||||
Definition: The value specifies the number of local access
|
||||
windows for this device.
|
||||
|
||||
=====================================================================
|
||||
|
||||
MPX Coherency Module Node
|
||||
|
||||
The MPX Coherency Module node represents the region of CCSR space where MCM config
|
||||
and error reporting registers exist; this is the second 4k (0x1000)
|
||||
of CCSR space.
|
||||
|
||||
PROPERTIES
|
||||
|
||||
- compatible
|
||||
Usage: required
|
||||
Value type: <string>
|
||||
Definition: Must include "fsl,CHIP-mcm", "fsl,mcm" where
|
||||
CHIP is the processor (mpc8641, mpc8610, etc.)
|
||||
|
||||
- reg
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
Definition: A standard property. The value specifies the
|
||||
physical address offset and length of the CCSR space
|
||||
registers.
|
||||
|
||||
- interrupts
|
||||
Usage: required
|
||||
Value type: <prop-encoded-array>
|
||||
|
||||
- interrupt-parent
|
||||
Usage: required
|
||||
Value type: <phandle>
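
Example of the MCM variants of both nodes. As a hedged sketch: the unit addresses follow from the layout described above, but the fsl,num-laws value, the interrupt specifier, the chosen CHIP (mpc8641) and the &mpic label are assumed here and are not defined by this binding.

```dts
/* Local access window registers: first 4k of CCSR space. */
mcm-law@0 {
    compatible = "fsl,mcm-law";
    reg = <0x0 0x1000>;
    fsl,num-laws = <10>;            /* assumed window count */
};

/* MCM config and error reporting registers: second 4k of CCSR space. */
mcm@1000 {
    compatible = "fsl,mpc8641-mcm", "fsl,mcm";
    reg = <0x1000 0x1000>;
    interrupts = <17 2>;            /* assumed interrupt specifier */
    interrupt-parent = <&mpic>;     /* assumed controller label */
};
```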
|
||||
|
||||
=====================================================================
|
50
Documentation/powerpc/dts-bindings/gpio/gpio.txt
Normal file
@ -0,0 +1,50 @@
|
||||
Specifying GPIO information for devices
|
||||
============================================
|
||||
|
||||
1) gpios property
|
||||
-----------------
|
||||
|
||||
Nodes that make use of GPIOs should define them using the `gpios' property,
|
||||
format of which is: <&gpio-controller1-phandle gpio1-specifier
|
||||
&gpio-controller2-phandle gpio2-specifier
|
||||
0 /* holes are permitted, means no GPIO 3 */
|
||||
&gpio-controller4-phandle gpio4-specifier
|
||||
...>;
|
||||
|
||||
Note that gpio-specifier length is controller dependent.
|
||||
|
||||
gpio-specifier may encode: bank, pin position inside the bank,
|
||||
whether pin is open-drain and whether pin is logically inverted.
|
||||
|
||||
Example of the node using GPIOs:
|
||||
|
||||
node {
|
||||
gpios = <&qe_pio_e 18 0>;
|
||||
};
|
||||
|
||||
In this example gpio-specifier is "18 0" and encodes GPIO pin number,
|
||||
and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller.
|
||||
|
||||
2) gpio-controller nodes
|
||||
------------------------
|
||||
|
||||
Every GPIO controller node must have the #gpio-cells property defined;
|
||||
this information will be used to translate gpio-specifiers.
|
||||
|
||||
Example of two SOC GPIO banks defined as gpio-controller nodes:
|
||||
|
||||
qe_pio_a: gpio-controller@1400 {
|
||||
#gpio-cells = <2>;
|
||||
compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank";
|
||||
reg = <0x1400 0x18>;
|
||||
gpio-controller;
|
||||
};
|
||||
|
||||
qe_pio_e: gpio-controller@1460 {
|
||||
#gpio-cells = <2>;
|
||||
compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank";
|
||||
reg = <0x1460 0x18>;
|
||||
gpio-controller;
|
||||
};
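
A consumer node can draw on both banks in a single `gpios' property. The sketch below is illustrative only: the pin numbers are hypothetical, and each specifier takes two cells because both controllers above declare #gpio-cells = <2>.

```dts
node {
    gpios = <&qe_pio_a 3 0      /* GPIO 1: bank A, pin 3, no flags */
             &qe_pio_e 18 0     /* GPIO 2: bank E, pin 18, no flags */
             0                  /* hole: no GPIO 3 */
             &qe_pio_a 7 0>;    /* GPIO 4: bank A, pin 7, no flags */
};
```

The lone 0 cell occupies the slot for GPIO 3, so later entries keep their positions in the list.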
|
||||
|
||||
|
@ -16,10 +16,17 @@ LED sub-node properties:
|
||||
string defining the trigger assigned to the LED. Current triggers are:
|
||||
"backlight" - LED will act as a back-light, controlled by the framebuffer
|
||||
system
|
||||
"default-on" - LED will turn on
|
||||
"default-on" - LED will turn on, but see "default-state" below
|
||||
"heartbeat" - LED "double" flashes at a load average based rate
|
||||
"ide-disk" - LED indicates disk activity
|
||||
"timer" - LED flashes at a fixed, configurable rate
|
||||
- default-state: (optional) The initial state of the LED. Valid
|
||||
values are "on", "off", and "keep". If the LED is already on or off
|
||||
and the default-state property is set to the same value, then no
|
||||
glitch should be produced where the LED momentarily turns off (or
|
||||
on). The "keep" setting will keep the LED at whatever its current
|
||||
state is, without producing a glitch. The default is off if this
|
||||
property is not present.
|
||||
|
||||
Examples:
|
||||
|
||||
@ -30,14 +37,22 @@ leds {
|
||||
gpios = <&mcu_pio 0 1>; /* Active low */
|
||||
linux,default-trigger = "ide-disk";
|
||||
};
|
||||
|
||||
fault {
|
||||
gpios = <&mcu_pio 1 0>;
|
||||
/* Keep LED on if BIOS detected hardware fault */
|
||||
default-state = "keep";
|
||||
};
|
||||
};
|
||||
|
||||
run-control {
|
||||
compatible = "gpio-leds";
|
||||
red {
|
||||
gpios = <&mpc8572 6 0>;
|
||||
default-state = "off";
|
||||
};
|
||||
green {
|
||||
gpios = <&mpc8572 7 0>;
|
||||
default-state = "on";
|
||||
};
|
||||
};
|
||||
|
19
Documentation/powerpc/dts-bindings/gpio/mdio.txt
Normal file
@ -0,0 +1,19 @@
|
||||
MDIO on GPIOs
|
||||
|
||||
Currently defined compatibles:
|
||||
- virtual,mdio-gpio
|
||||
|
||||
MDC and MDIO lines connected to GPIO controllers are listed in the
|
||||
gpios property as described in section VIII.1 in the following order:
|
||||
|
||||
MDC, MDIO.
|
||||
|
||||
Example:
|
||||
|
||||
mdio {
|
||||
compatible = "virtual,mdio-gpio";
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
gpios = <&qe_pio_a 11
|
||||
&qe_pio_c 6>;
|
||||
};
|
521
Documentation/powerpc/dts-bindings/marvell.txt
Normal file
@ -0,0 +1,521 @@
|
||||
Marvell Discovery mv64[345]6x System Controller chips
|
||||
===========================================================
|
||||
|
||||
The Marvell mv64[345]60 series of system controller chips contain
|
||||
many of the peripherals needed to implement a complete computer
|
||||
system. In this section, we define device tree nodes to describe
|
||||
the system controller chip itself and each of the peripherals
|
||||
which it contains. Compatible string values for each node are
|
||||
prefixed with the string "marvell,", for Marvell Technology Group Ltd.
|
||||
|
||||
1) The /system-controller node
|
||||
|
||||
This node is used to represent the system-controller and must be
|
||||
present when the system uses a system controller chip. The top-level
|
||||
system-controller node contains information that is global to all
|
||||
devices within the system controller chip. The node name begins
|
||||
with "system-controller" followed by the unit address, which is
|
||||
the base address of the memory-mapped register set for the system
|
||||
controller chip.
|
||||
|
||||
Required properties:
|
||||
|
||||
- ranges : Describes the translation of system controller addresses
|
||||
for memory mapped registers.
|
||||
- clock-frequency: Contains the main clock frequency for the system
|
||||
controller chip.
|
||||
- reg : This property defines the address and size of the
|
||||
memory-mapped registers contained within the system controller
|
||||
chip. The address specified in the "reg" property should match
|
||||
the unit address of the system-controller node.
|
||||
- #address-cells : Address representation for system controller
|
||||
devices. This field represents the number of cells needed to
|
||||
represent the address of the memory-mapped registers of devices
|
||||
within the system controller chip.
|
||||
- #size-cells : Size representation for the memory-mapped
|
||||
registers within the system controller chip.
|
||||
- #interrupt-cells : Defines the width of cells used to represent
|
||||
interrupts.
|
||||
|
||||
Optional properties:
|
||||
|
||||
- model : The specific model of the system controller chip, such
as "mv64360", "mv64460", or "mv64560".
|
||||
- compatible : A string identifying the compatibility identifiers
|
||||
of the system controller chip.
|
||||
|
||||
The system-controller node contains child nodes for each system
|
||||
controller device that the platform uses. Nodes should not be created
|
||||
for devices which exist on the system controller chip but are not used.
|
||||
|
||||
Example Marvell Discovery mv64360 system-controller node:
|
||||
|
||||
system-controller@f1000000 { /* Marvell Discovery mv64360 */
|
||||
#address-cells = <1>;
|
||||
#size-cells = <1>;
|
||||
model = "mv64360"; /* Default */
|
||||
compatible = "marvell,mv64360";
|
||||
clock-frequency = <133333333>;
|
||||
reg = <0xf1000000 0x10000>;
|
||||
virtual-reg = <0xf1000000>;
|
||||
ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */
|
||||
0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */
|
||||
0xa0000000 0xa0000000 0x4000000 /* User FLASH */
|
||||
0x00000000 0xf1000000 0x0010000 /* Bridge's regs */
|
||||
0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */
|
||||
|
||||
[ child node definitions... ]
|
||||
};
|
||||
|
||||
2) Child nodes of /system-controller
|
||||
|
||||
a) Marvell Discovery MDIO bus
|
||||
|
||||
The MDIO is a bus to which the PHY devices are connected. For each
|
||||
device that exists on this bus, a child node should be created. See
|
||||
the definition of the PHY node below for an example of how to define
|
||||
a PHY.
|
||||
|
||||
Required properties:
|
||||
- #address-cells : Should be <1>
|
||||
- #size-cells : Should be <0>
|
||||
- device_type : Should be "mdio"
|
||||
- compatible : Should be "marvell,mv64360-mdio"
|
||||
|
||||
Example:
|
||||
|
||||
mdio {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
device_type = "mdio";
|
||||
compatible = "marvell,mv64360-mdio";
|
||||
|
||||
ethernet-phy@0 {
|
||||
......
|
||||
};
|
||||
};
|
||||
|
||||
|
||||
b) Marvell Discovery ethernet controller
|
||||
|
||||
The Discovery ethernet controller is described with two levels
|
||||
of nodes. The first level describes an ethernet silicon block
|
||||
and the second level describes up to 3 ethernet nodes within
|
||||
that block. The reason for the multiple levels is that the
|
||||
registers for the node are interleaved within a single set
|
||||
of registers. The "ethernet-block" level describes the
|
||||
shared register set, and the "ethernet" nodes describe ethernet
|
||||
port-specific properties.
|
||||
|
||||
Ethernet block node
|
||||
|
||||
Required properties:
|
||||
- #address-cells : <1>
|
||||
- #size-cells : <0>
|
||||
- compatible : "marvell,mv64360-eth-block"
|
||||
- reg : Offset and length of the register set for this block
|
||||
|
||||
Example Discovery Ethernet block node:
|
||||
ethernet-block@2000 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
compatible = "marvell,mv64360-eth-block";
|
||||
reg = <0x2000 0x2000>;
|
||||
ethernet@0 {
|
||||
.......
|
||||
};
|
||||
};
|
||||
|
||||
Ethernet port node
|
||||
|
||||
Required properties:
|
||||
- device_type : Should be "network".
|
||||
- compatible : Should be "marvell,mv64360-eth".
|
||||
- reg : Should be <0>, <1>, or <2>, according to which registers
|
||||
within the silicon block the device uses.
|
||||
- interrupts : <a> where a is the interrupt number for the port.
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
- phy : the phandle for the PHY connected to this ethernet
|
||||
controller.
|
||||
- local-mac-address : 6 bytes, MAC address
|
||||
|
||||
Example Discovery Ethernet port node:
|
||||
ethernet@0 {
|
||||
device_type = "network";
|
||||
compatible = "marvell,mv64360-eth";
|
||||
reg = <0>;
|
||||
interrupts = <32>;
|
||||
interrupt-parent = <&PIC>;
|
||||
phy = <&PHY0>;
|
||||
local-mac-address = [ 00 00 00 00 00 00 ];
|
||||
};
|
||||
|
||||
|
||||
|
||||
c) Marvell Discovery PHY nodes
|
||||
|
||||
Required properties:
|
||||
- device_type : Should be "ethernet-phy"
|
||||
- interrupts : <a> where a is the interrupt number for this phy.
|
||||
- interrupt-parent : the phandle for the interrupt controller that
|
||||
services interrupts for this device.
|
||||
- reg : The ID number for the phy, usually a small integer
|
||||
|
||||
Example Discovery PHY node:
|
||||
ethernet-phy@1 {
|
||||
device_type = "ethernet-phy";
|
||||
compatible = "broadcom,bcm5421";
|
||||
interrupts = <76>; /* GPP 12 */
|
||||
interrupt-parent = <&PIC>;
|
||||
reg = <1>;
|
||||
};
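
Putting sections a) to c) together, the nesting and the phandle reference might look like the following sketch. The PHY0 and PIC labels match the references used in the examples above; the assumption made here is that PHY0 is attached to the PHY node and PIC to the interrupt controller node. All numeric values are copied from those examples.

```dts
mdio {
    #address-cells = <1>;
    #size-cells = <0>;
    device_type = "mdio";
    compatible = "marvell,mv64360-mdio";

    /* The label lets the ethernet port refer to this PHY. */
    PHY0: ethernet-phy@1 {
        device_type = "ethernet-phy";
        compatible = "broadcom,bcm5421";
        interrupts = <76>;          /* GPP 12 */
        interrupt-parent = <&PIC>;
        reg = <1>;
    };
};

ethernet-block@2000 {
    #address-cells = <1>;
    #size-cells = <0>;
    compatible = "marvell,mv64360-eth-block";
    reg = <0x2000 0x2000>;          /* shared register set */

    ethernet@0 {
        device_type = "network";
        compatible = "marvell,mv64360-eth";
        reg = <0>;                  /* port index within the block */
        interrupts = <32>;
        interrupt-parent = <&PIC>;
        phy = <&PHY0>;
        local-mac-address = [ 00 00 00 00 00 00 ];
    };
};
```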
|
||||
|
||||
|
||||
d) Marvell Discovery SDMA nodes
|
||||
|
||||
Represent DMA hardware associated with the MPSC (multiprotocol
|
||||
serial controllers).
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-sdma"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : <a> where a is the interrupt number for the DMA
|
||||
device.
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery SDMA node:
|
||||
sdma@4000 {
|
||||
compatible = "marvell,mv64360-sdma";
|
||||
reg = <0x4000 0xc18>;
|
||||
virtual-reg = <0xf1004000>;
|
||||
interrupts = <36>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
||||
e) Marvell Discovery BRG nodes
|
||||
|
||||
Represent baud rate generator hardware associated with the MPSC
|
||||
(multiprotocol serial controllers).
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-brg"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- clock-src : A value from 0 to 15 which selects the clock
|
||||
source for the baud rate generator. This value corresponds
|
||||
to the CLKS value in the BRGx configuration register. See
|
||||
the mv64x60 User's Manual.
|
||||
- clock-frequency : The frequency (in Hz) of the baud rate
|
||||
generator's input clock.
|
||||
- current-speed : The current speed setting (presumably by
|
||||
firmware) of the baud rate generator.
|
||||
|
||||
Example Discovery BRG node:
|
||||
brg@b200 {
|
||||
compatible = "marvell,mv64360-brg";
|
||||
reg = <0xb200 0x8>;
|
||||
clock-src = <8>;
|
||||
clock-frequency = <133333333>;
|
||||
current-speed = <9600>;
|
||||
};
|
||||
|
||||
|
||||
f) Marvell Discovery CUNIT nodes
|
||||
|
||||
Represent the Serial Communications Unit device hardware.
|
||||
|
||||
Required properties:
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery CUNIT node:
|
||||
cunit@f200 {
|
||||
reg = <0xf200 0x200>;
|
||||
};
|
||||
|
||||
|
||||
g) Marvell Discovery MPSCROUTING nodes
|
||||
|
||||
Represent the Discovery's MPSC routing hardware
|
||||
|
||||
Required properties:
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery MPSCROUTING node:
|
||||
mpscrouting@b400 {
|
||||
reg = <0xb400 0xc>;
|
||||
};
|
||||
|
||||
|
||||
h) Marvell Discovery MPSCINTR nodes
|
||||
|
||||
Represent the Discovery's MPSC DMA interrupt hardware registers
|
||||
(SDMA cause and mask registers).
|
||||
|
||||
Required properties:
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery MPSCINTR node:
|
||||
mpscintr@b800 {
|
||||
reg = <0xb800 0x100>;
|
||||
};
|
||||
|
||||
|
||||
i) Marvell Discovery MPSC nodes
|
||||
|
||||
Represent the Discovery's MPSC (Multiprotocol Serial Controller)
|
||||
serial port.
|
||||
|
||||
Required properties:
|
||||
- device_type : "serial"
|
||||
- compatible : "marvell,mv64360-mpsc"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- sdma : the phandle for the SDMA node used by this port
|
||||
- brg : the phandle for the BRG node used by this port
|
||||
- cunit : the phandle for the CUNIT node used by this port
|
||||
- mpscrouting : the phandle for the MPSCROUTING node used by this port
|
||||
- mpscintr : the phandle for the MPSCINTR node used by this port
|
||||
- cell-index : the hardware index of this cell in the MPSC core
|
||||
- max_idle : value needed for MPSC CHR3 (Maximum Frame Length)
|
||||
register
|
||||
- interrupts : <a> where a is the interrupt number for the MPSC.
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery MPSC node:
|
||||
mpsc@8000 {
|
||||
device_type = "serial";
|
||||
compatible = "marvell,mv64360-mpsc";
|
||||
reg = <0x8000 0x38>;
|
||||
virtual-reg = <0xf1008000>;
|
||||
sdma = <&SDMA0>;
|
||||
brg = <&BRG0>;
|
||||
cunit = <&CUNIT>;
|
||||
mpscrouting = <&MPSCROUTING>;
|
||||
mpscintr = <&MPSCINTR>;
|
||||
cell-index = <0>;
|
||||
max_idle = <40>;
|
||||
interrupts = <40>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
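
The phandle properties of the MPSC node resolve through labels on the helper nodes. A sketch, assuming the labels SDMA0, BRG0, CUNIT, MPSCROUTING and MPSCINTR are placed on the nodes from sections d) to h); the register values are copied from those examples, with unit addresses chosen to match the "reg" offsets.

```dts
SDMA0: sdma@4000 {
    compatible = "marvell,mv64360-sdma";
    reg = <0x4000 0xc18>;
    virtual-reg = <0xf1004000>;
    interrupts = <36>;
    interrupt-parent = <&PIC>;
};

BRG0: brg@b200 {
    compatible = "marvell,mv64360-brg";
    reg = <0xb200 0x8>;
    clock-src = <8>;
    clock-frequency = <133333333>;
    current-speed = <9600>;
};

CUNIT: cunit@f200 {
    reg = <0xf200 0x200>;
};

MPSCROUTING: mpscrouting@b400 {
    reg = <0xb400 0xc>;
};

MPSCINTR: mpscintr@b800 {
    reg = <0xb800 0x100>;
};
```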
|
||||
|
||||
|
||||
j) Marvell Discovery Watch Dog Timer nodes
|
||||
|
||||
Represent the Discovery's watchdog timer hardware
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-wdt"
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery Watch Dog Timer node:
|
||||
wdt@b410 {
|
||||
compatible = "marvell,mv64360-wdt";
|
||||
reg = <0xb410 0x8>;
|
||||
};
|
||||
|
||||
|
||||
k) Marvell Discovery I2C nodes
|
||||
|
||||
Represent the Discovery's I2C hardware
|
||||
|
||||
Required properties:
|
||||
- device_type : "i2c"
|
||||
- compatible : "marvell,mv64360-i2c"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : <a> where a is the interrupt number for the I2C.
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery I2C node:
|
||||
i2c@c000 {
device_type = "i2c";
compatible = "marvell,mv64360-i2c";
|
||||
reg = <0xc000 0x20>;
|
||||
virtual-reg = <0xf100c000>;
|
||||
interrupts = <37>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
||||
l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes
|
||||
|
||||
Represent the Discovery's PIC hardware
|
||||
|
||||
Required properties:
|
||||
- #interrupt-cells : <1>
|
||||
- #address-cells : <0>
|
||||
- compatible : "marvell,mv64360-pic"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupt-controller
|
||||
|
||||
Example Discovery PIC node:
|
||||
pic {
|
||||
#interrupt-cells = <1>;
|
||||
#address-cells = <0>;
|
||||
compatible = "marvell,mv64360-pic";
|
||||
reg = <0x0 0x88>;
|
||||
interrupt-controller;
|
||||
};
|
||||
|
||||
|
||||
m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes
|
||||
|
||||
Represent the Discovery's MPP hardware
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-mpp"
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery MPP node:
|
||||
mpp@f000 {
|
||||
compatible = "marvell,mv64360-mpp";
|
||||
reg = <0xf000 0x10>;
|
||||
};
|
||||
|
||||
|
||||
n) Marvell Discovery GPP (General Purpose Pins) nodes
|
||||
|
||||
Represent the Discovery's GPP hardware
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-gpp"
|
||||
- reg : Offset and length of the register set for this device
|
||||
|
||||
Example Discovery GPP node:
|
||||
gpp@f100 {
|
||||
compatible = "marvell,mv64360-gpp";
|
||||
reg = <0xf100 0x20>;
|
||||
};
|
||||
|
||||
|
||||
o) Marvell Discovery PCI host bridge node
|
||||
|
||||
Represents the Discovery's PCI host bridge device. The properties
|
||||
for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE
|
||||
1275-1994. A typical value for the compatible property is
|
||||
"marvell,mv64360-pci".
|
||||
|
||||
Example Discovery PCI host bridge node:
|
||||
pci@80000000 {
|
||||
#address-cells = <3>;
|
||||
#size-cells = <2>;
|
||||
#interrupt-cells = <1>;
|
||||
device_type = "pci";
|
||||
compatible = "marvell,mv64360-pci";
|
||||
reg = <0xcf8 0x8>;
|
||||
ranges = <0x01000000 0x0 0x0
|
||||
0x88000000 0x0 0x01000000
|
||||
0x02000000 0x0 0x80000000
|
||||
0x80000000 0x0 0x08000000>;
|
||||
bus-range = <0 255>;
|
||||
clock-frequency = <66000000>;
|
||||
interrupt-parent = <&PIC>;
|
||||
interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
|
||||
interrupt-map = <
|
||||
/* IDSEL 0x0a */
|
||||
0x5000 0 0 1 &PIC 80
|
||||
0x5000 0 0 2 &PIC 81
|
||||
0x5000 0 0 3 &PIC 91
|
||||
0x5000 0 0 4 &PIC 93
|
||||
|
||||
/* IDSEL 0x0b */
|
||||
0x5800 0 0 1 &PIC 91
|
||||
0x5800 0 0 2 &PIC 93
|
||||
0x5800 0 0 3 &PIC 80
|
||||
0x5800 0 0 4 &PIC 81
|
||||
|
||||
/* IDSEL 0x0c */
|
||||
0x6000 0 0 1 &PIC 91
|
||||
0x6000 0 0 2 &PIC 93
|
||||
0x6000 0 0 3 &PIC 80
|
||||
0x6000 0 0 4 &PIC 81
|
||||
|
||||
/* IDSEL 0x0d */
|
||||
0x6800 0 0 1 &PIC 93
|
||||
0x6800 0 0 2 &PIC 80
|
||||
0x6800 0 0 3 &PIC 81
|
||||
0x6800 0 0 4 &PIC 91
|
||||
>;
|
||||
};
|
||||
|
||||
|
||||
p) Marvell Discovery CPU Error nodes
|
||||
|
||||
Represent the Discovery's CPU error handler device.
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-cpu-error"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : the interrupt number for this device
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery CPU Error node:
|
||||
cpu-error@0070 {
|
||||
compatible = "marvell,mv64360-cpu-error";
|
||||
reg = <0x70 0x10 0x128 0x28>;
|
||||
interrupts = <3>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
||||
q) Marvell Discovery SRAM Controller nodes
|
||||
|
||||
Represent the Discovery's SRAM controller device.
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-sram-ctrl"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : the interrupt number for this device
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery SRAM Controller node:
|
||||
sram-ctrl@0380 {
|
||||
compatible = "marvell,mv64360-sram-ctrl";
|
||||
reg = <0x380 0x80>;
|
||||
interrupts = <13>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
||||
r) Marvell Discovery PCI Error Handler nodes
|
||||
|
||||
Represent the Discovery's PCI error handler device.
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-pci-error"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : the interrupt number for this device
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery PCI Error Handler node:
|
||||
pci-error@1d40 {
|
||||
compatible = "marvell,mv64360-pci-error";
|
||||
reg = <0x1d40 0x40 0xc28 0x4>;
|
||||
interrupts = <12>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
||||
s) Marvell Discovery Memory Controller nodes
|
||||
|
||||
Represent the Discovery's memory controller device.
|
||||
|
||||
Required properties:
|
||||
- compatible : "marvell,mv64360-mem-ctrl"
|
||||
- reg : Offset and length of the register set for this device
|
||||
- interrupts : the interrupt number for this device
|
||||
- interrupt-parent : the phandle for the interrupt controller
|
||||
that services interrupts for this device.
|
||||
|
||||
Example Discovery Memory Controller node:
|
||||
mem-ctrl@1400 {
|
||||
compatible = "marvell,mv64360-mem-ctrl";
|
||||
reg = <0x1400 0x60>;
|
||||
interrupts = <17>;
|
||||
interrupt-parent = <&PIC>;
|
||||
};
|
||||
|
||||
|
25
Documentation/powerpc/dts-bindings/phy.txt
Normal file
@ -0,0 +1,25 @@
|
||||
PHY nodes
|
||||
|
||||
Required properties:
|
||||
|
||||
- device_type : Should be "ethernet-phy"
|
||||
- interrupts : <a b> where a is the interrupt number and b is a
|
||||
field that represents an encoding of the sense and level
|
||||
information for the interrupt. This should be encoded based on
|
||||
the information in section 2) depending on the type of interrupt
|
||||
controller you have.
|
||||
- interrupt-parent : the phandle for the interrupt controller that
|
||||
services interrupts for this device.
|
||||
- reg : The ID number for the phy, usually a small integer
|
||||
- linux,phandle : phandle for this node; likely referenced by an
|
||||
ethernet controller node.
|
||||
|
||||
Example:
|
||||
|
||||
ethernet-phy@0 {
|
||||
linux,phandle = <2452000>;
|
||||
interrupt-parent = <40000>;
|
||||
interrupts = <35 1>;
|
||||
reg = <0>;
|
||||
device_type = "ethernet-phy";
|
||||
};
|
57
Documentation/powerpc/dts-bindings/spi-bus.txt
Normal file
@ -0,0 +1,57 @@
|
||||
SPI (Serial Peripheral Interface) busses
|
||||
|
||||
SPI busses can be described with a node for the SPI master device
|
||||
and a set of child nodes for each SPI slave on the bus. For this
|
||||
discussion, it is assumed that the system's SPI controller is in
|
||||
SPI master mode. This binding does not describe SPI controllers
|
||||
in slave mode.
|
||||
|
||||
The SPI master node requires the following properties:
|
||||
- #address-cells - number of cells required to define a chip select
|
||||
address on the SPI bus.
|
||||
- #size-cells - should be zero.
|
||||
- compatible - name of SPI bus controller following generic names
|
||||
recommended practice.
|
||||
No other properties are required in the SPI bus node. It is assumed
|
||||
that a driver for an SPI bus device will understand that it is an SPI bus.
|
||||
However, the binding does not attempt to define the specific method for
|
||||
assigning chip select numbers. Since SPI chip select configuration is
|
||||
flexible and non-standardized, it is left out of this binding with the
|
||||
assumption that board specific platform code will be used to manage
|
||||
chip selects. Individual drivers can define additional properties to
|
||||
support describing the chip select layout.
|
||||
|
||||
SPI slave nodes must be children of the SPI master node and can
|
||||
contain the following properties.
|
||||
- reg - (required) chip select address of device.
|
||||
- compatible - (required) name of SPI device following generic names
|
||||
recommended practice
|
||||
- spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz
|
||||
- spi-cpol - (optional) Empty property indicating device requires
|
||||
inverse clock polarity (CPOL) mode
|
||||
- spi-cpha - (optional) Empty property indicating device requires
|
||||
shifted clock phase (CPHA) mode
|
||||
- spi-cs-high - (optional) Empty property indicating device requires
|
||||
chip select active high
|
||||
|
||||
SPI example for an MPC5200 SPI bus:
|
||||
spi@f00 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <0>;
|
||||
compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi";
|
||||
reg = <0xf00 0x20>;
|
||||
interrupts = <2 13 0 2 14 0>;
|
||||
interrupt-parent = <&mpc5200_pic>;
|
||||
|
||||
ethernet-switch@0 {
|
||||
compatible = "micrel,ks8995m";
|
||||
spi-max-frequency = <1000000>;
|
||||
reg = <0>;
|
||||
};
|
||||
|
||||
codec@1 {
|
||||
compatible = "ti,tlv320aic26";
|
||||
spi-max-frequency = <100000>;
|
||||
reg = <1>;
|
||||
};
|
||||
};
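
The optional mode properties are empty (boolean) properties on the slave node. A minimal sketch, assuming a hypothetical device "vendor,example-adc" on chip select 2 that needs inverse clock polarity, shifted clock phase, and an active-high chip select; the device name, chip select number, and frequency are illustrative and do not describe real hardware.

```dts
spi@f00 {
    #address-cells = <1>;
    #size-cells = <0>;
    compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi";
    reg = <0xf00 0x20>;
    interrupts = <2 13 0 2 14 0>;
    interrupt-parent = <&mpc5200_pic>;

    adc@2 {
        compatible = "vendor,example-adc";  /* hypothetical device */
        spi-max-frequency = <500000>;       /* assumed limit, in Hz */
        reg = <2>;                          /* chip select address */
        spi-cpol;       /* inverse clock polarity (CPOL) */
        spi-cpha;       /* shifted clock phase (CPHA) */
        spi-cs-high;    /* chip select is active high */
    };
};
```

Leaving a property out selects the default behaviour for that setting, so a device that needs none of them simply omits all three.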
|
25
Documentation/powerpc/dts-bindings/usb-ehci.txt
Normal file
@ -0,0 +1,25 @@
|
||||
USB EHCI controllers
|
||||
|
||||
Required properties:
|
||||
- compatible : should be "usb-ehci".
|
||||
- reg : should contain at least address and length of the standard EHCI
|
||||
register set for the device. Optional platform-dependent registers
|
||||
(debug-port or other) can be also specified here, but only after
|
||||
definition of standard EHCI registers.
|
||||
- interrupts : one EHCI interrupt should be described here.
|
||||
If device registers are implemented in big endian mode, the device
|
||||
node should have "big-endian-regs" property.
|
||||
If controller implementation operates with big endian descriptors,
|
||||
"big-endian-desc" property should be specified.
|
||||
If both big endian registers and descriptors are used by the controller
|
||||
implementation, "big-endian" property can be specified instead of having
|
||||
both "big-endian-regs" and "big-endian-desc".
|
||||
|
||||
Example (Sequoia 440EPx):
|
||||
ehci@e0000300 {
|
||||
compatible = "ibm,usb-ehci-440epx", "usb-ehci";
|
||||
interrupt-parent = <&UIC0>;
|
||||
interrupts = <0x1a 4>;
reg = <0 0xe0000300 0x90 0 0xe0000390 0x70>;
|
||||
big-endian;
|
||||
};
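
For a controller with big endian registers but little endian descriptors, only "big-endian-regs" would be given. The sketch below uses a hypothetical compatible string, unit address, and interrupt specifier, and assumes a parent bus with one address cell and one size cell; none of these values describe a real board.

```dts
ehci@d0000300 {
    compatible = "vendor,example-ehci", "usb-ehci"; /* hypothetical */
    interrupt-parent = <&UIC0>;
    interrupts = <0x1a 4>;          /* assumed interrupt specifier */
    reg = <0xd0000300 0x90>;        /* standard EHCI register set */
    big-endian-regs;                /* registers only, not descriptors */
};
```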
|
295
Documentation/powerpc/dts-bindings/xilinx.txt
Normal file
@ -0,0 +1,295 @@
|
||||
d) Xilinx IP cores
|
||||
|
||||
The Xilinx EDK toolchain ships with a set of IP cores (devices) for use
|
||||
in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range
|
||||
of standard device types (network, serial, etc.) and miscellaneous
|
||||
devices (gpio, LCD, spi, etc). Also, since these devices are
|
||||
implemented within the fpga fabric every instance of the device can be
|
||||
synthesised with different options that change the behaviour.
|
||||
|
||||
Each IP-core has a set of parameters which the FPGA designer can use to
|
||||
control how the core is synthesized. Historically, the EDK tool would
|
||||
extract the device parameters relevant to device drivers and copy them
|
||||
into an 'xparameters.h' in the form of #define symbols. This tells the
|
||||
device drivers how the IP cores are configured, but it requires the kernel
|
||||
to be recompiled every time the FPGA bitstream is resynthesized.
|
||||
|
||||
The new approach is to export the parameters into the device tree and
|
||||
generate a new device tree each time the FPGA bitstream changes. The
|
||||
parameters which used to be exported as #defines will now become
|
||||
properties of the device node. In general, device nodes for IP-cores
|
||||
will take the following form:
|
||||
|
||||
(name): (generic-name)@(base-address) {
|
||||
compatible = "xlnx,(ip-core-name)-(HW_VER)"
|
||||
[, (list of compatible devices), ...];
|
||||
reg = <(baseaddr) (size)>;
|
||||
interrupt-parent = <&interrupt-controller-phandle>;
|
||||
interrupts = < ... >;
|
||||
xlnx,(parameter1) = "(string-value)";
|
||||
xlnx,(parameter2) = <(int-value)>;
|
||||
};
|
||||
|
||||
(generic-name): an open firmware-style name that describes the
|
||||
generic class of device. Preferably, this is one word, such
|
||||
as 'serial' or 'ethernet'.
|
||||
(ip-core-name): the name of the ip block (given after the BEGIN
|
||||
directive in system.mhs). Should be in lowercase
|
||||
and all underscores '_' converted to dashes '-'.
|
||||
(name): is derived from the "PARAMETER INSTANCE" value.
|
||||
(parameter#): C_* parameters from system.mhs. The C_ prefix is
|
||||
dropped from the parameter name, the name is converted
|
||||
to lowercase and all underscore '_' characters are
|
||||
converted to dashes '-'.
|
||||
(baseaddr): the baseaddr parameter value (often named C_BASEADDR).
|
||||
(HW_VER): from the HW_VER parameter.
|
||||
(size): the address range size (often C_HIGHADDR - C_BASEADDR + 1).
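The naming rules above are mechanical, so they can be sketched as small
helper functions. The following Python is a minimal illustration only; the
function names are hypothetical and are not part of any Xilinx or kernel
tool.

```python
def ip_core_compatible(ip_core_name, hw_ver):
    """Build the "xlnx,(ip-core-name)-(HW_VER)" compatible string.

    The core name is lowercased and underscores become dashes.
    """
    name = ip_core_name.lower().replace("_", "-")
    return "xlnx,%s-%s" % (name, hw_ver)


def parameter_property(param):
    """Map a C_* parameter name from system.mhs to its property name.

    The C_ prefix is dropped, the name is lowercased and underscores
    become dashes; the result carries the "xlnx," prefix.
    """
    if param.startswith("C_"):
        param = param[2:]
    return "xlnx," + param.lower().replace("_", "-")


def reg_size(baseaddr, highaddr):
    """Size of the address range: C_HIGHADDR - C_BASEADDR + 1."""
    return highaddr - baseaddr + 1
```

With the values of an opb_uartlite core at 0xEC100000..0xEC10FFFF, these
helpers yield "xlnx,opb-uartlite-1.00.b", "xlnx,data-bits" for C_DATA_BITS,
and a size of 0x10000.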
|
||||
|
||||
Typically, the compatible list will include the exact IP core version
|
||||
followed by an older IP core version which implements the same
|
||||
interface or any other device with the same interface.
|
||||
|
||||
'reg', 'interrupt-parent' and 'interrupts' are all optional properties.
|
||||
|
||||
For example, the following block from system.mhs:
|
||||
|
||||
BEGIN opb_uartlite
|
||||
PARAMETER INSTANCE = opb_uartlite_0
|
||||
PARAMETER HW_VER = 1.00.b
|
||||
PARAMETER C_BAUDRATE = 115200
|
||||
PARAMETER C_DATA_BITS = 8
|
||||
PARAMETER C_ODD_PARITY = 0
|
||||
PARAMETER C_USE_PARITY = 0
|
||||
PARAMETER C_CLK_FREQ = 50000000
|
||||
PARAMETER C_BASEADDR = 0xEC100000
|
||||
PARAMETER C_HIGHADDR = 0xEC10FFFF
|
||||
BUS_INTERFACE SOPB = opb_7
|
||||
PORT OPB_Clk = CLK_50MHz
|
||||
PORT Interrupt = opb_uartlite_0_Interrupt
|
||||
PORT RX = opb_uartlite_0_RX
|
||||
PORT TX = opb_uartlite_0_TX
|
||||
PORT OPB_Rst = sys_bus_reset_0
|
||||
END
|
||||
|
||||
becomes the following device tree node:
|
||||
|
||||
opb_uartlite_0: serial@ec100000 {
|
||||
device_type = "serial";
|
||||
compatible = "xlnx,opb-uartlite-1.00.b";
|
||||
reg = <ec100000 10000>;
|
||||
interrupt-parent = <&opb_intc_0>;
|
||||
interrupts = <1 0>; // got this from the opb_intc parameters
|
||||
current-speed = <d#115200>; // standard serial device prop
|
||||
clock-frequency = <d#50000000>; // standard serial device prop
|
||||
xlnx,data-bits = <8>;
|
||||
xlnx,odd-parity = <0>;
|
||||
xlnx,use-parity = <0>;
|
||||
};
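As a rough sketch of how a tool might read such a block before emitting the
node, the Python below collects the PARAMETER lines of one BEGIN ... END
block and derives the two cells of the reg property. It assumes the simple
"PARAMETER NAME = VALUE" layout shown above; parse_mhs_block() and
reg_cells() are hypothetical names, and a real system.mhs file may contain
constructs this sketch does not handle.

```python
EXAMPLE = """
BEGIN opb_uartlite
 PARAMETER INSTANCE = opb_uartlite_0
 PARAMETER HW_VER = 1.00.b
 PARAMETER C_BASEADDR = 0xEC100000
 PARAMETER C_HIGHADDR = 0xEC10FFFF
 PORT RX = opb_uartlite_0_RX
END
"""


def parse_mhs_block(text):
    """Parse one BEGIN ... END block into (core_name, parameters)."""
    core = None
    params = {}
    for line in text.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0] == "BEGIN" and len(words) > 1:
            core = words[1]
        elif words[0] == "PARAMETER" and "=" in line:
            key, value = line.split("=", 1)
            parts = key.split()  # ["PARAMETER", NAME]
            if len(parts) == 2:
                params[parts[1]] = value.strip()
        elif words[0] == "END":
            break
    return core, params


def reg_cells(params):
    """Return (baseaddr, size) computed from C_BASEADDR and C_HIGHADDR."""
    base = int(params["C_BASEADDR"], 16)
    high = int(params["C_HIGHADDR"], 16)
    return base, high - base + 1
```

PORT and BUS_INTERFACE lines are ignored here on purpose: interrupt and bus
information has to come from the interrupt controller and bus descriptions,
not from the block itself.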
|
||||
|
||||
Some IP cores actually implement 2 or more logical devices. In
|
||||
this case, the device should still describe the whole IP core with
|
||||
a single node and add a child node for each logical device. The
|
||||
ranges property can be used to translate from parent IP-core to the
|
||||
registers of each device. In addition, the parent node should be
|
||||
compatible with the bus type 'xlnx,compound', and should contain
|
||||
#address-cells and #size-cells, as with any other bus. (Note: this
|
||||
makes the assumption that both logical devices have the same bus
|
||||
binding. If this is not true, then separate nodes should be used
|
||||
for each logical device). The 'cell-index' property can be used to
|
||||
enumerate logical devices within an IP core. For example, the
|
||||
following is the system.mhs entry for the dual ps2 controller found
|
||||
on the ml403 reference design.
|
||||
|
||||
BEGIN opb_ps2_dual_ref
|
||||
PARAMETER INSTANCE = opb_ps2_dual_ref_0
|
||||
PARAMETER HW_VER = 1.00.a
|
||||
PARAMETER C_BASEADDR = 0xA9000000
|
||||
PARAMETER C_HIGHADDR = 0xA9001FFF
|
||||
BUS_INTERFACE SOPB = opb_v20_0
|
||||
PORT Sys_Intr1 = ps2_1_intr
|
||||
PORT Sys_Intr2 = ps2_2_intr
|
||||
PORT Clkin1 = ps2_clk_rx_1
|
||||
PORT Clkin2 = ps2_clk_rx_2
|
||||
PORT Clkpd1 = ps2_clk_tx_1
|
||||
PORT Clkpd2 = ps2_clk_tx_2
|
||||
PORT Rx1 = ps2_d_rx_1
|
||||
PORT Rx2 = ps2_d_rx_2
|
||||
PORT Txpd1 = ps2_d_tx_1
|
||||
PORT Txpd2 = ps2_d_tx_2
|
||||
END
|
||||
|
||||
It would result in the following device tree nodes:
|
||||
|
||||
opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <1>;
|
||||
compatible = "xlnx,compound";
|
||||
ranges = <0 a9000000 2000>;
|
||||
// If this device had extra parameters, then they would
|
||||
// go here.
|
||||
ps2@0 {
|
||||
compatible = "xlnx,opb-ps2-dual-ref-1.00.a";
|
||||
reg = <0 40>;
|
||||
interrupt-parent = <&opb_intc_0>;
|
||||
interrupts = <3 0>;
|
||||
cell-index = <0>;
|
||||
};
|
||||
ps2@1000 {
|
||||
compatible = "xlnx,opb-ps2-dual-ref-1.00.a";
|
||||
reg = <1000 40>;
|
||||
interrupt-parent = <&opb_intc_0>;
|
||||
interrupts = <3 0>;
|
||||
cell-index = <1>;
|
||||
};
|
||||
};
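The effect of the ranges property in the compound node can be illustrated
with a short Python sketch. It assumes #address-cells = <1> and
#size-cells = <1>, so every ranges entry is a (child base, parent base,
size) triple; translate() is a hypothetical helper name.

```python
def translate(child_addr, ranges):
    """Translate a child bus address through a 'ranges' property.

    ranges is a list of (child_base, parent_base, size) tuples.
    Returns the parent address, or None if no entry covers child_addr.
    """
    for child_base, parent_base, size in ranges:
        if child_base <= child_addr < child_base + size:
            return parent_base + (child_addr - child_base)
    return None
```

With ranges = <0 a9000000 2000>, the child at offset 0 maps to 0xa9000000
and the child at offset 0x1000 maps to 0xa9001000; an offset of 0x2000 lies
outside the window and is not translated.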
|
||||
|
||||
Also, the system.mhs file defines bus attachments from the processor
|
||||
to the devices. The device tree structure should reflect the bus
|
||||
attachments. Again an example; this system.mhs fragment:
|
||||
|
||||
BEGIN ppc405_virtex4
|
||||
PARAMETER INSTANCE = ppc405_0
|
||||
PARAMETER HW_VER = 1.01.a
|
||||
BUS_INTERFACE DPLB = plb_v34_0
|
||||
BUS_INTERFACE IPLB = plb_v34_0
|
||||
END
|
||||
|
||||
BEGIN opb_intc
|
||||
PARAMETER INSTANCE = opb_intc_0
|
||||
PARAMETER HW_VER = 1.00.c
|
||||
PARAMETER C_BASEADDR = 0xD1000FC0
|
||||
PARAMETER C_HIGHADDR = 0xD1000FDF
|
||||
BUS_INTERFACE SOPB = opb_v20_0
|
||||
END
|
||||
|
||||
BEGIN opb_uart16550
|
||||
PARAMETER INSTANCE = opb_uart16550_0
|
||||
PARAMETER HW_VER = 1.00.d
|
||||
PARAMETER C_BASEADDR = 0xa0000000
|
||||
PARAMETER C_HIGHADDR = 0xa0001FFF
|
||||
BUS_INTERFACE SOPB = opb_v20_0
|
||||
END
|
||||
|
||||
BEGIN plb_v34
|
||||
PARAMETER INSTANCE = plb_v34_0
|
||||
PARAMETER HW_VER = 1.02.a
|
||||
END
|
||||
|
||||
BEGIN plb_bram_if_cntlr
|
||||
PARAMETER INSTANCE = plb_bram_if_cntlr_0
|
||||
PARAMETER HW_VER = 1.00.b
|
||||
PARAMETER C_BASEADDR = 0xFFFF0000
|
||||
PARAMETER C_HIGHADDR = 0xFFFFFFFF
|
||||
BUS_INTERFACE SPLB = plb_v34_0
|
||||
END
|
||||
|
||||
BEGIN plb2opb_bridge
|
||||
PARAMETER INSTANCE = plb2opb_bridge_0
|
||||
PARAMETER HW_VER = 1.01.a
|
||||
PARAMETER C_RNG0_BASEADDR = 0x20000000
|
||||
PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF
|
||||
PARAMETER C_RNG1_BASEADDR = 0x60000000
|
||||
PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF
|
||||
PARAMETER C_RNG2_BASEADDR = 0x80000000
|
||||
PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF
|
||||
PARAMETER C_RNG3_BASEADDR = 0xC0000000
|
||||
PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF
|
||||
BUS_INTERFACE SPLB = plb_v34_0
|
||||
BUS_INTERFACE MOPB = opb_v20_0
|
||||
END
|
||||
|
||||
Gives this device tree (some properties removed for clarity):
|
||||
|
||||
plb@0 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <1>;
|
||||
compatible = "xlnx,plb-v34-1.02.a";
|
||||
device_type = "ibm,plb";
|
||||
ranges; // 1:1 translation
|
||||
|
||||
plb_bram_if_cntlr_0: bram@ffff0000 {
|
||||
reg = <ffff0000 10000>;
|
||||
};
|
||||
|
||||
opb@20000000 {
|
||||
#address-cells = <1>;
|
||||
#size-cells = <1>;
|
||||
ranges = <20000000 20000000 20000000
|
||||
60000000 60000000 20000000
|
||||
80000000 80000000 40000000
|
||||
c0000000 c0000000 20000000>;
|
||||
|
||||
opb_uart16550_0: serial@a0000000 {
|
||||
reg = <a0000000 2000>;
|
||||
};
|
||||
|
||||
opb_intc_0: interrupt-controller@d1000fc0 {
|
||||
reg = <d1000fc0 20>;
|
||||
};
|
||||
};
|
||||
};
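The ranges entries of the opb node follow directly from the
C_RNGn_BASEADDR and C_RNGn_HIGHADDR pairs of the bridge. A minimal Python
sketch of that calculation is given below; it assumes a 1:1 mapping between
child and parent addresses, as in the tree above, and the function names
are hypothetical.

```python
def bridge_ranges(windows):
    """Turn (baseaddr, highaddr) windows into 1:1 'ranges' entries.

    Each entry is (child_base, parent_base, size), where the size is
    highaddr - baseaddr + 1.
    """
    return [(base, base, high - base + 1) for base, high in windows]


def format_entry(entry):
    """Format one entry as bare hex cells, as used in this document."""
    return "%x %x %x" % entry
```

For the window 0x80000000..0xBFFFFFFF this gives the cells
"80000000 80000000 40000000".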
|
||||
|
||||
That covers the general approach to binding Xilinx IP cores into the
|
||||
device tree. The following are bindings for specific devices:
|
||||
|
||||
i) Xilinx ML300 Framebuffer
|
||||
|
||||
Simple framebuffer device from the ML300 reference design (also on the
|
||||
ML403 reference design as well as others).
|
||||
|
||||
Optional properties:
|
||||
- resolution = <xres yres> : pixel resolution of framebuffer. Some
|
||||
implementations use a different resolution.
|
||||
Default is <d#640 d#480>
|
||||
- virt-resolution = <xvirt yvirt> : Size of framebuffer in memory.
|
||||
Default is <d#1024 d#480>.
|
||||
- rotate-display (empty) : rotate display 180 degrees.
|
||||
|
||||
ii) Xilinx SystemACE
|
||||
|
||||
The Xilinx SystemACE device is used to program FPGAs from an FPGA
|
||||
bitstream stored on a CF card. It can also be used as a generic CF
|
||||
interface device.
|
||||
|
||||
Optional properties:
|
||||
- 8-bit (empty) : Set this property for SystemACE in 8 bit mode
|
||||
|
||||
iii) Xilinx EMAC and Xilinx TEMAC
|
||||
|
||||
Xilinx Ethernet devices. In addition to general Xilinx properties
|
||||
listed above, nodes for these devices should include a phy-handle
|
||||
property, and may include other common network device properties
|
||||
like local-mac-address.
|
||||
|
||||
iv) Xilinx Uartlite
|
||||
|
||||
Xilinx uartlite devices are simple fixed speed serial ports.
|
||||
|
||||
Required properties:
|
||||
- current-speed : Baud rate of uartlite
|
||||
|
||||
v) Xilinx hwicap
|
||||
|
||||
Xilinx hwicap devices provide access to the configuration logic
|
||||
of the FPGA through the Internal Configuration Access Port
|
||||
(ICAP). The ICAP enables partial reconfiguration of the FPGA,
|
||||
readback of the configuration information, and some control over
|
||||
'warm boots' of the FPGA fabric.
|
||||
|
||||
Required properties:
|
||||
- xlnx,family : The family of the FPGA, necessary since the
|
||||
capabilities of the underlying ICAP hardware
|
||||
differ between different families. May be
|
||||
'virtex2p', 'virtex4', or 'virtex5'.
|
||||
|
||||
vi) Xilinx Uart 16550
|
||||
|
||||
Xilinx UART 16550 devices are very similar to the NS16550 but with
|
||||
different register spacing and an offset from the base address.
|
||||
|
||||
Required properties:
|
||||
- clock-frequency : Frequency of the clock input
|
||||
- reg-offset : A value of 3 is required
|
||||
- reg-shift : A value of 2 is required
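To show what the two values mean in practice, the Python sketch below
computes the bus address of a 16550 register. It assumes the conventional
reading of these properties for 16550-style ports: reg-offset is added to
the base address once, and the register index is shifted left by
reg-shift. The function name is hypothetical.

```python
def uart_register_address(base, index, reg_offset=3, reg_shift=2):
    """Address of 16550 register number 'index' for a given base address."""
    return base + reg_offset + (index << reg_shift)
```

With a base of 0xa0000000, register 0 is then found at 0xa0000003 and
register 5 at 0xa0000017, i.e. one byte-wide register in each 32-bit word.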
|
||||
|
||||
|
172
Documentation/pps/pps.txt
Normal file
@ -0,0 +1,172 @@
|
||||
|
||||
PPS - Pulse Per Second
|
||||
----------------------
|
||||
|
||||
(C) Copyright 2007 Rodolfo Giometti <giometti@enneenne.com>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
|
||||
|
||||
Overview
|
||||
--------
|
||||
|
||||
LinuxPPS provides a programming interface (API) to define several PPS
sources in the system.
|
||||
|
||||
PPS means "pulse per second" and a PPS source is just a device which
|
||||
provides a high precision signal each second so that an application
|
||||
can use it to adjust system clock time.
|
||||
|
||||
A PPS source can be connected to a serial port (usually to the Data
|
||||
Carrier Detect pin) or to a parallel port (ACK-pin) or to a special
|
||||
CPU's GPIOs (this is the common case in embedded systems) but in each
|
||||
case when a new pulse arrives the system must apply a timestamp to it
|
||||
and record it for userland.
|
||||
|
||||
A common use is to combine NTPD as the userland program with a GPS
receiver as the PPS source, to obtain wallclock time with
sub-millisecond synchronisation to UTC.
|
||||
|
||||
|
||||
RFC considerations
|
||||
------------------
|
||||
|
||||
While implementing a PPS API as RFC 2783 defines and using an embedded
|
||||
CPU GPIO-Pin as physical link to the signal, I encountered a deeper
|
||||
problem:
|
||||
|
||||
At startup the API needs a file descriptor as argument for the function
|
||||
time_pps_create().
|
||||
|
||||
This implies that the source has a /dev/... entry. This assumption is
|
||||
ok for the serial and parallel port, where you can do something
|
||||
useful besides(!) the gathering of timestamps as it is the central
|
||||
task for a PPS-API. But this assumption does not work for a single
|
||||
purpose GPIO line. In this case even basic file-related functionality
|
||||
(like read() and write()) makes no sense at all and should not be a
|
||||
precondition for the use of a PPS-API.
|
||||
|
||||
The problem can be simply solved if you consider that a PPS source is
|
||||
not always connected with a GPS data source.
|
||||
|
||||
So your programs should check if the GPS data source (the serial port
|
||||
for instance) is a PPS source too, and if not they should provide the
|
||||
possibility to open another device as PPS source.
|
||||
|
||||
In LinuxPPS the PPS sources are simply char devices usually mapped
|
||||
into files /dev/pps0, /dev/pps1, etc.
|
||||
|
||||
|
||||
Coding example
|
||||
--------------
|
||||
|
||||
To register a PPS source into the kernel you should define a struct
|
||||
pps_source_info as follows:
|
||||
|
||||
static struct pps_source_info pps_ktimer_info = {
|
||||
.name = "ktimer",
|
||||
.path = "",
|
||||
.mode = PPS_CAPTUREASSERT | PPS_OFFSETASSERT | \
|
||||
PPS_ECHOASSERT | \
|
||||
PPS_CANWAIT | PPS_TSFMT_TSPEC,
|
||||
.echo = pps_ktimer_echo,
|
||||
.owner = THIS_MODULE,
|
||||
};
|
||||
|
||||
and then call the function pps_register_source() in your
|
||||
initialization routine as follows:
|
||||
|
||||
source = pps_register_source(&pps_ktimer_info,
|
||||
PPS_CAPTUREASSERT | PPS_OFFSETASSERT);
|
||||
|
||||
The pps_register_source() prototype is:
|
||||
|
||||
int pps_register_source(struct pps_source_info *info, int default_params)
|
||||
|
||||
where "info" is a pointer to a structure that describes a particular
|
||||
PPS source, "default_params" tells the system what the initial default
|
||||
parameters for the device should be (it is obvious that these parameters
|
||||
must be a subset of ones defined in the struct
|
||||
pps_source_info which describe the capabilities of the driver).
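The subset requirement is a plain bitmask test. The Python sketch below
illustrates it outside the kernel; the numeric flag values are assumptions
modelled on the usual PPS header definitions and should be checked against
the headers actually in use, and params_allowed() is a hypothetical name.

```python
# Assumed flag values, for illustration only.
PPS_CAPTUREASSERT = 0x01
PPS_CAPTURECLEAR = 0x02
PPS_OFFSETASSERT = 0x10
PPS_ECHOASSERT = 0x40
PPS_CANWAIT = 0x100
PPS_TSFMT_TSPEC = 0x1000


def params_allowed(mode, default_params):
    """True if default_params only uses bits that are set in mode."""
    return (default_params & ~mode) == 0


# Capabilities of the ktimer example source shown above.
KTIMER_MODE = (PPS_CAPTUREASSERT | PPS_OFFSETASSERT | PPS_ECHOASSERT |
               PPS_CANWAIT | PPS_TSFMT_TSPEC)
```

Under these assumptions, PPS_CAPTUREASSERT | PPS_OFFSETASSERT is an
acceptable default for the ktimer source, while PPS_CAPTURECLEAR is not,
because the source does not advertise it.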
|
||||
|
||||
Once you have registered a new PPS source into the system you can
|
||||
signal an assert event (for example in the interrupt handler routine)
|
||||
just using:
|
||||
|
||||
pps_event(source, &ts, PPS_CAPTUREASSERT, ptr)
|
||||
|
||||
where "ts" is the event's timestamp.
|
||||
|
||||
The same function may also run the defined echo function
|
||||
(pps_ktimer_echo(), passing to it the "ptr" pointer) if the user
|
||||
asked for that.
|
||||
|
||||
Please see the file drivers/pps/clients/ktimer.c for example code.
|
||||
|
||||
|
||||
SYSFS support
|
||||
-------------
|
||||
|
||||
If the SYSFS filesystem is enabled in the kernel it provides a new class:
|
||||
|
||||
$ ls /sys/class/pps/
|
||||
pps0/ pps1/ pps2/
|
||||
|
||||
Every directory is the ID of a PPS source defined in the system and
|
||||
inside you find several files:
|
||||
|
||||
$ ls /sys/class/pps/pps0/
|
||||
assert clear echo mode name path subsystem@ uevent
|
||||
|
||||
Inside each "assert" and "clear" file you can find the timestamp and a
|
||||
sequence number:
|
||||
|
||||
$ cat /sys/class/pps/pps0/assert
|
||||
1170026870.983207967#8
|
||||
|
||||
The value before the "#" is the timestamp in seconds; the value after it is the
|
||||
sequence number. Other files are:
|
||||
|
||||
* echo: reports if the PPS source has an echo function or not;
|
||||
|
||||
* mode: reports available PPS functioning modes;
|
||||
|
||||
* name: reports the PPS source's name;
|
||||
|
||||
* path: reports the PPS source's device path, that is the device the
|
||||
PPS source is connected to (if it exists).
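The contents of the "assert" and "clear" files described above are easy to
parse from userland. The following Python sketch assumes the
"seconds.nanoseconds#sequence" layout shown in the example, with a
nine-digit fractional part; parse_pps_event() is a hypothetical helper
name.

```python
def parse_pps_event(text):
    """Parse 'seconds.nanoseconds#sequence' into (sec, nsec, seq)."""
    stamp, seq = text.strip().split("#", 1)
    sec, nsec = stamp.split(".", 1)
    return int(sec), int(nsec), int(seq)


def read_pps_event(path="/sys/class/pps/pps0/assert"):
    """Read and parse one sysfs event file (the path is an example)."""
    with open(path) as f:
        return parse_pps_event(f.read())
```

Keeping seconds and nanoseconds as separate integers avoids the rounding
that a conversion to a floating point number would introduce.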
|
||||
|
||||
|
||||
Testing the PPS support
|
||||
-----------------------
|
||||
|
||||
In order to test the PPS support even without specific hardware you can use
|
||||
the ktimer driver (see the client subsection in the PPS configuration menu)
|
||||
and the userland tools provided in the Documentation/pps/ directory.
|
||||
|
||||
Once you have enabled the compilation of ktimer just modprobe it (if
|
||||
not statically compiled):
|
||||
|
||||
# modprobe ktimer
|
||||
|
||||
and then run ppstest as follows:
|
||||
|
||||
$ ./ppstest /dev/pps1
|
||||
trying PPS source "/dev/pps1"
|
||||
found PPS source "/dev/pps1"
|
||||
ok, found 1 source(s), now start fetching data...
|
||||
source 0 - assert 1186592699.388832443, sequence: 364 - clear 0.000000000, sequence: 0
|
||||
source 0 - assert 1186592700.388931295, sequence: 365 - clear 0.000000000, sequence: 0
|
||||
source 0 - assert 1186592701.389032765, sequence: 366 - clear 0.000000000, sequence: 0
|
||||
|
||||
Please note that to compile userland programs you need the file timepps.h
|
||||
(see Documentation/pps/).
|
@ -1,575 +1,139 @@
|
||||
rfkill - RF switch subsystem support
|
||||
====================================
|
||||
rfkill - RF kill switch support
|
||||
===============================
|
||||
|
||||
1 Introduction
|
||||
2 Implementation details
|
||||
3 Kernel driver guidelines
|
||||
3.1 wireless device drivers
|
||||
3.2 platform/switch drivers
|
||||
3.3 input device drivers
|
||||
4 Kernel API
|
||||
5 Userspace support
|
||||
1. Introduction
|
||||
2. Implementation details
|
||||
3. Kernel API
|
||||
4. Userspace support
|
||||
|
||||
|
||||
1. Introduction:
|
||||
1. Introduction
|
||||
|
||||
The rfkill switch subsystem exists to add a generic interface to circuitry that
|
||||
can enable or disable the signal output of a wireless *transmitter* of any
|
||||
type. By far, the most common use is to disable radio-frequency transmitters.
|
||||
The rfkill subsystem provides a generic interface to disabling any radio
|
||||
transmitter in the system. When a transmitter is blocked, it shall not
|
||||
radiate any power.
|
||||
|
||||
Note that disabling the signal output means that the transmitter is to be
|
||||
made to not emit any energy when "blocked". rfkill is not about blocking data
|
||||
transmissions, it is about blocking energy emission.
|
||||
The subsystem also provides the ability to react to button presses and
|
||||
disable all transmitters of a certain type (or all). This is intended for
|
||||
situations where transmitters need to be turned off, for example on
|
||||
aircraft.
|
||||
|
||||
The rfkill subsystem offers support for keys and switches often found on
|
||||
laptops to enable wireless devices like WiFi and Bluetooth, so that these keys
|
||||
and switches actually perform an action in all wireless devices of a given type
|
||||
attached to the system.
|
||||
The rfkill subsystem has a concept of "hard" and "soft" block, which
|
||||
differ little in their meaning (block == transmitters off) but rather in
|
||||
whether they can be changed or not:
|
||||
- hard block: read-only radio block that cannot be overridden by software
|
||||
- soft block: writable radio block (need not be readable) that is set by
|
||||
the system software.
|
||||
|
||||
The buttons to enable and disable the wireless transmitters are important in
|
||||
situations where the user is for example using his laptop on a location where
|
||||
radio-frequency transmitters _must_ be disabled (e.g. airplanes).
|
||||
|
||||
Because of this requirement, userspace support for the keys should not be made
|
||||
mandatory. Because userspace might want to perform some additional smarter
|
||||
tasks when the key is pressed, rfkill provides userspace the possibility to
|
||||
take over the task to handle the key events.
|
||||
2. Implementation details
|
||||
|
||||
The rfkill subsystem is composed of three main components:
|
||||
* the rfkill core,
|
||||
* the deprecated rfkill-input module (an input layer handler, being
|
||||
replaced by userspace policy code) and
|
||||
* the rfkill drivers.
|
||||
|
||||
The rfkill core provides API for kernel drivers to register their radio
|
||||
transmitter with the kernel, methods for turning it on and off, and letting
|
||||
the system know about hardware-disabled states that may be implemented on
|
||||
the device.
|
||||
|
||||
The rfkill core code also notifies userspace of state changes, and provides
|
||||
ways for userspace to query the current states. See the "Userspace support"
|
||||
section below.
|
||||
|
||||
When the device is hard-blocked (either by a call to rfkill_set_hw_state()
|
||||
or from query_hw_block) set_block() will be invoked for additional software
|
||||
block, but drivers can ignore the method call since they can use the return
|
||||
value of the function rfkill_set_hw_state() to sync the software state
|
||||
instead of keeping track of calls to set_block(). In fact, drivers should
|
||||
use the return value of rfkill_set_hw_state() unless the hardware actually
|
||||
keeps track of soft and hard block separately.
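The bookkeeping described above can be pictured with a toy model. The
Python class below is not kernel code; it assumes that the value returned
by rfkill_set_hw_state() tells the driver whether the transmitter should
be blocked for any reason (hard or soft), which is the reading suggested
by the paragraph above. The class and method names are hypothetical.

```python
class RfkillModel:
    """Toy model of hard/soft block tracking (illustration only)."""

    def __init__(self):
        self.soft = False
        self.hard = False

    def set_block(self, blocked):
        """Software request to block or unblock the transmitter."""
        self.soft = blocked

    def set_hw_state(self, blocked):
        """Record the hard block state and return the combined state."""
        self.hard = blocked
        return self.hard or self.soft
```

A driver following this pattern would program the hardware from the
returned value rather than remembering earlier set_block() calls itself.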
|
||||
|
||||
|
||||
3. Kernel API
|
||||
|
||||
|
||||
Drivers for radio transmitters normally implement an rfkill driver.
|
||||
|
||||
Platform drivers might implement input devices if the rfkill button is just
|
||||
that, a button. If that button influences the hardware then you need to
|
||||
implement an rfkill driver instead. This also applies if the platform provides
|
||||
a way to turn on/off the transmitter(s).
|
||||
|
||||
For some platforms, it is possible that the hardware state changes during
|
||||
suspend/hibernation, in which case it will be necessary to update the rfkill
|
||||
core with the current state at resume time.
|
||||
|
||||
To create an rfkill driver, the driver's Kconfig needs to have
|
||||
|
||||
depends on RFKILL || !RFKILL
|
||||
|
||||
to ensure the driver cannot be built-in when rfkill is modular. The !RFKILL
|
||||
case allows the driver to be built when rfkill is not configured, in which
|
||||
case all rfkill API can still be used but will be provided by static inlines
|
||||
which compile to almost nothing.
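Why this expression has the desired effect can be worked through with the
usual Kconfig tristate rules, assumed here to be: n < m < y, "||" yields
the larger operand, and "!" maps y to n, m to m and n to y. The Python
below is only a sketch of that arithmetic, with hypothetical names.

```python
N, M, Y = 0, 1, 2  # tristate values n, m, y


def tri_not(a):
    """Kconfig '!': y becomes n, m stays m, n becomes y."""
    return 2 - a


def tri_or(a, b):
    """Kconfig '||': the larger of the two operands."""
    return max(a, b)


def driver_limit(rfkill):
    """Upper bound for the driver given by 'RFKILL || !RFKILL'."""
    return tri_or(rfkill, tri_not(rfkill))
```

With RFKILL=y or RFKILL=n the expression evaluates to y, so the driver may
be built in or modular; with RFKILL=m it evaluates to m, so the driver can
be at most a module.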
|
||||
|
||||
Calling rfkill_set_hw_state() when a state change happens is required from
|
||||
rfkill drivers that control devices that can be hard-blocked unless they also
|
||||
assign the poll_hw_block() callback (then the rfkill core will poll the
|
||||
device). Don't do this unless you cannot get the event in any other way.
|
||||
|
||||
|
||||
|
||||
4. Userspace support
|
||||
|
||||
The recommended userspace interface to use is /dev/rfkill, which is a misc
|
||||
character device that allows userspace to obtain and set the state of rfkill
|
||||
devices and sets of devices. It also notifies userspace about device addition
|
||||
and removal. The API is a simple read/write API that is defined in
|
||||
linux/rfkill.h, with one ioctl that allows turning off the deprecated input
|
||||
handler in the kernel for the transition period.
|
||||
|
||||
Except for the one ioctl, communication with the kernel is done via read()
|
||||
and write() of instances of 'struct rfkill_event'. In this structure, the
|
||||
soft and hard block are properly separated (unlike sysfs, see below) and
|
||||
userspace is able to get a consistent snapshot of all rfkill devices in the
|
||||
system. Also, it is possible to switch all rfkill drivers (or all drivers of
|
||||
a specified type) into a state which also updates the default state for
|
||||
hotplugged devices.
|
||||
|
||||
After an application opens /dev/rfkill, it can read the current state of
|
||||
all devices, and afterwards can poll the descriptor for hotplug or state
|
||||
change events.
|
||||
|
||||
Applications must ignore operations (the "op" field) they do not handle;
|
||||
this allows the API to be extended in the future.
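A userspace reader of /dev/rfkill might look roughly like the Python
sketch below. The layout of 'struct rfkill_event' (a 32-bit index followed
by four bytes: type, op, soft, hard) and the operation numbers are
assumptions that must be checked against linux/rfkill.h; the function
names are hypothetical.

```python
import os
import struct

# Assumed layout: u32 idx, u8 type, u8 op, u8 soft, u8 hard (packed).
RFKILL_EVENT = struct.Struct("=IBBBB")

# Assumed operation numbers.
RFKILL_OP_ADD, RFKILL_OP_DEL, RFKILL_OP_CHANGE = 0, 1, 2
KNOWN_OPS = (RFKILL_OP_ADD, RFKILL_OP_DEL, RFKILL_OP_CHANGE)


def decode_event(data):
    """Decode one event; return None for short reads or unknown ops."""
    if len(data) < RFKILL_EVENT.size:
        return None
    idx, rtype, op, soft, hard = RFKILL_EVENT.unpack(data[:RFKILL_EVENT.size])
    if op not in KNOWN_OPS:
        return None  # ignore operations this program does not handle
    return {"idx": idx, "type": rtype, "op": op,
            "soft": bool(soft), "hard": bool(hard)}


def snapshot(path="/dev/rfkill"):
    """Read the initial state of all devices without blocking."""
    events = []
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        while True:
            try:
                data = os.read(fd, RFKILL_EVENT.size)
            except BlockingIOError:
                break  # no more queued events
            if len(data) < RFKILL_EVENT.size:
                break
            event = decode_event(data)
            if event is not None:
                events.append(event)
    finally:
        os.close(fd)
    return events
```

Returning None for unknown operations is how this sketch follows the rule
that unhandled "op" values must be ignored. A long-running program would
keep the descriptor open and poll it for later events instead of closing
it after the first snapshot.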
|
||||
|
||||
===============================================================================
|
||||
2: Implementation details
|
||||
|
||||
The rfkill subsystem is composed of various components: the rfkill class, the
|
||||
rfkill-input module (an input layer handler), and some specific input layer
|
||||
events.
|
||||
|
||||
The rfkill class provides kernel drivers with an interface that allows them to
|
||||
know when they should enable or disable a wireless network device transmitter.
|
||||
This is enabled by the CONFIG_RFKILL Kconfig option.
|
||||
|
||||
The rfkill class support makes sure userspace will be notified of all state
|
||||
changes on rfkill devices through uevents. It provides a notification chain
|
||||
for interested parties in the kernel to also get notified of rfkill state
|
||||
changes in other drivers. It creates several sysfs entries which can be used
|
||||
by userspace. See section "Userspace support".
|
||||
|
||||
The rfkill-input module provides the kernel with the ability to implement a
|
||||
basic response when the user presses a key or button (or toggles a switch)
|
||||
related to rfkill functionality. It is an in-kernel implementation of default
|
||||
policy of reacting to rfkill-related input events and neither mandatory nor
|
||||
required for wireless drivers to operate. It is enabled by the
|
||||
CONFIG_RFKILL_INPUT Kconfig option.
|
||||
|
||||
rfkill-input is a rfkill-related events input layer handler. This handler will
|
||||
listen to all rfkill key events and will change the rfkill state of the
|
||||
wireless devices accordingly. With this option enabled userspace could either
|
||||
do nothing or simply perform monitoring tasks.
|
||||
|
||||
The rfkill-input module also provides EPO (emergency power-off) functionality
|
||||
for all wireless transmitters. This function cannot be overridden, and it is
|
||||
always active. rfkill EPO is related to *_RFKILL_ALL input layer events.
|
||||
|
||||
|
||||
Important terms for the rfkill subsystem:
|
||||
|
||||
In order to avoid confusion, we avoid the term "switch" in rfkill when it is
|
||||
referring to an electronic control circuit that enables or disables a
|
||||
transmitter. We reserve it for the physical device a human manipulates
|
||||
(which is an input device, by the way):
|
||||
|
||||
rfkill switch:
|
||||
|
||||
A physical device a human manipulates. Its state can be perceived by
|
||||
the kernel either directly (through a GPIO pin, ACPI GPE) or by its
|
||||
effect on a rfkill line of a wireless device.
|
||||
|
||||
rfkill controller:
|
||||
|
||||
A hardware circuit that controls the state of a rfkill line, which a
|
||||
kernel driver can interact with *to modify* that state (i.e. it has
|
||||
either write-only or read/write access).
|
||||
|
||||
rfkill line:
|
||||
|
||||
An input channel (hardware or software) of a wireless device, which
|
||||
causes a wireless transmitter to stop emitting energy (BLOCK) when it
|
||||
is active. Point of view is extremely important here: rfkill lines are
|
||||
always seen from the PoV of a wireless device (and its driver).
|
||||
|
||||
soft rfkill line/software rfkill line:
|
||||
|
||||
A rfkill line the wireless device driver can directly change the state
|
||||
of. Related to rfkill_state RFKILL_STATE_SOFT_BLOCKED.
|
||||
|
||||
hard rfkill line/hardware rfkill line:
|
||||
|
||||
A rfkill line that works fully in hardware or firmware, and that cannot
|
||||
be overridden by the kernel driver. The hardware device or the
|
||||
firmware just exports its status to the driver, but it is read-only.
|
||||
Related to rfkill_state RFKILL_STATE_HARD_BLOCKED.
|
||||
|
||||
The enum rfkill_state describes the rfkill state of a transmitter:
|
||||
|
||||
When a rfkill line or rfkill controller is in the RFKILL_STATE_UNBLOCKED state,
|
||||
the wireless transmitter (radio TX circuit for example) is *enabled*. When
|
||||
it is in the RFKILL_STATE_SOFT_BLOCKED or RFKILL_STATE_HARD_BLOCKED, the
|
||||
wireless transmitter is to be *blocked* from operating.
|
||||
|
||||
RFKILL_STATE_SOFT_BLOCKED indicates that a call to toggle_radio() can change
|
||||
that state. RFKILL_STATE_HARD_BLOCKED indicates that a call to toggle_radio()
|
||||
will not be able to change the state and will return with a suitable error if
|
||||
attempts are made to set the state to RFKILL_STATE_UNBLOCKED.
|
||||
|
||||
RFKILL_STATE_HARD_BLOCKED is used by drivers to signal that the device is
|
||||
locked in the BLOCKED state by a hardwire rfkill line (typically an input pin
|
||||
that, when active, forces the transmitter to be disabled) which the driver
|
||||
CANNOT override.
|
||||
|
||||
Full rfkill functionality requires two different subsystems to cooperate: the
|
||||
input layer and the rfkill class. The input layer issues *commands* to the
|
||||
entire system requesting that devices registered to the rfkill class change
|
||||
state. The way this interaction happens is not complex, but it is not obvious
|
||||
either:
|
||||
|
||||
Kernel Input layer:
|
||||
|
||||
* Generates KEY_WWAN, KEY_WLAN, KEY_BLUETOOTH, SW_RFKILL_ALL, and
|
||||
other such events when the user presses certain keys, buttons, or
|
||||
toggles certain physical switches.
|
||||
|
||||
THE INPUT LAYER IS NEVER USED TO PROPAGATE STATUS, NOTIFICATIONS OR THE
|
||||
KIND OF STUFF AN ON-SCREEN-DISPLAY APPLICATION WOULD REPORT. It is
|
||||
used to issue *commands* for the system to change behaviour, and these
|
||||
commands may or may not be carried out by some kernel driver or
|
||||
userspace application. It follows that doing user feedback based only
|
||||
on input events is broken, as there is no guarantee that an input event
|
||||
will be acted upon.
|
||||
|
||||
Most wireless communication device drivers implementing rfkill
|
||||
functionality MUST NOT generate these events, and have no reason to
|
||||
register themselves with the input layer. Doing otherwise is a common
|
||||
misconception. There is an API to propagate rfkill status change
|
||||
information, and it is NOT the input layer.
|
||||
|
||||
rfkill class:
|
||||
|
||||
* Calls a hook in a driver to effectively change the wireless
|
||||
transmitter state;
|
||||
* Keeps track of the wireless transmitter state (with help from
|
||||
the driver);
|
||||
* Generates userspace notifications (uevents) and a call to a
|
||||
notification chain (kernel) when there is a wireless transmitter
|
||||
state change;
|
||||
* Connects a wireless communications driver with the common rfkill
|
||||
control system, which, for example, allows actions such as
|
||||
"switch all bluetooth devices offline" to be carried out by
|
||||
userspace or by rfkill-input.
|
||||
|
||||
THE RFKILL CLASS NEVER ISSUES INPUT EVENTS. THE RFKILL CLASS DOES
|
||||
NOT LISTEN TO INPUT EVENTS. NO DRIVER USING THE RFKILL CLASS SHALL
|
||||
EVER LISTEN TO, OR ACT ON RFKILL INPUT EVENTS. Doing otherwise is
|
||||
a layering violation.
|
||||
|
||||
Most wireless data communication drivers in the kernel have just to
|
||||
implement the rfkill class API to work properly. Interfacing to the
|
||||
input layer is not often required (and is very often a *bug*) on
|
||||
wireless drivers.
|
||||
|
||||
Platform drivers often have to attach to the input layer to *issue*
|
||||
(but never to listen to) rfkill events for rfkill switches, and also to
|
||||
the rfkill class to export a control interface for the platform rfkill
|
||||
controllers to the rfkill subsystem. This does NOT mean the rfkill
|
||||
switch is attached to a rfkill class (doing so is almost always wrong).
|
||||
It just means the same kernel module is the driver for different
|
||||
devices (rfkill switches and rfkill controllers).
|
||||
|
||||
|
||||
Userspace input handlers (uevents) or kernel input handlers (rfkill-input):
|
||||
|
||||
* Implements the policy of what should happen when one of the input
|
||||
layer events related to rfkill operation is received.
|
||||
* Uses the sysfs interface (userspace) or private rfkill API calls
|
||||
to tell the devices registered with the rfkill class to change
|
||||
their state (i.e. translates the input layer event into real
|
||||
action).
|
||||
|
||||
* rfkill-input implements EPO by handling EV_SW SW_RFKILL_ALL 0
|
||||
(power off all transmitters) in a special way: it ignores any
|
||||
overrides and local state cache and forces all transmitters to the
|
||||
RFKILL_STATE_SOFT_BLOCKED state (including those which are already
|
||||
supposed to be BLOCKED).
|
||||
* rfkill EPO will remain active until rfkill-input receives an
|
||||
EV_SW SW_RFKILL_ALL 1 event. While the EPO is active, transmitters
|
||||
are locked in the blocked state (rfkill will refuse to unblock them).
|
||||
* rfkill-input implements different policies that the user can
|
||||
select for handling EV_SW SW_RFKILL_ALL 1. It will unlock rfkill,
|
||||
and either do nothing (leave transmitters blocked, but now unlocked),
|
||||
restore the transmitters to their state before the EPO, or unblock
|
||||
them all.
|
||||
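The EPO behaviour described in the bullets above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the type and function names (epo_model, epo_engage, epo_release, tx_request) are hypothetical and are not rfkill-input symbols, and hard-blocked transmitters are left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the EPO rules; not kernel code. */
enum tx_state { TX_SOFT_BLOCKED = 0, TX_UNBLOCKED = 1 };

/* The three user-selectable policies for EV_SW SW_RFKILL_ALL 1. */
enum epo_release_policy { EPO_KEEP_BLOCKED, EPO_RESTORE, EPO_UNBLOCK_ALL };

#define MAX_TX 8

struct epo_model {
	enum tx_state state[MAX_TX];	/* current state per transmitter */
	enum tx_state saved[MAX_TX];	/* state before the EPO was engaged */
	size_t count;			/* number of transmitters in use */
	bool epo_active;		/* true while transmitters are locked */
};

/* EV_SW SW_RFKILL_ALL 0: force everything to soft-blocked and lock. */
void epo_engage(struct epo_model *m)
{
	for (size_t i = 0; i < m->count; i++) {
		if (!m->epo_active)
			m->saved[i] = m->state[i];	/* remember pre-EPO state once */
		m->state[i] = TX_SOFT_BLOCKED;	/* forced even if already blocked */
	}
	m->epo_active = true;
}

/* EV_SW SW_RFKILL_ALL 1: unlock, then apply the selected policy. */
void epo_release(struct epo_model *m, enum epo_release_policy policy)
{
	m->epo_active = false;
	for (size_t i = 0; i < m->count; i++) {
		if (policy == EPO_RESTORE)
			m->state[i] = m->saved[i];
		else if (policy == EPO_UNBLOCK_ALL)
			m->state[i] = TX_UNBLOCKED;
		/* EPO_KEEP_BLOCKED: leave blocked, but now unlocked */
	}
}

/* A state change request; unblocking is refused while the EPO is active. */
bool tx_request(struct epo_model *m, size_t idx, enum tx_state want)
{
	if (idx >= m->count)
		return false;
	if (m->epo_active && want == TX_UNBLOCKED)
		return false;
	m->state[idx] = want;
	return true;
}
```

In this model, a caller that engages the EPO and then calls tx_request() with TX_UNBLOCKED gets false back until epo_release() has run.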
|
||||
Userspace uevent handler or kernel platform-specific drivers hooked to the
|
||||
rfkill notifier chain:
|
||||
|
||||
* Taps into the rfkill notifier chain or to KOBJ_CHANGE uevents,
|
||||
in order to know when a device that is registered with the rfkill
|
||||
class changes state;
|
||||
* Issues feedback notifications to the user;
|
||||
* In the rare platforms where this is required, synthesizes an input
|
||||
event to command all *OTHER* rfkill devices to also change their
|
||||
state when a specific rfkill device changes state.
|
||||
|
||||
|
||||
===============================================================================
|
||||
3: Kernel driver guidelines
|
||||
|
||||
Remember: point-of-view is everything for a driver that connects to the rfkill
|
||||
subsystem. All the details below must be measured/perceived from the point of
|
||||
view of the specific driver being modified.
|
||||
|
||||
The first thing one needs to know is whether the driver should be talking to
|
||||
the rfkill class or to the input layer. In rare cases (platform drivers), it
|
||||
could happen that you need to do both, as platform drivers often handle a
|
||||
variety of devices in the same driver.
|
||||
|
||||
Do not mistake input devices for rfkill controllers. The only type of "rfkill
|
||||
switch" device that is to be registered with the rfkill class is one
|
||||
directly controlling the circuits that cause a wireless transmitter to stop
|
||||
working (or the software equivalent of them), i.e. what we call a rfkill
|
||||
controller. Every other kind of "rfkill switch" is just an input device and
|
||||
MUST NOT be registered with the rfkill class.
|
||||
|
||||
A driver should register a device with the rfkill class when ALL of the
|
||||
following conditions are met (they define a rfkill controller):
|
||||
|
||||
1. The device is/controls a data communications wireless transmitter;
|
||||
|
||||
2. The kernel can interact with the hardware/firmware to CHANGE the wireless
|
||||
transmitter state (block/unblock TX operation);
|
||||
|
||||
3. The transmitter can be made to not emit any energy when "blocked":
|
||||
rfkill is not about blocking data transmissions, it is about blocking
|
||||
energy emission;
|
||||
|
||||
A driver should register a device with the input subsystem to issue
|
||||
rfkill-related events (KEY_WLAN, KEY_BLUETOOTH, KEY_WWAN, KEY_WIMAX,
|
||||
SW_RFKILL_ALL, etc.) when ALL of the following conditions are met:
|
||||
|
||||
1. It is directly related to some physical device the user interacts with, to
|
||||
command the O.S./firmware/hardware to enable/disable a data communications
|
||||
wireless transmitter.
|
||||
|
||||
Examples of the physical device are: buttons, keys and switches the user
|
||||
will press/touch/slide/switch to enable or disable the wireless
|
||||
communication device.
|
||||
|
||||
2. It is NOT slaved to another device, i.e. there is no other device that
|
||||
issues rfkill-related input events in preference to this one.
|
||||
|
||||
Please refer to the corner cases and examples section for more details.
|
||||
|
||||
When in doubt, do not issue input events. For drivers that should generate
|
||||
input events in some platforms, but not in others (e.g. b43), the best solution
|
||||
is to NEVER generate input events in the first place. That work should be
|
||||
deferred to a platform-specific kernel module (which will know when to generate
|
||||
events through the rfkill notifier chain) or to userspace. This avoids the
|
||||
usual maintenance problems with DMI whitelisting.
|
||||
|
||||
|
||||
Corner cases and examples:
|
||||
====================================
|
||||
|
||||
1. If the device is an input device that, because of hardware or firmware,
|
||||
causes wireless transmitters to be blocked regardless of the kernel's will, it
|
||||
is still just an input device, and NOT to be registered with the rfkill class.
|
||||
|
||||
2. If the wireless transmitter switch control is read-only, it is an input
|
||||
device and not to be registered with the rfkill class (and maybe not to be made
|
||||
an input layer event source either, see below).
|
||||
|
||||
3. If there is some other device driver *closer* to the actual hardware the
|
||||
user interacted with (the button/switch/key) to issue an input event, THAT is
|
||||
the device driver that should be issuing input events.
|
||||
|
||||
E.g:
|
||||
[RFKILL slider switch] -- [GPIO hardware] -- [WLAN card rf-kill input]
|
||||
(platform driver) (wireless card driver)
|
||||
|
||||
The user is closer to the RFKILL slide switch platform driver, so the driver
|
||||
which must issue input events is the platform driver looking at the GPIO
|
||||
hardware, and NEVER the wireless card driver (which is just a slave). It is
|
||||
very likely that there are other leaves than just the WLAN card rf-kill input
|
||||
(e.g. a bluetooth card, etc)...
|
||||
|
||||
On the other hand, some embedded devices do this:
|
||||
|
||||
[RFKILL slider switch] -- [WLAN card rf-kill input]
|
||||
(wireless card driver)
|
||||
|
||||
In this situation, the wireless card driver *could* register itself as an input
|
||||
device and issue rf-kill related input events... but in order to AVOID the need
|
||||
for DMI whitelisting, the wireless card driver does NOT do it. Userspace (HAL)
|
||||
or a platform driver (that exists only on these embedded devices) will do the
|
||||
dirty job of issuing the input events.
|
||||
|
||||
|
||||
COMMON MISTAKES in kernel drivers, related to rfkill:
|
||||
====================================
|
||||
|
||||
1. NEVER confuse input device keys and buttons with input device switches.
|
||||
|
||||
1a. Switches are always set or reset. They report the current state
|
||||
(on position or off position).
|
||||
|
||||
1b. Keys and buttons are either in the pressed or not-pressed state, and
|
||||
that's it. A "button" that latches down when you press it, and
|
||||
unlatches when you press it again is in fact a switch as far as input
|
||||
devices go.
|
||||
|
||||
Add the SW_* events you need for switches, do NOT try to emulate a button using
|
||||
KEY_* events just because there is no such SW_* event yet. Do NOT try to use,
|
||||
for example, KEY_BLUETOOTH when you should be using SW_BLUETOOTH instead.
|
||||
|
||||
2. Input device switches (sources of EV_SW events) DO store their current state
|
||||
(so you *must* initialize it by issuing a gratuitous input layer event on
|
||||
driver start-up and also when resuming from sleep), and that state CAN be
|
||||
queried from userspace through IOCTLs. There is no sysfs interface for this,
|
||||
but that doesn't mean you should break things trying to hook it to the rfkill
|
||||
class to get a sysfs interface :-)
|
||||
|
||||
3. Do not issue *_RFKILL_ALL events by default, unless you are sure it is the
|
||||
correct event for your switch/button. These events are emergency power-off
|
||||
events when they are trying to turn the transmitters off. An example of an
|
||||
input device which SHOULD generate *_RFKILL_ALL events is the wireless-kill
|
||||
switch in a laptop which is NOT a hotkey, but a real sliding/rocker switch.
|
||||
An example of an input device which SHOULD NOT generate *_RFKILL_ALL events by
|
||||
default, is any sort of hot key that is type-specific (e.g. the one for WLAN).
|
||||
|
||||
|
||||
3.1 Guidelines for wireless device drivers
|
||||
------------------------------------------
|
||||
|
||||
(in this text, rfkill->foo means the foo field of struct rfkill).
|
||||
|
||||
1. Each independent transmitter in a wireless device (usually there is only one
|
||||
transmitter per device) should have a SINGLE rfkill class attached to it.
|
||||
|
||||
2. If the device does not have any sort of hardware assistance to allow the
|
||||
driver to rfkill the device, the driver should emulate it by taking all actions
|
||||
required to silence the transmitter.
|
||||
|
||||
3. If it is impossible to silence the transmitter (i.e. it still emits energy,
|
||||
even if it is just in brief pulses, when there is no data to transmit and there
|
||||
is no hardware support to turn it off) do NOT lie to the users. Do not attach
|
||||
it to a rfkill class. The rfkill subsystem does not deal with data
|
||||
transmission, it deals with energy emission. If the transmitter is emitting
|
||||
energy, it is not blocked in rfkill terms.
|
||||
|
||||
4. It doesn't matter if the device has multiple rfkill input lines affecting
|
||||
the same transmitter, their combined state is to be exported as a single state
|
||||
per transmitter (see rule 1).
|
||||
|
||||
This rule exists because users of the rfkill subsystem expect to get (and set,
|
||||
when possible) the overall transmitter rfkill state, not of a particular rfkill
|
||||
line.
|
||||
|
||||
5. The wireless device driver MUST NOT leave the transmitter enabled during
|
||||
suspend and hibernation unless:
|
||||
|
||||
5.1. The transmitter has to be enabled for some sort of functionality
|
||||
like wake-on-wireless-packet or autonomous packet forwarding in a mesh
|
||||
network, and that functionality is enabled for this suspend/hibernation
|
||||
cycle.
|
||||
|
||||
AND
|
||||
|
||||
5.2. The device was not on a user-requested BLOCKED state before
|
||||
the suspend (i.e. the driver must NOT unblock a device, not even
|
||||
to support wake-on-wireless-packet or remain in the mesh).
|
||||
|
||||
In other words, there is absolutely no allowed scenario where a driver can
|
||||
automatically take action to unblock a rfkill controller (obviously, this deals
|
||||
with scenarios where soft-blocking or both soft and hard blocking is happening.
|
||||
Scenarios where hardware rfkill lines are the only ones blocking the
|
||||
transmitter are outside of this rule, since the wireless device driver does not
|
||||
control its input hardware rfkill lines in the first place).
|
||||
|
||||
6. During resume, rfkill will try to restore its previous state.
|
||||
|
||||
7. After a rfkill class is suspended, it will *not* call rfkill->toggle_radio
|
||||
until it is resumed.
|
||||
|
||||
|
||||
Example of a WLAN wireless driver connected to the rfkill subsystem:
|
||||
--------------------------------------------------------------------
|
||||
|
||||
A certain WLAN card has one input pin that causes it to block the transmitter
|
||||
and makes the status of that input pin available (only for reading!) to the
|
||||
kernel driver. This is a hard rfkill input line (it cannot be overridden by
|
||||
the kernel driver).
|
||||
|
||||
The card also has one PCI register that, if manipulated by the driver, causes
|
||||
it to block the transmitter. This is a soft rfkill input line.
|
||||
|
||||
It also has thermal protection circuitry that shuts down its transmitter if
|
||||
the card overheats, and makes the status of that protection available (only for
|
||||
reading!) to the kernel driver. This is also a hard rfkill input line.
|
||||
|
||||
If any one of these rfkill lines is active, the transmitter is blocked by
|
||||
the hardware and forced offline.
|
||||
|
||||
The driver should allocate and attach to its struct device *ONE* instance of
|
||||
the rfkill class (there is only one transmitter).
|
||||
|
||||
It can implement the get_state() hook, and return RFKILL_STATE_HARD_BLOCKED if
|
||||
either one of its two hard rfkill input lines is active. If the two hard
|
||||
rfkill lines are inactive, it must return RFKILL_STATE_SOFT_BLOCKED if its soft
|
||||
rfkill input line is active. Only if none of the rfkill input lines are
|
||||
active, will it return RFKILL_STATE_UNBLOCKED.
|
||||
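The priority order described for get_state() can be sketched as a plain C function. This is a minimal userspace model, assuming a hypothetical struct example_card_lines that stands in for the card's three rfkill lines; the enum values are the ones this document lists for the sysfs "state" attribute, and the real hook signature is not reproduced here.

```c
#include <stdbool.h>

/* Numeric values as listed for the sysfs "state" attribute. */
enum rfkill_state {
	RFKILL_STATE_SOFT_BLOCKED = 0,
	RFKILL_STATE_UNBLOCKED    = 1,
	RFKILL_STATE_HARD_BLOCKED = 2,
};

/* Hypothetical snapshot of the example card's rfkill lines. */
struct example_card_lines {
	bool hard_pin;		/* read-only input pin (hard line) */
	bool thermal_trip;	/* thermal protection tripped (hard line) */
	bool soft_reg;		/* PCI register set by the driver (soft line) */
};

/* Combine all lines into the single per-transmitter state. */
enum rfkill_state example_card_state(const struct example_card_lines *l)
{
	/* Any active hard line wins over everything else. */
	if (l->hard_pin || l->thermal_trip)
		return RFKILL_STATE_HARD_BLOCKED;
	/* No hard line is active: report the soft line, if set. */
	if (l->soft_reg)
		return RFKILL_STATE_SOFT_BLOCKED;
	/* Only when no line is active is the transmitter unblocked. */
	return RFKILL_STATE_UNBLOCKED;
}
```

A driver would presumably feed the same computed value to rfkill_force_state() whenever one of the lines changes; that call is omitted because this sketch does not link against the kernel.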
|
||||
Since the device has a hardware rfkill line, it IS subject to state changes
|
||||
external to rfkill. Therefore, the driver must make sure that it calls
|
||||
rfkill_force_state() to keep the status always up-to-date, and it must do a
|
||||
rfkill_force_state() on resume from sleep.
|
||||
|
||||
Every time the driver gets a notification from the card that one of its rfkill
|
||||
lines changed state (polling might be needed on badly designed cards that don't
|
||||
generate interrupts for such events), it recomputes the rfkill state as per
|
||||
above, and calls rfkill_force_state() to update it.
|
||||
|
||||
The driver should implement the toggle_radio() hook, that:
|
||||
|
||||
1. Returns an error if one of the hardware rfkill lines is active, and the
|
||||
caller asked for RFKILL_STATE_UNBLOCKED.
|
||||
|
||||
2. Activates the soft rfkill line if the caller asked for state
|
||||
RFKILL_STATE_SOFT_BLOCKED. It should do this even if one of the hard rfkill
|
||||
lines is active, effectively double-blocking the transmitter.
|
||||
|
||||
3. Deactivates the soft rfkill line if none of the hardware rfkill lines are
|
||||
active and the caller asked for RFKILL_STATE_UNBLOCKED.
|
||||
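The three toggle_radio() rules above can be sketched as follows. This is a userspace model only: struct example_card and example_toggle_radio() are hypothetical names, the real hook signature is not reproduced, and the specific error codes (-EPERM, -EINVAL) are an assumption, since the text only says that an error is returned.

```c
#include <errno.h>
#include <stdbool.h>

/* Numeric values as listed for the sysfs "state" attribute. */
enum rfkill_state {
	RFKILL_STATE_SOFT_BLOCKED = 0,
	RFKILL_STATE_UNBLOCKED    = 1,
	RFKILL_STATE_HARD_BLOCKED = 2,
};

/* Hypothetical model of the example card. */
struct example_card {
	bool hard_pin;		/* read-only hard line */
	bool thermal_trip;	/* read-only hard line */
	bool soft_reg;		/* soft line, writable by the driver */
};

/* Returns 0 on success or a negative errno-style value on failure. */
int example_toggle_radio(struct example_card *c, enum rfkill_state want)
{
	bool hard = c->hard_pin || c->thermal_trip;

	switch (want) {
	case RFKILL_STATE_SOFT_BLOCKED:
		/* Rule 2: always activate the soft line, even when a hard
		 * line is active (double-blocking the transmitter). */
		c->soft_reg = true;
		return 0;
	case RFKILL_STATE_UNBLOCKED:
		/* Rule 1: refuse to unblock while a hard line is active. */
		if (hard)
			return -EPERM;	/* error code chosen for illustration */
		/* Rule 3: no hard line is active, so release the soft line. */
		c->soft_reg = false;
		return 0;
	default:
		/* HARD_BLOCKED cannot be requested through this hook. */
		return -EINVAL;
	}
}
```

Because the soft line is set even while a hard line is active, the card in this model ends up soft-blocked, rather than unblocked, once the hard line is released.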
|
||||
===============================================================================
|
||||
4: Kernel API
|
||||
|
||||
To build a driver with rfkill subsystem support, the driver should depend on
|
||||
(or select) the Kconfig symbol RFKILL; it should _not_ depend on RFKILL_INPUT.
|
||||
|
||||
The hardware the driver talks to may be write-only (where the current state
|
||||
of the hardware is unknown), or read-write (where the hardware can be queried
|
||||
about its current state).
|
||||
|
||||
The rfkill class will call the get_state hook of a device every time it needs
|
||||
to know the *real* current state of the hardware. This can happen often, but
|
||||
it does not do any polling, so it is not enough on hardware that is subject
|
||||
to state changes outside of the rfkill subsystem.
|
||||
|
||||
Therefore, calling rfkill_force_state() when a state change happens is
|
||||
mandatory when the device has a hardware rfkill line, or when something else
|
||||
like the firmware could cause its state to be changed without going through the
|
||||
rfkill class.
|
||||
|
||||
Some hardware provides events when its status changes. In these cases, it is
|
||||
best for the driver to not provide a get_state hook, and instead register the
|
||||
rfkill class *already* with the correct status, and keep it updated using
|
||||
rfkill_force_state() when it gets an event from the hardware.
|
||||
|
||||
rfkill_force_state() must be used on the device resume handlers to update the
|
||||
rfkill status, should there be any chance of the device status changing during
|
||||
the sleep.
|
||||
|
||||
There is no provision for a statically-allocated rfkill struct. You must
|
||||
use rfkill_allocate() to allocate one.
|
||||
|
||||
You should:
|
||||
- rfkill_allocate()
|
||||
- modify rfkill fields (flags, name)
|
||||
- modify state to the current hardware state (THIS IS THE ONLY TIME
|
||||
YOU CAN ACCESS state DIRECTLY)
|
||||
- rfkill_register()
|
||||
|
||||
The only way to set a device to the RFKILL_STATE_HARD_BLOCKED state is through
|
||||
a suitable return of get_state() or through rfkill_force_state().
|
||||
|
||||
When a device is in the RFKILL_STATE_HARD_BLOCKED state, the only way to switch
|
||||
it to a different state is through a suitable return of get_state() or through
|
||||
rfkill_force_state().
|
||||
|
||||
If toggle_radio() is called to set a device to state RFKILL_STATE_SOFT_BLOCKED
|
||||
when that device is already at the RFKILL_STATE_HARD_BLOCKED state, it should
|
||||
not return an error. Instead, it should try to double-block the transmitter,
|
||||
so that its state will change from RFKILL_STATE_HARD_BLOCKED to
|
||||
RFKILL_STATE_SOFT_BLOCKED should the hardware blocking cease.
|
||||
|
||||
Please refer to the source for more documentation.
|
||||
|
||||
===============================================================================
|
||||
5: Userspace support
|
||||
|
||||
rfkill devices issue uevents (with an action of "change"), with the following
|
||||
environment variables set:
|
||||
Additionally, each rfkill device is registered in sysfs and there has the
|
||||
following attributes:
|
||||
|
||||
name: Name assigned by driver to this key (interface or driver name).
|
||||
type: Driver type string ("wlan", "bluetooth", etc).
|
||||
persistent: Whether the soft blocked state is initialised from
|
||||
non-volatile storage at startup.
|
||||
state: Current state of the transmitter
|
||||
0: RFKILL_STATE_SOFT_BLOCKED
|
||||
transmitter is turned off by software
|
||||
1: RFKILL_STATE_UNBLOCKED
|
||||
transmitter is (potentially) active
|
||||
2: RFKILL_STATE_HARD_BLOCKED
|
||||
transmitter is forced off by something outside of
|
||||
the driver's control.
|
||||
This file is deprecated because it can only properly show
|
||||
three of the four possible states, soft-and-hard-blocked is
|
||||
missing.
|
||||
claim: 0: Kernel handles events
|
||||
This file is deprecated because there no longer is a way to
|
||||
claim just control over a single rfkill instance.
|
||||
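A userspace reader of the "state" attribute might decode it along these lines. This is a hedged sketch: rfkill_parse_state() and rfkill_state_name() are hypothetical helpers, and the numeric values are taken from the attribute list above.

```c
#include <stdio.h>

/* Map the numeric "state" value to the name used in this document. */
const char *rfkill_state_name(int value)
{
	switch (value) {
	case 0:
		return "RFKILL_STATE_SOFT_BLOCKED";
	case 1:
		return "RFKILL_STATE_UNBLOCKED";
	case 2:
		return "RFKILL_STATE_HARD_BLOCKED";
	default:
		return "unknown";
	}
}

/* Read one decimal integer from an already opened attribute file.
 * Returns the value, or -1 if no integer could be parsed. */
int rfkill_parse_state(FILE *f)
{
	int value;

	if (fscanf(f, "%d", &value) != 1)
		return -1;
	return value;
}
```

A caller would typically fopen() a path such as /sys/class/rfkill/rfkill0/state (the instance number is an assumption) and pass the stream to rfkill_parse_state(). As noted above, this file cannot represent the soft-and-hard-blocked combination.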
|
||||
rfkill devices also issue uevents (with an action of "change"), with the
|
||||
following environment variables set:
|
||||
|
||||
RFKILL_NAME
|
||||
RFKILL_STATE
|
||||
RFKILL_TYPE
|
||||
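A uevent handler receives these variables as KEY=value strings. The sketch below shows one way to collect them; struct rfkill_uevent and rfkill_uevent_parse() are hypothetical names, the buffer sizes are arbitrary, and how the strings are obtained (netlink socket, udev, HAL) is deliberately left out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the three rfkill uevent variables. */
struct rfkill_uevent {
	char name[64];	/* RFKILL_NAME */
	char type[32];	/* RFKILL_TYPE */
	int state;	/* RFKILL_STATE, numeric as in the sysfs file */
};

/* True if the first klen bytes of kv are exactly the given key. */
static int key_is(const char *kv, size_t klen, const char *key)
{
	return strlen(key) == klen && strncmp(kv, key, klen) == 0;
}

/* Feed one "KEY=value" string. Returns 1 if it was an rfkill variable
 * and has been stored in ev, 0 if it was ignored. */
int rfkill_uevent_parse(struct rfkill_uevent *ev, const char *kv)
{
	const char *eq = strchr(kv, '=');
	size_t klen;

	if (!eq)
		return 0;
	klen = (size_t)(eq - kv);

	if (key_is(kv, klen, "RFKILL_NAME")) {
		snprintf(ev->name, sizeof(ev->name), "%s", eq + 1);
		return 1;
	}
	if (key_is(kv, klen, "RFKILL_TYPE")) {
		snprintf(ev->type, sizeof(ev->type), "%s", eq + 1);
		return 1;
	}
	if (key_is(kv, klen, "RFKILL_STATE")) {
		ev->state = atoi(eq + 1);
		return 1;
	}
	return 0;	/* some other variable, e.g. ACTION */
}
```

Calling rfkill_uevent_parse() once per environment string of a "change" event fills the struct; unrelated variables are skipped.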
|
||||
The ABI for these variables is defined by the sysfs attributes. It is best
|
||||
to take a quick look at the source to make sure of the possible values.
|
||||
|
||||
It is expected that HAL will trap those, and bridge them to DBUS, etc. These
|
||||
events CAN and SHOULD be used to give feedback to the user about the rfkill
|
||||
status of the system.
|
||||
|
||||
Input devices may issue events that are related to rfkill. These are the
|
||||
various KEY_* events and SW_* events supported by rfkill-input.c.
|
||||
|
||||
******IMPORTANT******
|
||||
When rfkill-input is ACTIVE, userspace is NOT TO CHANGE THE STATE OF AN RFKILL
|
||||
SWITCH IN RESPONSE TO AN INPUT EVENT also handled by rfkill-input, unless it
|
||||
has set to true the user_claim attribute for that particular switch. This rule
|
||||
is *absolute*; do NOT violate it.
|
||||
******IMPORTANT******
|
||||
|
||||
Userspace must not assume it is the only source of control for rfkill switches.
|
||||
Their state CAN and WILL change due to firmware actions, direct user actions,
|
||||
and the rfkill-input EPO override for *_RFKILL_ALL.
|
||||
|
||||
When rfkill-input is not active, userspace must initiate a rfkill status
|
||||
change by writing to the "state" attribute in order for anything to happen.
|
||||
|
||||
Take particular care to implement EV_SW SW_RFKILL_ALL properly. When that
|
||||
switch is set to OFF, *every* rfkill device *MUST* be immediately put into the
|
||||
RFKILL_STATE_SOFT_BLOCKED state, no questions asked.
|
||||
|
||||
The following sysfs entries will be created:
|
||||
|
||||
name: Name assigned by driver to this key (interface or driver name).
|
||||
type: Name of the key type ("wlan", "bluetooth", etc).
|
||||
state: Current state of the transmitter
|
||||
0: RFKILL_STATE_SOFT_BLOCKED
|
||||
transmitter is forced off, but one can override it
|
||||
by a write to the state attribute;
|
||||
1: RFKILL_STATE_UNBLOCKED
|
||||
transmitter is NOT forced off, and may operate if
|
||||
all other conditions for such operation are met
|
||||
(such as interface is up and configured, etc);
|
||||
2: RFKILL_STATE_HARD_BLOCKED
|
||||
transmitter is forced off by something outside of
|
||||
the driver's control. One cannot set a device to
|
||||
this state through writes to the state attribute;
|
||||
claim: 1: Userspace handles events, 0: Kernel handles events
|
||||
|
||||
Both the "state" and "claim" entries are also writable. For the "state" entry
|
||||
this means that when 1 or 0 is written, the device rfkill state (if not yet in
|
||||
the requested state), will be toggled accordingly.
|
||||
|
||||
For the "claim" entry writing 1 to it means that the kernel no longer handles
|
||||
key events even though RFKILL_INPUT input was enabled. When "claim" has been
|
||||
set to 0, userspace should make sure that it listens for the input events or
|
||||
check the sysfs "state" entry regularly to correctly perform the required tasks
|
||||
when the rfkill key is pressed.
|
||||
|
||||
A note about input devices and EV_SW events:
|
||||
|
||||
In order to know the current state of an input device switch (like
|
||||
SW_RFKILL_ALL), you will need to use an IOCTL. That information is not
|
||||
available through sysfs in a generic way at this time, and it is not available
|
||||
through the rfkill class AT ALL.
|
||||
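The IOCTL in question is the evdev EVIOCGSW request, which returns a bitmap of switch states. A minimal sketch, assuming a Linux system with <linux/input.h> available; the helper names (sw_bit_is_set, read_sw_rfkill_all) are hypothetical, and the event device path must be supplied by the caller.

```c
#include <fcntl.h>
#include <limits.h>
#include <linux/input.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define ULONG_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define NLONGS(n)	(((n) + ULONG_BITS - 1) / ULONG_BITS)

/* Test bit nr in a bitmap stored as an array of unsigned long. */
int sw_bit_is_set(const unsigned long *bits, unsigned int nr)
{
	return (int)((bits[nr / ULONG_BITS] >> (nr % ULONG_BITS)) & 1UL);
}

/* Query the current SW_RFKILL_ALL position of an evdev device.
 * Returns 1 or 0 for the switch state, or -1 on error. */
int read_sw_rfkill_all(const char *evdev_path)
{
	unsigned long sw[NLONGS(SW_MAX + 1)] = { 0 };
	int fd = open(evdev_path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	ret = ioctl(fd, EVIOCGSW(sizeof(sw)), sw);
	close(fd);
	if (ret < 0)
		return -1;
	return sw_bit_is_set(sw, SW_RFKILL_ALL);
}
```

A return value of 0 corresponds to the "power off all transmitters" position of the switch. A device that does not report SW_RFKILL_ALL also yields 0, so a careful caller would first check the device's capabilities.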
The contents of these variables correspond to the "name", "state" and
|
||||
"type" sysfs files explained above.
|
||||
|
@ -135,7 +135,7 @@ manipulating this list), the user code must observe the following
|
||||
protocol on 'lock entry' insertion and removal:
|
||||
|
||||
On insertion:
|
||||
1) set the 'list_op_pending' word to the address of the 'lock word'
|
||||
1) set the 'list_op_pending' word to the address of the 'lock entry'
|
||||
to be inserted,
|
||||
2) acquire the futex lock,
|
||||
3) add the lock entry, with its thread id (TID) in the bottom 29 bits
|
||||
@ -143,7 +143,7 @@ On insertion:
|
||||
4) clear the 'list_op_pending' word.
|
||||
|
||||
On removal:
|
||||
1) set the 'list_op_pending' word to the address of the 'lock word'
|
||||
1) set the 'list_op_pending' word to the address of the 'lock entry'
|
||||
to be removed,
|
||||
2) remove the lock entry for this lock from the 'head' list,
|
||||
2) release the futex lock, and
|
||||
|
@ -1,10 +1,11 @@
|
||||
SCSI FC Transport
|
||||
=============================================
|
||||
|
||||
Date: 4/12/2007
|
||||
Date: 11/18/2008
|
||||
Kernel Revisions for features:
|
||||
rports : <<TBS>>
|
||||
vports : 2.6.22 (? TBD)
|
||||
vports : 2.6.22
|
||||
bsg support : 2.6.30 (?TBD?)
|
||||
|
||||
|
||||
Introduction
|
||||
@ -15,6 +16,7 @@ The FC transport can be found at:
|
||||
drivers/scsi/scsi_transport_fc.c
|
||||
include/scsi/scsi_transport_fc.h
|
||||
include/scsi/scsi_netlink_fc.h
|
||||
include/scsi/scsi_bsg_fc.h
|
||||
|
||||
This file is found at Documentation/scsi/scsi_fc_transport.txt
|
||||
|
||||
@ -472,6 +474,14 @@ int
|
||||
fc_vport_terminate(struct fc_vport *vport)
|
||||
|
||||
|
||||
FC BSG support (CT & ELS passthru, and more)
|
||||
========================================================================
|
||||
<< To Be Supplied >>
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
Credits
|
||||
=======
|
||||
The following people have contributed to this document:
|
||||
|
@ -1271,6 +1271,11 @@ of interest:
|
||||
hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size
|
||||
is set by the second argument (named 'xtr_bytes') to
|
||||
scsi_host_alloc() or scsi_register().
|
||||
vendor_id - a unique value that identifies the vendor supplying
|
||||
the LLD for the Scsi_Host. Used most often in validating
|
||||
vendor-specific message requests. Value consists of an
|
||||
identifier type and a vendor-specific value.
|
||||
See scsi_netlink.h for a description of valid formats.
|
||||
|
||||
The scsi_host structure is defined in include/scsi/scsi_host.h
|
||||
|
||||
|
@ -139,6 +139,7 @@ ALC883/888
|
||||
acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc)
|
||||
acer-aspire Acer Aspire 9810
|
||||
acer-aspire-4930g Acer Aspire 4930G
|
||||
acer-aspire-6530g Acer Aspire 6530G
|
||||
acer-aspire-8930g Acer Aspire 8930G
|
||||
medion Medion Laptops
|
||||
medion-md2 Medion MD2
|
||||
|
@ -233,8 +233,8 @@ These protections are added to score to judge whether this zone should be used
|
||||
for page allocation or should be reclaimed.
|
||||
|
||||
In this example, if normal pages (index=2) are required to this DMA zone and
|
||||
pages_high is used for watermark, the kernel judges this zone should not be
|
||||
used because pages_free(1355) is smaller than watermark + protection[2]
|
||||
watermark[WMARK_HIGH] is used for watermark, the kernel judges this zone should
|
||||
not be used because pages_free(1355) is smaller than watermark + protection[2]
|
||||
(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
|
||||
normal page requirement. If requirement is DMA zone(index=0), protection[0]
|
||||
(=0) is used.
|
||||
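The comparison in this example can be written out as a small function. It is a simplified model of the rule as worded in the text, not the kernel's allocator code; zone_usable() is a hypothetical name, and treating the exact-equality case as usable is an assumption, because the text only covers the "smaller than" case.

```c
#include <stdbool.h>

/* A zone is rejected for a request of class req_index when its free
 * pages are smaller than watermark + protection[req_index]. */
bool zone_usable(unsigned long pages_free, unsigned long watermark,
		 const unsigned long *protection, unsigned int req_index)
{
	return !(pages_free < watermark + protection[req_index]);
}
```

With the numbers from the text (pages_free = 1355, watermark = 4, protection[2] = 2004), the function returns false for a normal-page request, since 1355 < 4 + 2004 = 2008. For a DMA request, protection[0] = 0 applies and it returns true, since 1355 is not smaller than 4.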
@ -280,9 +280,10 @@ The default value is 65536.
|
||||
min_free_kbytes:
|
||||
|
||||
This is used to force the Linux VM to keep a minimum number
|
||||
of kilobytes free. The VM uses this number to compute a pages_min
|
||||
value for each lowmem zone in the system. Each lowmem zone gets
|
||||
a number of reserved free pages based proportionally on its size.
|
||||
of kilobytes free. The VM uses this number to compute a
|
||||
watermark[WMARK_MIN] value for each lowmem zone in the system.
|
||||
Each lowmem zone gets a number of reserved free pages based
|
||||
proportionally on its size.
|
||||
|
||||
Some minimal amount of memory is needed to satisfy PF_MEMALLOC
|
||||
allocations; if you set this to lower than 1024KB, your system will
|
||||
@ -314,10 +315,14 @@ min_unmapped_ratio:
|
||||
|
||||
This is available only on NUMA kernels.
|
||||
|
||||
A percentage of the total pages in each zone. Zone reclaim will only
|
||||
occur if more than this percentage of pages are file backed and unmapped.
|
||||
This is to ensure that a minimal amount of local pages is still available for
|
||||
file I/O even if the node is overallocated.
|
||||
This is a percentage of the total pages in each zone. Zone reclaim will
|
||||
only occur if more than this percentage of pages are in a state that
|
||||
zone_reclaim_mode allows to be reclaimed.
|
||||
|
||||
If zone_reclaim_mode has the value 4 OR'd, then the percentage is compared
|
||||
against all file-backed unmapped pages including swapcache pages and tmpfs
|
||||
files. Otherwise, only unmapped pages backed by normal files but not tmpfs
|
||||
files and similar are considered.
|
||||
|
||||
The default is 1 percent.
|
||||
|
||||
|
@ -7,7 +7,6 @@ Copyright 2008 Red Hat Inc.
|
||||
(dual licensed under the GPL v2)
|
||||
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
|
||||
John Kacur, and David Teigland.
|
||||
|
||||
Written for: 2.6.28-rc2
|
||||
|
||||
Introduction
|
||||
@ -33,13 +32,26 @@ The File System
|
||||
Ftrace uses the debugfs file system to hold the control files as
|
||||
well as the files to display output.
|
||||
|
||||
To mount the debugfs system:
|
||||
When debugfs is configured into the kernel (which selecting any ftrace
|
||||
option will do) the directory /sys/kernel/debug will be created. To mount
|
||||
this directory, you can add to your /etc/fstab file:
|
||||
|
||||
# mkdir /debug
|
||||
# mount -t debugfs nodev /debug
|
||||
debugfs /sys/kernel/debug debugfs defaults 0 0
|
||||
|
||||
( Note: it is more common to mount at /sys/kernel/debug, but for
|
||||
simplicity this document will use /debug)
|
||||
Or you can mount it at run time with:
|
||||
|
||||
mount -t debugfs nodev /sys/kernel/debug
|
||||
|
||||
For quicker access to that directory you may want to make a soft link to
|
||||
it:
|
||||
|
||||
ln -s /sys/kernel/debug /debug
|
||||
|
||||
Any selected ftrace option will also create a directory called tracing
|
||||
within the debugfs. The rest of the document will assume that you are in
|
||||
the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate
|
||||
on the files within that directory and not distract from the content with
|
||||
the extended "/sys/kernel/debug/tracing" path name.
|
||||
|
||||
That's it! (assuming that you have ftrace configured into your kernel)
|
||||
|
||||
@ -389,18 +401,18 @@ trace_options
|
||||
The trace_options file is used to control what gets printed in
|
||||
the trace output. To see what is available, simply cat the file:
|
||||
|
||||
cat /debug/tracing/trace_options
|
||||
cat trace_options
|
||||
print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
|
||||
noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
|
||||
|
||||
To disable one of the options, echo in the option prepended with
|
||||
"no".
|
||||
|
||||
echo noprint-parent > /debug/tracing/trace_options
|
||||
echo noprint-parent > trace_options
|
||||
|
||||
To enable an option, leave off the "no".
|
||||
|
||||
echo sym-offset > /debug/tracing/trace_options
|
||||
echo sym-offset > trace_options
|
||||
|
||||
Here are the available options:
|
||||
|
||||
@ -476,11 +488,11 @@ sched_switch
|
||||
This tracer simply records schedule switches. Here is an example
|
||||
of how to use it.
|
||||
|
||||
# echo sched_switch > /debug/tracing/current_tracer
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo sched_switch > current_tracer
|
||||
# echo 1 > tracing_enabled
|
||||
# sleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat trace
|
||||
|
||||
# tracer: sched_switch
|
||||
#
|
||||
@ -583,13 +595,13 @@ new trace is saved.
|
||||
To reset the maximum, echo 0 into tracing_max_latency. Here is
|
||||
an example:
|
||||
|
||||
# echo irqsoff > /debug/tracing/current_tracer
|
||||
# echo 0 > /debug/tracing/tracing_max_latency
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo irqsoff > current_tracer
|
||||
# echo 0 > tracing_max_latency
|
||||
# echo 1 > tracing_enabled
|
||||
# ls -ltr
|
||||
[...]
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/latency_trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat latency_trace
|
||||
# tracer: irqsoff
|
||||
#
|
||||
irqsoff latency trace v1.1.5 on 2.6.26
|
||||
@ -690,13 +702,13 @@ Like the irqsoff tracer, it records the maximum latency for
|
||||
which preemption was disabled. The control of preemptoff tracer
|
||||
is much like the irqsoff tracer.
|
||||
|
||||
# echo preemptoff > /debug/tracing/current_tracer
|
||||
# echo 0 > /debug/tracing/tracing_max_latency
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo preemptoff > current_tracer
|
||||
# echo 0 > tracing_max_latency
|
||||
# echo 1 > tracing_enabled
|
||||
# ls -ltr
|
||||
[...]
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/latency_trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat latency_trace
|
||||
# tracer: preemptoff
|
||||
#
|
||||
preemptoff latency trace v1.1.5 on 2.6.26-rc8
|
||||
@ -837,13 +849,13 @@ tracer.
|
||||
Again, using this trace is much like the irqsoff and preemptoff
|
||||
tracers.
|
||||
|
||||
# echo preemptirqsoff > /debug/tracing/current_tracer
|
||||
# echo 0 > /debug/tracing/tracing_max_latency
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo preemptirqsoff > current_tracer
|
||||
# echo 0 > tracing_max_latency
|
||||
# echo 1 > tracing_enabled
|
||||
# ls -ltr
|
||||
[...]
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/latency_trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat latency_trace
|
||||
# tracer: preemptirqsoff
|
||||
#
|
||||
preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
|
||||
@ -999,12 +1011,12 @@ slightly differently than we did with the previous tracers.
|
||||
Instead of performing an 'ls', we will run 'sleep 1' under
|
||||
'chrt' which changes the priority of the task.
|
||||
|
||||
# echo wakeup > /debug/tracing/current_tracer
|
||||
# echo 0 > /debug/tracing/tracing_max_latency
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo wakeup > current_tracer
|
||||
# echo 0 > tracing_max_latency
|
||||
# echo 1 > tracing_enabled
|
||||
# chrt -f 5 sleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/latency_trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat latency_trace
|
||||
# tracer: wakeup
|
||||
#
|
||||
wakeup latency trace v1.1.5 on 2.6.26-rc8
|
||||
@ -1114,11 +1126,11 @@ can be done from the debug file system. Make sure the
|
||||
ftrace_enabled is set; otherwise this tracer is a nop.
|
||||
|
||||
# sysctl kernel.ftrace_enabled=1
|
||||
# echo function > /debug/tracing/current_tracer
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo function > current_tracer
|
||||
# echo 1 > tracing_enabled
|
||||
# usleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat trace
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
@ -1155,7 +1167,7 @@ int trace_fd;
|
||||
[...]
|
||||
int main(int argc, char *argv[]) {
|
||||
[...]
|
||||
trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY);
|
||||
trace_fd = open(tracing_file("tracing_enabled"), O_WRONLY);
|
||||
[...]
|
||||
if (condition_hit()) {
|
||||
write(trace_fd, "0", 1);
|
||||
@ -1163,26 +1175,20 @@ int main(int argc, char *argv[]) {
|
||||
[...]
|
||||
}
|
||||
|
||||
Note: Here we hard coded the path name. The debugfs mount is not
|
||||
guaranteed to be at /debug (and is more commonly at
|
||||
/sys/kernel/debug). For simple one time traces, the above is
|
||||
sufficient. For anything else, a search through /proc/mounts may
|
||||
be needed to find where the debugfs file-system is mounted.
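
The search through /proc/mounts that the note mentions could be sketched as follows. This is only an illustration, not code from the patch: it assumes glibc's <mntent.h> helpers are available, and the function name find_mount_point() and its parameters are hypothetical. The mount table path is a parameter so the lookup can be tried on any file laid out like /proc/mounts.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find the first mount of type fstype in the
 * mount table at mounts_path (normally "/proc/mounts") and copy its
 * mount point into out. Returns 0 on success, -1 if no such mount
 * is listed, the table cannot be opened, or out is too small.
 */
int find_mount_point(const char *mounts_path, const char *fstype,
		     char *out, size_t len)
{
	struct mntent *ent;
	FILE *fp;
	int ret = -1;

	fp = setmntent(mounts_path, "r");
	if (!fp)
		return -1;

	while ((ent = getmntent(fp)) != NULL) {
		if (strcmp(ent->mnt_type, fstype) == 0 &&
		    strlen(ent->mnt_dir) < len) {
			strcpy(out, ent->mnt_dir);
			ret = 0;
			break;
		}
	}
	endmntent(fp);

	return ret;
}
```

A caller would typically pass "/proc/mounts" and "debugfs", then append "/tracing/<file>" to the returned mount point.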
|
||||
|
||||
|
||||
Single thread tracing
|
||||
---------------------
|
||||
|
||||
By writing into /debug/tracing/set_ftrace_pid you can trace a
|
||||
By writing into set_ftrace_pid you can trace a
|
||||
single thread. For example:
|
||||
|
||||
# cat /debug/tracing/set_ftrace_pid
|
||||
# cat set_ftrace_pid
|
||||
no pid
|
||||
# echo 3111 > /debug/tracing/set_ftrace_pid
|
||||
# cat /debug/tracing/set_ftrace_pid
|
||||
# echo 3111 > set_ftrace_pid
|
||||
# cat set_ftrace_pid
|
||||
3111
|
||||
# echo function > /debug/tracing/current_tracer
|
||||
# cat /debug/tracing/trace | head
|
||||
# echo function > current_tracer
|
||||
# cat trace | head
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
@ -1193,8 +1199,8 @@ no pid
|
||||
yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
|
||||
yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll
|
||||
yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll
|
||||
# echo -1 > /debug/tracing/set_ftrace_pid
|
||||
# cat /debug/tracing/trace |head
|
||||
# echo -1 > set_ftrace_pid
|
||||
# cat trace |head
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
@ -1216,6 +1222,51 @@ something like this simple program:
|
||||
#include <fcntl.h>
|
||||
#include <unistd.h>
|
||||
|
||||
#define _STR(x) #x
#define STR(x) _STR(x)
#define MAX_PATH 256

const char *find_debugfs(void)
{
	static char debugfs[MAX_PATH+1];
	static int debugfs_found;
	char type[100];
	FILE *fp;

	if (debugfs_found)
		return debugfs;

	if ((fp = fopen("/proc/mounts","r")) == NULL) {
		perror("/proc/mounts");
		return NULL;
	}

	while (fscanf(fp, "%*s %"
		      STR(MAX_PATH)
		      "s %99s %*s %*d %*d\n",
		      debugfs, type) == 2) {
		if (strcmp(type, "debugfs") == 0)
			break;
	}
	fclose(fp);

	if (strcmp(type, "debugfs") != 0) {
		fprintf(stderr, "debugfs not mounted");
		return NULL;
	}

	debugfs_found = 1;

	return debugfs;
}

const char *tracing_file(const char *file_name)
{
	static char trace_file[MAX_PATH+1];
	snprintf(trace_file, MAX_PATH, "%s/%s", find_debugfs(), file_name);
	return trace_file;
}
|
||||
|
||||
int main (int argc, char **argv)
|
||||
{
|
||||
if (argc < 1)
|
||||
@ -1226,12 +1277,12 @@ int main (int argc, char **argv)
|
||||
char line[64];
|
||||
int s;
|
||||
|
||||
ffd = open("/debug/tracing/current_tracer", O_WRONLY);
|
||||
ffd = open(tracing_file("current_tracer"), O_WRONLY);
|
||||
if (ffd < 0)
|
||||
exit(-1);
|
||||
write(ffd, "nop", 3);
|
||||
|
||||
fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY);
|
||||
fd = open(tracing_file("set_ftrace_pid"), O_WRONLY);
|
||||
s = sprintf(line, "%d\n", getpid());
|
||||
write(fd, line, s);
|
||||
|
||||
@ -1383,22 +1434,22 @@ want, depending on your needs.
|
||||
tracing_cpu_mask file) or you might sometimes see unordered
|
||||
function calls when tracing switches between CPUs.
|
||||
|
||||
hide: echo nofuncgraph-cpu > /debug/tracing/trace_options
|
||||
show: echo funcgraph-cpu > /debug/tracing/trace_options
|
||||
hide: echo nofuncgraph-cpu > trace_options
|
||||
show: echo funcgraph-cpu > trace_options
|
||||
|
||||
- The duration (function's time of execution) is displayed on
|
||||
the closing bracket line of a function or on the same line
|
||||
as the current function in case of a leaf one. It is default
|
||||
enabled.
|
||||
|
||||
hide: echo nofuncgraph-duration > /debug/tracing/trace_options
|
||||
show: echo funcgraph-duration > /debug/tracing/trace_options
|
||||
hide: echo nofuncgraph-duration > trace_options
|
||||
show: echo funcgraph-duration > trace_options
|
||||
|
||||
- The overhead field precedes the duration field in case of
|
||||
reached duration thresholds.
|
||||
|
||||
hide: echo nofuncgraph-overhead > /debug/tracing/trace_options
|
||||
show: echo funcgraph-overhead > /debug/tracing/trace_options
|
||||
hide: echo nofuncgraph-overhead > trace_options
|
||||
show: echo funcgraph-overhead > trace_options
|
||||
depends on: funcgraph-duration
|
||||
|
||||
ie:
|
||||
@ -1427,8 +1478,8 @@ want, depending on your needs.
|
||||
- The task/pid field displays the thread cmdline and pid which
|
||||
executed the function. It is default disabled.
|
||||
|
||||
hide: echo nofuncgraph-proc > /debug/tracing/trace_options
|
||||
show: echo funcgraph-proc > /debug/tracing/trace_options
|
||||
hide: echo nofuncgraph-proc > trace_options
|
||||
show: echo funcgraph-proc > trace_options
|
||||
|
||||
ie:
|
||||
|
||||
@ -1451,8 +1502,8 @@ want, depending on your needs.
|
||||
system clock since it started. A snapshot of this time is
|
||||
given on each entry/exit of functions
|
||||
|
||||
hide: echo nofuncgraph-abstime > /debug/tracing/trace_options
|
||||
show: echo funcgraph-abstime > /debug/tracing/trace_options
|
||||
hide: echo nofuncgraph-abstime > trace_options
|
||||
show: echo funcgraph-abstime > trace_options
|
||||
|
||||
ie:
|
||||
|
||||
@ -1549,7 +1600,7 @@ listed in:
|
||||
|
||||
available_filter_functions
|
||||
|
||||
# cat /debug/tracing/available_filter_functions
|
||||
# cat available_filter_functions
|
||||
put_prev_task_idle
|
||||
kmem_cache_create
|
||||
pick_next_task_rt
|
||||
@ -1561,12 +1612,12 @@ mutex_lock
|
||||
If I am only interested in sys_nanosleep and hrtimer_interrupt:
|
||||
|
||||
# echo sys_nanosleep hrtimer_interrupt \
|
||||
> /debug/tracing/set_ftrace_filter
|
||||
# echo ftrace > /debug/tracing/current_tracer
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
> set_ftrace_filter
|
||||
# echo ftrace > current_tracer
|
||||
# echo 1 > tracing_enabled
|
||||
# usleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat trace
|
||||
# tracer: ftrace
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
@ -1577,7 +1628,7 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt:
|
||||
|
||||
To see which functions are being traced, you can cat the file:
|
||||
|
||||
# cat /debug/tracing/set_ftrace_filter
|
||||
# cat set_ftrace_filter
|
||||
hrtimer_interrupt
|
||||
sys_nanosleep
|
||||
|
||||
@ -1597,7 +1648,7 @@ Note: It is better to use quotes to enclose the wild cards,
|
||||
otherwise the shell may expand the parameters into names
|
||||
of files in the local directory.
|
||||
|
||||
# echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter
|
||||
# echo 'hrtimer_*' > set_ftrace_filter
|
||||
|
||||
Produces:
|
||||
|
||||
@ -1618,7 +1669,7 @@ Produces:
|
||||
|
||||
Notice that we lost the sys_nanosleep.
|
||||
|
||||
# cat /debug/tracing/set_ftrace_filter
|
||||
# cat set_ftrace_filter
|
||||
hrtimer_run_queues
|
||||
hrtimer_run_pending
|
||||
hrtimer_init
|
||||
@ -1644,17 +1695,17 @@ To append to the filters, use '>>'
|
||||
To clear out a filter so that all functions will be recorded
|
||||
again:
|
||||
|
||||
# echo > /debug/tracing/set_ftrace_filter
|
||||
# cat /debug/tracing/set_ftrace_filter
|
||||
# echo > set_ftrace_filter
|
||||
# cat set_ftrace_filter
|
||||
#
|
||||
|
||||
Again, now we want to append.
|
||||
|
||||
# echo sys_nanosleep > /debug/tracing/set_ftrace_filter
|
||||
# cat /debug/tracing/set_ftrace_filter
|
||||
# echo sys_nanosleep > set_ftrace_filter
|
||||
# cat set_ftrace_filter
|
||||
sys_nanosleep
|
||||
# echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter
|
||||
# cat /debug/tracing/set_ftrace_filter
|
||||
# echo 'hrtimer_*' >> set_ftrace_filter
|
||||
# cat set_ftrace_filter
|
||||
hrtimer_run_queues
|
||||
hrtimer_run_pending
|
||||
hrtimer_init
|
||||
@ -1677,7 +1728,7 @@ hrtimer_init_sleeper
|
||||
The set_ftrace_notrace prevents those functions from being
|
||||
traced.
|
||||
|
||||
# echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace
|
||||
# echo '*preempt*' '*lock*' > set_ftrace_notrace
|
||||
|
||||
Produces:
|
||||
|
||||
@ -1767,13 +1818,13 @@ the effect on the tracing is different. Every read from
|
||||
trace_pipe is consumed. This means that subsequent reads will be
|
||||
different. The trace is live.
|
||||
|
||||
# echo function > /debug/tracing/current_tracer
|
||||
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
|
||||
# echo function > current_tracer
|
||||
# cat trace_pipe > /tmp/trace.out &
|
||||
[1] 4153
|
||||
# echo 1 > /debug/tracing/tracing_enabled
|
||||
# echo 1 > tracing_enabled
|
||||
# usleep 1
|
||||
# echo 0 > /debug/tracing/tracing_enabled
|
||||
# cat /debug/tracing/trace
|
||||
# echo 0 > tracing_enabled
|
||||
# cat trace
|
||||
# tracer: function
|
||||
#
|
||||
# TASK-PID CPU# TIMESTAMP FUNCTION
|
||||
@ -1809,7 +1860,7 @@ number listed is the number of entries that can be recorded per
|
||||
CPU. To know the full size, multiply the number of possible CPUS
|
||||
with the number of entries.
|
||||
|
||||
# cat /debug/tracing/buffer_size_kb
|
||||
# cat buffer_size_kb
|
||||
1408 (units kilobytes)
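
The "multiply by the number of possible CPUs" rule above can be written out as a small sketch. The helper names parse_buffer_size_kb() and total_buffer_kb() are hypothetical and not part of ftrace; the parser only assumes that the file content starts with a decimal number, as in the output shown above.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: parse the leading number of a buffer_size_kb
 * line such as "1408 (units kilobytes)". Returns 0 if the line does
 * not start with a number.
 */
unsigned long parse_buffer_size_kb(const char *line)
{
	return strtoul(line, NULL, 10);
}

/* Full size across all CPUs: per-CPU size times the CPU count. */
unsigned long total_buffer_kb(unsigned long per_cpu_kb, unsigned int nr_cpus)
{
	return per_cpu_kb * nr_cpus;
}
```

With the 1408 value above and, for example, 4 possible CPUs, the total would be 5632 kilobytes. In a real program the CPU count might come from sysconf(_SC_NPROCESSORS_CONF), which is an assumption here rather than something the document prescribes.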
|
||||
|
||||
Note, to modify this, you must have tracing completely disabled.
|
||||
@ -1817,18 +1868,18 @@ To do that, echo "nop" into the current_tracer. If the
|
||||
current_tracer is not set to "nop", an EINVAL error will be
|
||||
returned.
|
||||
|
||||
# echo nop > /debug/tracing/current_tracer
|
||||
# echo 10000 > /debug/tracing/buffer_size_kb
|
||||
# cat /debug/tracing/buffer_size_kb
|
||||
# echo nop > current_tracer
|
||||
# echo 10000 > buffer_size_kb
|
||||
# cat buffer_size_kb
|
||||
10000 (units kilobytes)
|
||||
|
||||
The number of pages which will be allocated is limited to a
|
||||
percentage of available memory. Allocating too much will produce
|
||||
an error.
|
||||
|
||||
# echo 1000000000000 > /debug/tracing/buffer_size_kb
|
||||
# echo 1000000000000 > buffer_size_kb
|
||||
-bash: echo: write error: Cannot allocate memory
|
||||
# cat /debug/tracing/buffer_size_kb
|
||||
# cat buffer_size_kb
|
||||
85
|
||||
|
||||
-----------
|
||||
|
@ -32,41 +32,41 @@ is no way to automatically detect if you are losing events due to CPUs racing.
|
||||
Usage Quick Reference
|
||||
---------------------
|
||||
|
||||
$ mount -t debugfs debugfs /debug
|
||||
$ echo mmiotrace > /debug/tracing/current_tracer
|
||||
$ cat /debug/tracing/trace_pipe > mydump.txt &
|
||||
$ mount -t debugfs debugfs /sys/kernel/debug
|
||||
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
|
||||
$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
|
||||
Start X or whatever.
|
||||
$ echo "X is up" > /debug/tracing/trace_marker
|
||||
$ echo nop > /debug/tracing/current_tracer
|
||||
$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
|
||||
$ echo nop > /sys/kernel/debug/tracing/current_tracer
|
||||
Check for lost events.
|
||||
|
||||
|
||||
Usage
|
||||
-----
|
||||
|
||||
Make sure debugfs is mounted to /debug. If not, (requires root privileges)
|
||||
$ mount -t debugfs debugfs /debug
|
||||
Make sure debugfs is mounted to /sys/kernel/debug. If not, (requires root privileges)
|
||||
$ mount -t debugfs debugfs /sys/kernel/debug
|
||||
|
||||
Check that the driver you are about to trace is not loaded.
|
||||
|
||||
Activate mmiotrace (requires root privileges):
|
||||
$ echo mmiotrace > /debug/tracing/current_tracer
|
||||
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
|
||||
|
||||
Start storing the trace:
|
||||
$ cat /debug/tracing/trace_pipe > mydump.txt &
|
||||
$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
|
||||
The 'cat' process should stay running (sleeping) in the background.
|
||||
|
||||
Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
|
||||
accesses to areas that are ioremapped while mmiotrace is active.
|
||||
|
||||
During tracing you can place comments (markers) into the trace by
|
||||
$ echo "X is up" > /debug/tracing/trace_marker
|
||||
$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
|
||||
This makes it easier to see which part of the (huge) trace corresponds to
|
||||
which action. It is recommended to place descriptive markers about what you
|
||||
do.
|
||||
|
||||
Shut down mmiotrace (requires root privileges):
|
||||
$ echo nop > /debug/tracing/current_tracer
|
||||
$ echo nop > /sys/kernel/debug/tracing/current_tracer
|
||||
The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
|
||||
pressing ctrl+c.
|
||||
|
||||
@ -78,10 +78,10 @@ to view your kernel log and look for "mmiotrace has lost events" warning. If
|
||||
events were lost, the trace is incomplete. You should enlarge the buffers and
|
||||
try again. Buffers are enlarged by first seeing how large the current buffers
|
||||
are:
|
||||
$ cat /debug/tracing/buffer_size_kb
|
||||
$ cat /sys/kernel/debug/tracing/buffer_size_kb
|
||||
gives you a number. Approximately double this number and write it back, for
|
||||
instance:
|
||||
$ echo 128000 > /debug/tracing/buffer_size_kb
|
||||
$ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb
|
||||
Then start again from the top.
|
||||
|
||||
If you are doing a trace for a driver project, e.g. Nouveau, you should also
|
||||
|
@ -16,3 +16,8 @@
|
||||
15 -> TeVii S470 [d470:9022]
|
||||
16 -> DVBWorld DVB-S2 2005 [0001:2005]
|
||||
17 -> NetUP Dual DVB-S2 CI [1b55:2a2c]
|
||||
18 -> Hauppauge WinTV-HVR1270 [0070:2211]
|
||||
19 -> Hauppauge WinTV-HVR1275 [0070:2215]
|
||||
20 -> Hauppauge WinTV-HVR1255 [0070:2251]
|
||||
21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295]
|
||||
22 -> Mygica X8506 DMB-TH [14f1:8651]
|
||||
|
@ -6,8 +6,8 @@
|
||||
5 -> Leadtek Winfast 2000XP Expert [107d:6611,107d:6613]
|
||||
6 -> AverTV Studio 303 (M126) [1461:000b]
|
||||
7 -> MSI TV-@nywhere Master [1462:8606]
|
||||
8 -> Leadtek Winfast DV2000 [107d:6620]
|
||||
9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632]
|
||||
8 -> Leadtek Winfast DV2000 [107d:6620,107d:6621]
|
||||
9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632,107d:6630,107d:6638,107d:6631,107d:6637,107d:663d]
|
||||
10 -> IODATA GV-VCP3/PCI [10fc:d003]
|
||||
11 -> Prolink PlayTV PVR
|
||||
12 -> ASUS PVR-416 [1043:4823,1461:c111]
|
||||
@ -59,7 +59,7 @@
|
||||
58 -> Pinnacle PCTV HD 800i [11bd:0051]
|
||||
59 -> DViCO FusionHDTV 5 PCI nano [18ac:d530]
|
||||
60 -> Pinnacle Hybrid PCTV [12ab:1788]
|
||||
61 -> Winfast TV2000 XP Global [107d:6f18]
|
||||
61 -> Leadtek TV2000 XP Global [107d:6f18,107d:6618]
|
||||
62 -> PowerColor RA330 [14f1:ea3d]
|
||||
63 -> Geniatech X8000-MT DVBT [14f1:8852]
|
||||
64 -> DViCO FusionHDTV DVB-T PRO [18ac:db30]
|
||||
@ -78,3 +78,5 @@
|
||||
77 -> TBS 8910 DVB-S [8910:8888]
|
||||
78 -> Prof 6200 DVB-S [b022:3022]
|
||||
79 -> Terratec Cinergy HT PCI MKII [153b:1177]
|
||||
80 -> Hauppauge WinTV-IR Only [0070:9290]
|
||||
81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654]
|
||||
|
@ -17,7 +17,7 @@
|
||||
16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b]
|
||||
17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227]
|
||||
18 -> Hauppauge WinTV HVR 900 (R2) (em2880) [2040:6502]
|
||||
19 -> PointNix Intra-Oral Camera (em2860)
|
||||
19 -> EM2860/SAA711X Reference Design (em2860)
|
||||
20 -> AMD ATI TV Wonder HD 600 (em2880) [0438:b002]
|
||||
21 -> eMPIA Technology, Inc. GrabBeeX+ Video Encoder (em2800) [eb1a:2801]
|
||||
22 -> Unknown EM2750/EM2751 webcam grabber (em2750) [eb1a:2750,eb1a:2751]
|
||||
@ -61,3 +61,8 @@
|
||||
63 -> Kaiomy TVnPC U2 (em2860) [eb1a:e303]
|
||||
64 -> Easy Cap Capture DC-60 (em2860)
|
||||
65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515]
|
||||
66 -> Empire dual TV (em2880)
|
||||
67 -> Terratec Grabby (em2860) [0ccd:0096]
|
||||
68 -> Terratec AV350 (em2860) [0ccd:0084]
|
||||
69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313]
|
||||
70 -> Evga inDtube (em2882)
|
||||
|
@ -124,10 +124,10 @@
|
||||
123 -> Beholder BeholdTV 407 [0000:4070]
|
||||
124 -> Beholder BeholdTV 407 FM [0000:4071]
|
||||
125 -> Beholder BeholdTV 409 [0000:4090]
|
||||
126 -> Beholder BeholdTV 505 FM/RDS [0000:5051,0000:505B,5ace:5050]
|
||||
127 -> Beholder BeholdTV 507 FM/RDS / BeholdTV 509 FM [0000:5071,0000:507B,5ace:5070,5ace:5090]
|
||||
126 -> Beholder BeholdTV 505 FM [5ace:5050]
|
||||
127 -> Beholder BeholdTV 507 FM / BeholdTV 509 FM [5ace:5070,5ace:5090]
|
||||
128 -> Beholder BeholdTV Columbus TVFM [0000:5201]
|
||||
129 -> Beholder BeholdTV 607 / BeholdTV 609 [5ace:6070,5ace:6071,5ace:6072,5ace:6073,5ace:6090,5ace:6091,5ace:6092,5ace:6093]
|
||||
129 -> Beholder BeholdTV 607 FM [5ace:6070]
|
||||
130 -> Beholder BeholdTV M6 [5ace:6190]
|
||||
131 -> Twinhan Hybrid DTV-DVB 3056 PCI [1822:0022]
|
||||
132 -> Genius TVGO AM11MCE
|
||||
@ -143,7 +143,7 @@
|
||||
142 -> Beholder BeholdTV H6 [5ace:6290]
|
||||
143 -> Beholder BeholdTV M63 [5ace:6191]
|
||||
144 -> Beholder BeholdTV M6 Extra [5ace:6193]
|
||||
145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636]
|
||||
145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636,1461:f736]
|
||||
146 -> ASUSTeK P7131 Analog
|
||||
147 -> Asus Tiger 3in1 [1043:4878]
|
||||
148 -> Encore ENLTV-FM v5.3 [1a7f:2008]
|
||||
@ -154,4 +154,16 @@
|
||||
153 -> Kworld Plus TV Analog Lite PCI [17de:7128]
|
||||
154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d]
|
||||
155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708]
|
||||
156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a]
|
||||
156 -> Hauppauge WinTV-HVR1110r3 DVB-T/Hybrid [0070:6707,0070:6709,0070:670a]
|
||||
157 -> Avermedia AVerTV Studio 507UA [1461:a11b]
|
||||
158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9]
|
||||
159 -> Beholder BeholdTV 505 RDS [0000:505B]
|
||||
160 -> Beholder BeholdTV 507 RDS [0000:5071]
|
||||
161 -> Beholder BeholdTV 507 RDS [0000:507B]
|
||||
162 -> Beholder BeholdTV 607 FM [5ace:6071]
|
||||
163 -> Beholder BeholdTV 609 FM [5ace:6090]
|
||||
164 -> Beholder BeholdTV 609 FM [5ace:6091]
|
||||
165 -> Beholder BeholdTV 607 RDS [5ace:6072]
|
||||
166 -> Beholder BeholdTV 607 RDS [5ace:6073]
|
||||
167 -> Beholder BeholdTV 609 RDS [5ace:6092]
|
||||
168 -> Beholder BeholdTV 609 RDS [5ace:6093]
|
||||
|
@ -76,3 +76,5 @@ tuner=75 - Philips TEA5761 FM Radio
|
||||
tuner=76 - Xceive 5000 tuner
|
||||
tuner=77 - TCL tuner MF02GIP-5N-E
|
||||
tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner
|
||||
tuner=79 - Philips PAL/SECAM multi (FM1216 MK5)
|
||||
tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough
|
||||
|
@ -163,10 +163,11 @@ sunplus 055f:c650 Mustek MDC5500Z
|
||||
zc3xx 055f:d003 Mustek WCam300A
|
||||
zc3xx 055f:d004 Mustek WCam300 AN
|
||||
conex 0572:0041 Creative Notebook cx11646
|
||||
ov519 05a9:0519 OmniVision
|
||||
ov519 05a9:0519 OV519 Microphone
|
||||
ov519 05a9:0530 OmniVision
|
||||
ov519 05a9:4519 OmniVision
|
||||
ov519 05a9:4519 Webcam Classic
|
||||
ov519 05a9:8519 OmniVision
|
||||
ov519 05a9:a518 D-Link DSB-C310 Webcam
|
||||
sunplus 05da:1018 Digital Dream Enigma 1.3
|
||||
stk014 05e1:0893 Syntek DV4000
|
||||
spca561 060b:a001 Maxell Compact Pc PM3
|
||||
@ -178,6 +179,7 @@ spca506 06e1:a190 ADS Instant VCD
|
||||
ov534 06f8:3002 Hercules Blog Webcam
|
||||
ov534 06f8:3003 Hercules Dualpix HD Weblog
|
||||
sonixj 06f8:3004 Hercules Classic Silver
|
||||
sonixj 06f8:3008 Hercules Deluxe Optical Glass
|
||||
spca508 0733:0110 ViewQuest VQ110
|
||||
spca508 0130:0130 Clone Digital Webcam 11043
|
||||
spca501 0733:0401 Intel Create and Share
|
||||
@ -209,6 +211,7 @@ sunplus 08ca:2050 Medion MD 41437
|
||||
sunplus 08ca:2060 Aiptek PocketDV5300
|
||||
tv8532 0923:010f ICM532 cams
|
||||
mars 093a:050f Mars-Semi Pc-Camera
|
||||
mr97310a 093a:010f Sakar Digital no. 77379
|
||||
pac207 093a:2460 Qtec Webcam 100
|
||||
pac207 093a:2461 HP Webcam
|
||||
pac207 093a:2463 Philips SPC 220 NC
|
||||
@ -265,6 +268,11 @@ sonixj 0c45:60ec SN9C105+MO4000
|
||||
sonixj 0c45:60fb Surfer NoName
|
||||
sonixj 0c45:60fc LG-LIC300
|
||||
sonixj 0c45:60fe Microdia Audio
|
||||
sonixj 0c45:6100 PC Camera (SN9C128)
|
||||
sonixj 0c45:610a PC Camera (SN9C128)
|
||||
sonixj 0c45:610b PC Camera (SN9C128)
|
||||
sonixj 0c45:610c PC Camera (SN9C128)
|
||||
sonixj 0c45:610e PC Camera (SN9C128)
|
||||
sonixj 0c45:6128 Microdia/Sonix SNP325
|
||||
sonixj 0c45:612a Avant Camera
|
||||
sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix
|
||||
|
@ -26,6 +26,55 @@ Global video workflow
|
||||
|
||||
Once the last buffer is filled in, the QCI interface stops.
|
||||
|
||||
c) Capture global finite state machine schema
|
||||
|
||||
      +----+                             +---+  +----+
      | DQ |                             | Q |  | DQ |
      |    v                             |   v  |    v
    +-----------+                     +------------------------+
    |   STOP    |                     | Wait for capture start |
    +-----------+         Q           +------------------------+
+-> | QCI: stop | ------------------> | QCI: run               | <------------+
|   | DMA: stop |                     | DMA: stop              |              |
|   +-----------+             +-----> +------------------------+              |
|                            /                            |                   |
|                           /             +---+  +----+   |                   |
|capture list empty        /              | Q |  | DQ |   | QCI Irq EOF       |
|                         /               |   v  |    v   v                   |
|   +--------------------+             +----------------------+               |
|   | DMA hotlink missed |             |    Capture running   |               |
|   +--------------------+             +----------------------+               |
|   | QCI: run           |     +-----> | QCI: run             | <-+           |
|   | DMA: stop          |    /        | DMA: run             |   |           |
|   +--------------------+   /         +----------------------+   | Other     |
|     ^                     /DMA still            |               | channels  |
|     | capture list       /  running             | DMA Irq End   | not       |
|     | not empty         /                       |               | finished  |
|     |                  /                        v               | yet       |
|   +----------------------+           +----------------------+   |           |
|   |  Videobuf released   |           |  Channel completed   |   |           |
|   +----------------------+           +----------------------+   |           |
+-- | QCI: run             |           | QCI: run             | --+           |
    | DMA: run             |           | DMA: run             |               |
    +----------------------+           +----------------------+               |
               ^                      /           |                           |
               |          no overrun /            | overrun                   |
               |                    /             v                           |
    +--------------------+         /   +----------------------+               |
    |  Frame completed   |        /    |     Frame overran    |               |
    +--------------------+ <-----+     +----------------------+ restart frame |
    | QCI: run           |             | QCI: stop            | --------------+
    | DMA: run           |             | DMA: stop            |
    +--------------------+             +----------------------+

Legend: - each box is a FSM state
        - each arrow is the condition to transition to another state
        - an arrow with a comment is a mandatory transition (no condition)
        - arrow "Q" means : a buffer was enqueued
        - arrow "DQ" means : a buffer was dequeued
        - "QCI: stop" means the QCI interface is not enabled
        - "DMA: stop" means all 3 DMA channels are stopped
        - "DMA: run" means at least 1 DMA channel is still running
|
||||
|
||||
DMA usage
|
||||
---------
|
||||
|
@ -89,6 +89,11 @@ from dev (driver name followed by the bus_id, to be precise). If you set it
|
||||
up before calling v4l2_device_register then it will be untouched. If dev is
|
||||
NULL, then you *must* setup v4l2_dev->name before calling v4l2_device_register.
|
||||
|
||||
You can use v4l2_device_set_name() to set the name based on a driver name and
|
||||
a driver-global atomic_t instance. This will generate names like ivtv0, ivtv1,
|
||||
etc. If the name ends with a digit, then it will insert a dash: cx18-0,
|
||||
cx18-1, etc. This function returns the instance number.
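
The naming rule described above (append the instance number, inserting a dash when the base name already ends in a digit) can be illustrated in plain user-space C. This is not the kernel's v4l2_device_set_name() and it leaves out the atomic_t instance counter; format_instance_name() and its parameters are hypothetical names used only for this sketch.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space illustration of the naming rule: write
 * "<basename><instance>" into out, or "<basename>-<instance>" when
 * basename ends with a digit. Returns the instance number.
 */
int format_instance_name(char *out, size_t len, const char *basename,
			 int instance)
{
	size_t n = strlen(basename);

	if (n > 0 && isdigit((unsigned char)basename[n - 1]))
		snprintf(out, len, "%s-%d", basename, instance);
	else
		snprintf(out, len, "%s%d", basename, instance);

	return instance;
}
```

For the base names mentioned above, "ivtv" with instance 0 gives "ivtv0", while "cx18" with instance 1 gives "cx18-1".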
|
||||
|
||||
The first 'dev' argument is normally the struct device pointer of a pci_dev,
|
||||
usb_interface or platform_device. It is rare for dev to be NULL, but it happens
|
||||
with ISA devices or when one device creates multiple PCI devices, thus making
|
||||
@ -385,6 +390,30 @@ later date. It differs between i2c drivers and as such can be confusing.
|
||||
To see which chip variants are supported you can look in the i2c driver code
|
||||
for the i2c_device_id table. This lists all the possibilities.
|
||||
|
||||
There are two more helper functions:
|
||||
|
||||
v4l2_i2c_new_subdev_cfg: this function adds new irq and platform_data
|
||||
arguments and has both 'addr' and 'probed_addrs' arguments: if addr is not
|
||||
0 then that will be used (non-probing variant), otherwise the probed_addrs
|
||||
are probed.
|
||||
|
||||
For example: this will probe for address 0x10:
|
||||
|
||||
struct v4l2_subdev *sd = v4l2_i2c_new_subdev_cfg(v4l2_dev, adapter,
|
||||
"module_foo", "chipid", 0, NULL, 0, I2C_ADDRS(0x10));
|
||||
|
||||
v4l2_i2c_new_subdev_board uses an i2c_board_info struct which is passed
|
||||
to the i2c driver and replaces the irq, platform_data and addr arguments.
|
||||
|
||||
If the subdev supports the s_config core ops, then that op is called with
|
||||
the irq and platform_data arguments after the subdev was setup. The older
|
||||
v4l2_i2c_new_(probed_)subdev functions will call s_config as well, but with
|
||||
irq set to 0 and platform_data set to NULL.
|
||||
|
||||
Note that in the next kernel release the functions v4l2_i2c_new_subdev,
|
||||
v4l2_i2c_new_probed_subdev and v4l2_i2c_new_probed_subdev_addr will all be
|
||||
replaced by a single v4l2_i2c_new_subdev that is identical to
|
||||
v4l2_i2c_new_subdev_cfg but without the irq and platform_data arguments.
|
||||
|
||||
struct video_device
|
||||
-------------------
|
||||
|
@ -2,7 +2,7 @@
|
||||
obj- := dummy.o
|
||||
|
||||
# List of programs to build
|
||||
hostprogs-y := slabinfo
|
||||
hostprogs-y := slabinfo page-types
|
||||
|
||||
# Tell kbuild to always build the programs
|
||||
always := $(hostprogs-y)
|
||||
|
@ -75,15 +75,15 @@ Page stealing from process memory and shm is done if stealing the page would
|
||||
alleviate memory pressure on any zone in the page's node that has fallen below
|
||||
its watermark.
|
||||
|
||||
pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are
|
||||
per-zone fields, used to determine when a zone needs to be balanced. When
|
||||
the number of pages falls below pages_min, the hysteretic field low_on_memory
|
||||
gets set. This stays set till the number of free pages becomes pages_high.
|
||||
When low_on_memory is set, page allocation requests will try to free some
|
||||
pages in the zone (providing GFP_WAIT is set in the request). Orthogonal
|
||||
to this, is the decision to poke kswapd to free some zone pages. That
|
||||
decision is not hysteresis based, and is done when the number of free
|
||||
pages is below pages_low; in which case zone_wake_kswapd is also set.
|
||||
watermark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These
|
||||
are per-zone fields, used to determine when a zone needs to be balanced. When
|
||||
the number of pages falls below watermark[WMARK_MIN], the hysteretic field
|
||||
low_on_memory gets set. This stays set till the number of free pages becomes
|
||||
watermark[WMARK_HIGH]. When low_on_memory is set, page allocation requests will
|
||||
try to free some pages in the zone (providing GFP_WAIT is set in the request).
|
||||
Orthogonal to this, is the decision to poke kswapd to free some zone pages.
|
||||
That decision is not hysteresis based, and is done when the number of free
|
||||
pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set.
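
The two decisions described above can be modelled in a few lines of C. This is a toy user-space model rather than kernel code: struct zone_model and zone_model_update() are hypothetical, and recomputing zone_wake_kswapd on every update is an assumption, since the text only says when the flag gets set.

```c
#include <stdbool.h>

enum { WMARK_MIN, WMARK_LOW, WMARK_HIGH, NR_WMARK };

/* Hypothetical model of the per-zone fields discussed above. */
struct zone_model {
	unsigned long watermark[NR_WMARK];
	bool low_on_memory;	/* hysteresis based */
	bool zone_wake_kswapd;	/* not hysteresis based */
};

void zone_model_update(struct zone_model *z, unsigned long free_pages)
{
	/*
	 * Set below WMARK_MIN, cleared only once free pages reach
	 * WMARK_HIGH; in between, the previous value is kept.
	 */
	if (free_pages < z->watermark[WMARK_MIN])
		z->low_on_memory = true;
	else if (free_pages >= z->watermark[WMARK_HIGH])
		z->low_on_memory = false;

	/* Assumption: re-evaluated from scratch on every update. */
	z->zone_wake_kswapd = free_pages < z->watermark[WMARK_LOW];
}
```

The point of the model is the asymmetry: once low_on_memory is set, climbing back above WMARK_MIN is not enough to clear it, whereas the kswapd decision depends only on the current free page count.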
|
||||
|
||||
|
||||
(Good) Ideas that I have heard:
|
||||
|
698
Documentation/vm/page-types.c
Normal file
@ -0,0 +1,698 @@
|
||||
/*
|
||||
* page-types: Tool for querying page flags
|
||||
*
|
||||
* Copyright (C) 2009 Intel corporation
|
||||
* Copyright (C) 2009 Wu Fengguang <fengguang.wu@intel.com>
|
||||
*/
|
||||
|
||||
#include <stdio.h>
|
||||
#include <stdlib.h>
|
||||
#include <unistd.h>
|
||||
#include <stdint.h>
|
||||
#include <stdarg.h>
|
||||
#include <string.h>
|
||||
#include <getopt.h>
|
||||
#include <limits.h>
|
||||
#include <sys/types.h>
|
||||
#include <sys/errno.h>
|
||||
#include <sys/fcntl.h>
|
||||
|
||||
|
||||
/*
|
||||
* kernel page flags
|
||||
*/
|
||||
|
||||
#define KPF_BYTES 8
|
||||
#define PROC_KPAGEFLAGS "/proc/kpageflags"
|
||||
|
||||
/* copied from kpageflags_read() */
|
||||
#define KPF_LOCKED 0
|
||||
#define KPF_ERROR 1
|
||||
#define KPF_REFERENCED 2
|
||||
#define KPF_UPTODATE 3
|
||||
#define KPF_DIRTY 4
|
||||
#define KPF_LRU 5
|
||||
#define KPF_ACTIVE 6
|
||||
#define KPF_SLAB 7
|
||||
#define KPF_WRITEBACK 8
|
||||
#define KPF_RECLAIM 9
|
||||
#define KPF_BUDDY 10
|
||||
|
||||
/* [11-20] new additions in 2.6.31 */
|
||||
#define KPF_MMAP 11
|
||||
#define KPF_ANON 12
|
||||
#define KPF_SWAPCACHE 13
|
||||
#define KPF_SWAPBACKED 14
|
||||
#define KPF_COMPOUND_HEAD 15
|
||||
#define KPF_COMPOUND_TAIL 16
|
||||
#define KPF_HUGE 17
|
||||
#define KPF_UNEVICTABLE 18
|
||||
#define KPF_NOPAGE 20
|
||||
|
||||
/* [32-] kernel hacking assistances */
|
||||
#define KPF_RESERVED 32
|
||||
#define KPF_MLOCKED 33
|
||||
#define KPF_MAPPEDTODISK 34
|
||||
#define KPF_PRIVATE 35
|
||||
#define KPF_PRIVATE_2 36
|
||||
#define KPF_OWNER_PRIVATE 37
|
||||
#define KPF_ARCH 38
|
||||
#define KPF_UNCACHED 39
|
||||
|
||||
/* [48-] take some arbitrary free slots for expanding overloaded flags
|
||||
* not part of kernel API
|
||||
*/
|
||||
#define KPF_READAHEAD 48
|
||||
#define KPF_SLOB_FREE 49
|
||||
#define KPF_SLUB_FROZEN 50
|
||||
#define KPF_SLUB_DEBUG 51
|
||||
|
||||
#define KPF_ALL_BITS ((uint64_t)~0ULL)
|
||||
#define KPF_HACKERS_BITS (0xffffULL << 32)
|
||||
#define KPF_OVERLOADED_BITS (0xffffULL << 48)
|
||||
#define BIT(name) (1ULL << KPF_##name)
|
||||
#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL))
|
||||
|
||||
static char *page_flag_names[] = {
|
||||
[KPF_LOCKED] = "L:locked",
|
||||
[KPF_ERROR] = "E:error",
|
||||
[KPF_REFERENCED] = "R:referenced",
|
||||
[KPF_UPTODATE] = "U:uptodate",
|
||||
[KPF_DIRTY] = "D:dirty",
|
||||
[KPF_LRU] = "l:lru",
|
||||
[KPF_ACTIVE] = "A:active",
|
||||
[KPF_SLAB] = "S:slab",
|
||||
[KPF_WRITEBACK] = "W:writeback",
|
||||
[KPF_RECLAIM] = "I:reclaim",
|
||||
[KPF_BUDDY] = "B:buddy",
|
||||
|
||||
[KPF_MMAP] = "M:mmap",
|
||||
[KPF_ANON] = "a:anonymous",
|
||||
[KPF_SWAPCACHE] = "s:swapcache",
|
||||
[KPF_SWAPBACKED] = "b:swapbacked",
|
||||
[KPF_COMPOUND_HEAD] = "H:compound_head",
|
||||
[KPF_COMPOUND_TAIL] = "T:compound_tail",
|
||||
[KPF_HUGE] = "G:huge",
|
||||
[KPF_UNEVICTABLE] = "u:unevictable",
|
||||
[KPF_NOPAGE] = "n:nopage",
|
||||
|
||||
[KPF_RESERVED] = "r:reserved",
|
||||
[KPF_MLOCKED] = "m:mlocked",
|
||||
[KPF_MAPPEDTODISK] = "d:mappedtodisk",
|
||||
[KPF_PRIVATE] = "P:private",
|
||||
[KPF_PRIVATE_2] = "p:private_2",
|
||||
[KPF_OWNER_PRIVATE] = "O:owner_private",
|
||||
[KPF_ARCH] = "h:arch",
|
||||
[KPF_UNCACHED] = "c:uncached",
|
||||
|
||||
[KPF_READAHEAD] = "I:readahead",
|
||||
[KPF_SLOB_FREE] = "P:slob_free",
|
||||
[KPF_SLUB_FROZEN] = "A:slub_frozen",
|
||||
[KPF_SLUB_DEBUG] = "E:slub_debug",
|
||||
};
|
||||
|
||||
|
||||
/*
|
||||
* data structures
|
||||
*/
|
||||
|
||||
static int opt_raw; /* for kernel developers */
|
||||
static int opt_list; /* list pages (in ranges) */
|
||||
static int opt_no_summary; /* don't show summary */
|
||||
static pid_t opt_pid; /* process to walk */
|
||||
|
||||
#define MAX_ADDR_RANGES 1024
|
||||
static int nr_addr_ranges;
|
||||
static unsigned long opt_offset[MAX_ADDR_RANGES];
|
||||
static unsigned long opt_size[MAX_ADDR_RANGES];
|
||||
|
||||
#define MAX_BIT_FILTERS 64
|
||||
static int nr_bit_filters;
|
||||
static uint64_t opt_mask[MAX_BIT_FILTERS];
|
||||
static uint64_t opt_bits[MAX_BIT_FILTERS];
|
||||
|
||||
static int page_size;
|
||||
|
||||
#define PAGES_BATCH (64 << 10) /* 64k pages */
|
||||
static int kpageflags_fd;
|
||||
static uint64_t kpageflags_buf[PAGES_BATCH]; /* one 64-bit entry per page */
|
||||
|
||||
#define HASH_SHIFT 13
|
||||
#define HASH_SIZE (1 << HASH_SHIFT)
|
||||
#define HASH_MASK (HASH_SIZE - 1)
|
||||
#define HASH_KEY(flags) (flags & HASH_MASK)
|
||||
|
||||
static unsigned long total_pages;
|
||||
static unsigned long nr_pages[HASH_SIZE];
|
||||
static uint64_t page_flags[HASH_SIZE];
|
||||
|
||||
|
||||
/*
|
||||
* helper functions
|
||||
*/
|
||||
|
||||
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
|
||||
|
||||
#define min_t(type, x, y) ({ \
|
||||
type __min1 = (x); \
|
||||
type __min2 = (y); \
|
||||
__min1 < __min2 ? __min1 : __min2; })
|
||||
|
||||
unsigned long pages2mb(unsigned long pages)
|
||||
{
|
||||
return (pages * page_size) >> 20;
|
||||
}
|
||||
|
||||
void fatal(const char *x, ...)
|
||||
{
|
||||
va_list ap;
|
||||
|
||||
va_start(ap, x);
|
||||
vfprintf(stderr, x, ap);
|
||||
va_end(ap);
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* page flag names
|
||||
*/
|
||||
|
||||
char *page_flag_name(uint64_t flags)
|
||||
{
|
||||
static char buf[65];
|
||||
int present;
|
||||
int i, j;
|
||||
|
||||
for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
|
||||
present = (flags >> i) & 1;
|
||||
if (!page_flag_names[i]) {
|
||||
if (present)
|
||||
fatal("unknown flag bit %d\n", i);
|
||||
continue;
|
||||
}
|
||||
buf[j++] = present ? page_flag_names[i][0] : '_';
|
||||
}
|
||||
|
||||
return buf;
|
||||
}
|
||||
|
||||
char *page_flag_longname(uint64_t flags)
|
||||
{
|
||||
static char buf[1024];
|
||||
int i, n;
|
||||
|
||||
for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) {
|
||||
if (!page_flag_names[i])
|
||||
continue;
|
||||
if ((flags >> i) & 1)
|
||||
n += snprintf(buf + n, sizeof(buf) - n, "%s,",
|
||||
page_flag_names[i] + 2);
|
||||
}
|
||||
if (n)
|
||||
n--;
|
||||
buf[n] = '\0';
|
||||
|
||||
return buf;
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* page list and summary
|
||||
*/
|
||||
|
||||
void show_page_range(unsigned long offset, uint64_t flags)
|
||||
{
|
||||
static uint64_t flags0;
|
||||
static unsigned long index;
|
||||
static unsigned long count;
|
||||
|
||||
if (flags == flags0 && offset == index + count) {
|
||||
count++;
|
||||
return;
|
||||
}
|
||||
|
||||
if (count)
|
||||
printf("%lu\t%lu\t%s\n",
|
||||
index, count, page_flag_name(flags0));
|
||||
|
||||
flags0 = flags;
|
||||
index = offset;
|
||||
count = 1;
|
||||
}
|
||||
|
||||
void show_page(unsigned long offset, uint64_t flags)
|
||||
{
|
||||
printf("%lu\t%s\n", offset, page_flag_name(flags));
|
||||
}
|
||||
|
||||
void show_summary(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
printf(" flags\tpage-count MB"
|
||||
" symbolic-flags\t\t\tlong-symbolic-flags\n");
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(nr_pages); i++) {
|
||||
if (nr_pages[i])
|
||||
printf("0x%016llx\t%10lu %8lu %s\t%s\n",
|
||||
(unsigned long long)page_flags[i],
|
||||
nr_pages[i],
|
||||
pages2mb(nr_pages[i]),
|
||||
page_flag_name(page_flags[i]),
|
||||
page_flag_longname(page_flags[i]));
|
||||
}
|
||||
|
||||
printf(" total\t%10lu %8lu\n",
|
||||
total_pages, pages2mb(total_pages));
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* page flag filters
|
||||
*/
|
||||
|
||||
int bit_mask_ok(uint64_t flags)
|
||||
{
|
||||
int i;
|
||||
|
||||
for (i = 0; i < nr_bit_filters; i++) {
|
||||
if (opt_bits[i] == KPF_ALL_BITS) {
|
||||
if ((flags & opt_mask[i]) == 0)
|
||||
return 0;
|
||||
} else {
|
||||
if ((flags & opt_mask[i]) != opt_bits[i])
|
||||
return 0;
|
||||
}
|
||||
}
|
||||
|
||||
return 1;
|
||||
}
|
||||
|
||||
uint64_t expand_overloaded_flags(uint64_t flags)
|
||||
{
|
||||
/* SLOB/SLUB overload several page flags */
|
||||
if (flags & BIT(SLAB)) {
|
||||
if (flags & BIT(PRIVATE))
|
||||
flags ^= BIT(PRIVATE) | BIT(SLOB_FREE);
|
||||
if (flags & BIT(ACTIVE))
|
||||
flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN);
|
||||
if (flags & BIT(ERROR))
|
||||
flags ^= BIT(ERROR) | BIT(SLUB_DEBUG);
|
||||
}
|
||||
|
||||
/* PG_reclaim is overloaded as PG_readahead in the read path */
|
||||
if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM))
|
||||
flags ^= BIT(RECLAIM) | BIT(READAHEAD);
|
||||
|
||||
return flags;
|
||||
}
|
||||
|
||||
uint64_t well_known_flags(uint64_t flags)
|
||||
{
|
||||
/* hide flags intended only for kernel hackers */
|
||||
flags &= ~KPF_HACKERS_BITS;
|
||||
|
||||
/* hide non-hugeTLB compound pages */
|
||||
if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE)))
|
||||
flags &= ~BITS_COMPOUND;
|
||||
|
||||
return flags;
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* page frame walker
|
||||
*/
|
||||
|
||||
int hash_slot(uint64_t flags)
|
||||
{
|
||||
int k = HASH_KEY(flags);
|
||||
int i;
|
||||
|
||||
/* Explicitly reserve slot 0 for flags 0: the following logic
|
||||
* cannot distinguish an unoccupied slot from slot (flags==0).
|
||||
*/
|
||||
if (flags == 0)
|
||||
return 0;
|
||||
|
||||
/* search through the remaining (HASH_SIZE-1) slots */
|
||||
for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) {
|
||||
if (!k || k >= ARRAY_SIZE(page_flags))
|
||||
k = 1;
|
||||
if (page_flags[k] == 0) {
|
||||
page_flags[k] = flags;
|
||||
return k;
|
||||
}
|
||||
if (page_flags[k] == flags)
|
||||
return k;
|
||||
}
|
||||
|
||||
fatal("hash table full: bump up HASH_SHIFT?\n");
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
void add_page(unsigned long offset, uint64_t flags)
|
||||
{
|
||||
flags = expand_overloaded_flags(flags);
|
||||
|
||||
if (!opt_raw)
|
||||
flags = well_known_flags(flags);
|
||||
|
||||
if (!bit_mask_ok(flags))
|
||||
return;
|
||||
|
||||
if (opt_list == 1)
|
||||
show_page_range(offset, flags);
|
||||
else if (opt_list == 2)
|
||||
show_page(offset, flags);
|
||||
|
||||
nr_pages[hash_slot(flags)]++;
|
||||
total_pages++;
|
||||
}
|
||||
|
||||
void walk_pfn(unsigned long index, unsigned long count)
|
||||
{
|
||||
unsigned long batch;
|
||||
long n; /* signed: read() returns -1 on error */
|
||||
unsigned long i;
|
||||
|
||||
if (index > ULONG_MAX / KPF_BYTES)
|
||||
fatal("index overflow: %lu\n", index);
|
||||
|
||||
lseek(kpageflags_fd, index * KPF_BYTES, SEEK_SET);
|
||||
|
||||
while (count) {
|
||||
batch = min_t(unsigned long, count, PAGES_BATCH);
|
||||
n = read(kpageflags_fd, kpageflags_buf, batch * KPF_BYTES);
|
||||
if (n == 0)
|
||||
break;
|
||||
if (n < 0) {
|
||||
perror(PROC_KPAGEFLAGS);
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
if (n % KPF_BYTES != 0)
|
||||
fatal("partial read: %lu bytes\n", n);
|
||||
n = n / KPF_BYTES;
|
||||
|
||||
for (i = 0; i < n; i++)
|
||||
add_page(index + i, kpageflags_buf[i]);
|
||||
|
||||
index += batch;
|
||||
count -= batch;
|
||||
}
|
||||
}
|
||||
|
||||
void walk_addr_ranges(void)
|
||||
{
|
||||
int i;
|
||||
|
||||
kpageflags_fd = open(PROC_KPAGEFLAGS, O_RDONLY);
|
||||
if (kpageflags_fd < 0) {
|
||||
perror(PROC_KPAGEFLAGS);
|
||||
exit(EXIT_FAILURE);
|
||||
}
|
||||
|
||||
if (!nr_addr_ranges)
|
||||
walk_pfn(0, ULONG_MAX);
|
||||
|
||||
for (i = 0; i < nr_addr_ranges; i++)
|
||||
walk_pfn(opt_offset[i], opt_size[i]);
|
||||
|
||||
close(kpageflags_fd);
|
||||
}
|
||||
|
||||
|
||||
/*
|
||||
* user interface
|
||||
*/
|
||||
|
||||
const char *page_flag_type(uint64_t flag)
|
||||
{
|
||||
if (flag & KPF_HACKERS_BITS)
|
||||
return "(r)";
|
||||
if (flag & KPF_OVERLOADED_BITS)
|
||||
return "(o)";
|
||||
return " ";
|
||||
}
|
||||
|
||||
void usage(void)
|
||||
{
|
||||
int i, j;
|
||||
|
||||
printf(
|
||||
"page-types [options]\n"
|
||||
" -r|--raw Raw mode, for kernel developers\n"
|
||||
" -a|--addr addr-spec Walk a range of pages\n"
|
||||
" -b|--bits bits-spec Walk pages with specified bits\n"
|
||||
#if 0 /* planned features */
|
||||
" -p|--pid pid Walk process address space\n"
|
||||
" -f|--file filename Walk file address space\n"
|
||||
#endif
|
||||
" -l|--list Show page details in ranges\n"
|
||||
" -L|--list-each Show page details one by one\n"
|
||||
" -N|--no-summary Don't show summary info\n"
|
||||
" -h|--help Show this usage message\n"
|
||||
"addr-spec:\n"
|
||||
" N one page at offset N (unit: pages)\n"
|
||||
" N+M pages range from N to N+M-1\n"
|
||||
" N,M pages range from N to M-1\n"
|
||||
" N, pages range from N to end\n"
|
||||
" ,M pages range from 0 to M-1\n"
|
||||
"bits-spec:\n"
|
||||
" bit1,bit2 (flags & (bit1|bit2)) != 0\n"
|
||||
" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n"
|
||||
" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n"
|
||||
" =bit1,bit2 flags == (bit1|bit2)\n"
|
||||
"bit-names:\n"
|
||||
);
|
||||
|
||||
for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
|
||||
if (!page_flag_names[i])
|
||||
continue;
|
||||
printf("%16s%s", page_flag_names[i] + 2,
|
||||
page_flag_type(1ULL << i));
|
||||
if (++j > 3) {
|
||||
j = 0;
|
||||
putchar('\n');
|
||||
}
|
||||
}
|
||||
printf("\n "
|
||||
"(r) raw mode bits (o) overloaded bits\n");
|
||||
}
|
||||
|
||||
unsigned long long parse_number(const char *str)
|
||||
{
|
||||
unsigned long long n;
|
||||
|
||||
n = strtoll(str, NULL, 0);
|
||||
|
||||
if (n == 0 && str[0] != '0')
|
||||
fatal("invalid name or number: %s\n", str);
|
||||
|
||||
return n;
|
||||
}
|
||||
|
||||
void parse_pid(const char *str)
|
||||
{
|
||||
opt_pid = parse_number(str);
|
||||
}
|
||||
|
||||
void parse_file(const char *name)
|
||||
{
|
||||
}
|
||||
|
||||
void add_addr_range(unsigned long offset, unsigned long size)
|
||||
{
|
||||
if (nr_addr_ranges >= MAX_ADDR_RANGES)
|
||||
fatal("too many addr ranges\n");
|
||||
|
||||
opt_offset[nr_addr_ranges] = offset;
|
||||
opt_size[nr_addr_ranges] = size;
|
||||
nr_addr_ranges++;
|
||||
}
|
||||
|
||||
void parse_addr_range(const char *optarg)
|
||||
{
|
||||
unsigned long offset;
|
||||
unsigned long size;
|
||||
char *p;
|
||||
|
||||
p = strchr(optarg, ',');
|
||||
if (!p)
|
||||
p = strchr(optarg, '+');
|
||||
|
||||
if (p == optarg) {
|
||||
offset = 0;
|
||||
size = parse_number(p + 1);
|
||||
} else if (p) {
|
||||
offset = parse_number(optarg);
|
||||
if (p[1] == '\0')
|
||||
size = ULONG_MAX;
|
||||
else {
|
||||
size = parse_number(p + 1);
|
||||
if (*p == ',') {
|
||||
if (size < offset)
|
||||
fatal("invalid range: %lu,%lu\n",
|
||||
offset, size);
|
||||
size -= offset;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
offset = parse_number(optarg);
|
||||
size = 1;
|
||||
}
|
||||
|
||||
add_addr_range(offset, size);
|
||||
}
|
||||
|
||||
void add_bits_filter(uint64_t mask, uint64_t bits)
|
||||
{
|
||||
if (nr_bit_filters >= MAX_BIT_FILTERS)
|
||||
fatal("too many bit filters\n");
|
||||
|
||||
opt_mask[nr_bit_filters] = mask;
|
||||
opt_bits[nr_bit_filters] = bits;
|
||||
nr_bit_filters++;
|
||||
}
|
||||
|
||||
uint64_t parse_flag_name(const char *str, int len)
|
||||
{
|
||||
int i;
|
||||
|
||||
if (!*str || !len)
|
||||
return 0;
|
||||
|
||||
if (len <= 8 && !strncmp(str, "compound", len))
|
||||
return BITS_COMPOUND;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) {
|
||||
if (!page_flag_names[i])
|
||||
continue;
|
||||
if (!strncmp(str, page_flag_names[i] + 2, len))
|
||||
return 1ULL << i;
|
||||
}
|
||||
|
||||
return parse_number(str);
|
||||
}
|
||||
|
||||
uint64_t parse_flag_names(const char *str, int all)
|
||||
{
|
||||
const char *p = str;
|
||||
uint64_t flags = 0;
|
||||
|
||||
while (1) {
|
||||
if (*p == ',' || *p == '=' || *p == '\0') {
|
||||
if ((*str != '~') || (*str == '~' && all && *++str))
|
||||
flags |= parse_flag_name(str, p - str);
|
||||
if (*p != ',')
|
||||
break;
|
||||
str = p + 1;
|
||||
}
|
||||
p++;
|
||||
}
|
||||
|
||||
return flags;
|
||||
}
|
||||
|
||||
void parse_bits_mask(const char *optarg)
|
||||
{
|
||||
uint64_t mask;
|
||||
uint64_t bits;
|
||||
const char *p;
|
||||
|
||||
p = strchr(optarg, '=');
|
||||
if (p == optarg) {
|
||||
mask = KPF_ALL_BITS;
|
||||
bits = parse_flag_names(p + 1, 0);
|
||||
} else if (p) {
|
||||
mask = parse_flag_names(optarg, 0);
|
||||
bits = parse_flag_names(p + 1, 0);
|
||||
} else if (strchr(optarg, '~')) {
|
||||
mask = parse_flag_names(optarg, 1);
|
||||
bits = parse_flag_names(optarg, 0);
|
||||
} else {
|
||||
mask = parse_flag_names(optarg, 0);
|
||||
bits = KPF_ALL_BITS;
|
||||
}
|
||||
|
||||
add_bits_filter(mask, bits);
|
||||
}
|
||||
|
||||
|
||||
struct option opts[] = {
|
||||
{ "raw" , 0, NULL, 'r' },
|
||||
{ "pid" , 1, NULL, 'p' },
|
||||
{ "file" , 1, NULL, 'f' },
|
||||
{ "addr" , 1, NULL, 'a' },
|
||||
{ "bits" , 1, NULL, 'b' },
|
||||
{ "list" , 0, NULL, 'l' },
|
||||
{ "list-each" , 0, NULL, 'L' },
|
||||
{ "no-summary", 0, NULL, 'N' },
|
||||
{ "help" , 0, NULL, 'h' },
|
||||
{ NULL , 0, NULL, 0 }
|
||||
};
|
||||
|
||||
int main(int argc, char *argv[])
|
||||
{
|
||||
int c;
|
||||
|
||||
page_size = getpagesize();
|
||||
|
||||
while ((c = getopt_long(argc, argv,
|
||||
"rp:f:a:b:lLNh", opts, NULL)) != -1) {
|
||||
switch (c) {
|
||||
case 'r':
|
||||
opt_raw = 1;
|
||||
break;
|
||||
case 'p':
|
||||
parse_pid(optarg);
|
||||
break;
|
||||
case 'f':
|
||||
parse_file(optarg);
|
||||
break;
|
||||
case 'a':
|
||||
parse_addr_range(optarg);
|
||||
break;
|
||||
case 'b':
|
||||
parse_bits_mask(optarg);
|
||||
break;
|
||||
case 'l':
|
||||
opt_list = 1;
|
||||
break;
|
||||
case 'L':
|
||||
opt_list = 2;
|
||||
break;
|
||||
case 'N':
|
||||
opt_no_summary = 1;
|
||||
break;
|
||||
case 'h':
|
||||
usage();
|
||||
exit(0);
|
||||
default:
|
||||
usage();
|
||||
exit(1);
|
||||
}
|
||||
}
|
||||
|
||||
if (opt_list == 1)
|
||||
printf("offset\tcount\tflags\n");
|
||||
if (opt_list == 2)
|
||||
printf("offset\tflags\n");
|
||||
|
||||
walk_addr_ranges();
|
||||
|
||||
if (opt_list == 1)
|
||||
show_page_range(0, 0); /* drain the buffer */
|
||||
|
||||
if (opt_no_summary)
|
||||
return 0;
|
||||
|
||||
if (opt_list)
|
||||
printf("\n\n");
|
||||
|
||||
show_summary();
|
||||
|
||||
return 0;
|
||||
}
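The bits-spec forms listed in usage() all reduce to a (mask, bits) pair, as stored by parse_bits_mask() and checked by bit_mask_ok(). The following is a minimal standalone sketch of that check for a single filter. filter_matches() and ALL_BITS are illustrative names, not part of page-types.c, and the assert is an extra sanity check that the tool itself does not perform.

```c
#include <assert.h>
#include <stdint.h>

#define ALL_BITS ((uint64_t)~0ULL)	/* same value as KPF_ALL_BITS */

/*
 * Apply one (mask, bits) filter to a flag word.
 *   bits == ALL_BITS: match if any bit in mask is set
 *   otherwise:        match if the masked flags equal bits exactly
 */
int filter_matches(uint64_t flags, uint64_t mask, uint64_t bits)
{
	/* Sketch-only check: bits outside the mask could never match. */
	assert(bits == ALL_BITS || (bits & ~mask) == 0);

	if (bits == ALL_BITS)
		return (flags & mask) != 0;
	return (flags & mask) == bits;
}
```

With lru = 1ULL << 5 and active = 1ULL << 6, "-b lru,active" stores (lru|active, ALL_BITS) and matches any page with either bit set; "-b lru,~active" stores (lru|active, lru) and matches LRU pages that are not active; "-b =lru,active" stores (ALL_BITS, lru|active) and matches only pages whose flags are exactly those two bits.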
|
@ -12,9 +12,9 @@ There are three components to pagemap:
|
||||
value for each virtual page, containing the following data (from
|
||||
fs/proc/task_mmu.c, above pagemap_read):
|
||||
|
||||
* Bits 0-55 page frame number (PFN) if present
|
||||
* Bits 0-54 page frame number (PFN) if present
|
||||
* Bits 0-4 swap type if swapped
|
||||
* Bits 5-55 swap offset if swapped
|
||||
* Bits 5-54 swap offset if swapped
|
||||
* Bits 55-60 page shift (page size = 1<<page shift)
|
||||
* Bit 61 reserved for future use
|
||||
* Bit 62 page swapped
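As a rough illustration of the updated layout (the lines reading "Bits 0-54" and "Bits 5-54"), a 64-bit pagemap entry could be unpacked as follows. The pm_* names are hypothetical helpers, not kernel or libc API, and only the fields visible in this hunk are decoded.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `bits` bits starting at bit `shift` from a pagemap entry. */
uint64_t pm_field(uint64_t entry, unsigned int shift, unsigned int bits)
{
	assert(bits > 0 && bits < 64 && shift + bits <= 64);
	return (entry >> shift) & ((1ULL << bits) - 1);
}

/* Bit 62: page is swapped */
int pm_swapped(uint64_t entry)
{
	return (int)pm_field(entry, 62, 1);
}

/* Bits 0-54: page frame number, meaningful only if the page is present */
uint64_t pm_pfn(uint64_t entry)
{
	return pm_field(entry, 0, 55);
}

/* Bits 0-4: swap type, meaningful only if the page is swapped */
uint64_t pm_swap_type(uint64_t entry)
{
	return pm_field(entry, 0, 5);
}

/* Bits 5-54: swap offset, meaningful only if the page is swapped */
uint64_t pm_swap_offset(uint64_t entry)
{
	return pm_field(entry, 5, 50);
}

/* Bits 55-60: page shift (page size = 1 << page shift) */
unsigned int pm_page_shift(uint64_t entry)
{
	return (unsigned int)pm_field(entry, 55, 6);
}
```

The PFN and the swap fields share the low bits, so a caller has to test pm_swapped() first to know which interpretation applies.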
|
||||
@ -36,7 +36,7 @@ There are three components to pagemap:
|
||||
* /proc/kpageflags. This file contains a 64-bit set of flags for each
|
||||
page, indexed by PFN.
|
||||
|
||||
The flags are (from fs/proc/proc_misc, above kpageflags_read):
|
||||
The flags are (from fs/proc/page.c, above kpageflags_read):
|
||||
|
||||
0. LOCKED
|
||||
1. ERROR
|
||||
@ -49,6 +49,68 @@ There are three components to pagemap:
|
||||
8. WRITEBACK
|
||||
9. RECLAIM
|
||||
10. BUDDY
|
||||
11. MMAP
|
||||
12. ANON
|
||||
13. SWAPCACHE
|
||||
14. SWAPBACKED
|
||||
15. COMPOUND_HEAD
|
||||
16. COMPOUND_TAIL
|
||||
17. HUGE
|
||||
18. UNEVICTABLE
|
||||
20. NOPAGE
|
||||
|
||||
Short descriptions of the page flags:
|
||||
|
||||
0. LOCKED
|
||||
page is being locked for exclusive access, eg. by undergoing read/write IO
|
||||
|
||||
7. SLAB
|
||||
page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
|
||||
When a compound page is used, SLUB/SLQB will only set this flag on the head
|
||||
page; SLOB will not flag it at all.
|
||||
|
||||
10. BUDDY
|
||||
a free memory block managed by the buddy system allocator
|
||||
The buddy system organizes free memory in blocks of various orders.
|
||||
An order N block has 2^N physically contiguous pages, with the BUDDY flag
|
||||
set for and _only_ for the first page.
|
||||
|
||||
15. COMPOUND_HEAD
|
||||
16. COMPOUND_TAIL
|
||||
A compound page with order N consists of 2^N physically contiguous pages.
|
||||
A compound page with order 2 takes the form of "HTTT", where H denotes its
|
||||
head page and T denotes its tail page(s). The major consumers of compound
|
||||
pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), the SLUB etc.
|
||||
memory allocators and various device drivers. However in this interface,
|
||||
only huge/giga pages are made visible to end users.
|
||||
17. HUGE
|
||||
this is an integral part of a HugeTLB page
|
||||
|
||||
20. NOPAGE
|
||||
no page frame exists at the requested address
|
||||
|
||||
[IO related page flags]
|
||||
1. ERROR IO error occurred
|
||||
3. UPTODATE page has up-to-date data
|
||||
ie. for file backed page: (in-memory data revision >= on-disk one)
|
||||
4. DIRTY page has been written to, hence contains new data
|
||||
ie. for file backed page: (in-memory data revision > on-disk one)
|
||||
8. WRITEBACK page is being synced to disk
|
||||
|
||||
[LRU related page flags]
|
||||
5. LRU page is in one of the LRU lists
|
||||
6. ACTIVE page is in the active LRU list
|
||||
18. UNEVICTABLE page is in the unevictable (non-)LRU list
|
||||
It is somehow pinned and not a candidate for LRU page reclaims,
|
||||
eg. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments
|
||||
2. REFERENCED page has been referenced since last LRU list enqueue/requeue
|
||||
9. RECLAIM page will be reclaimed soon after its pageout IO completed
|
||||
11. MMAP a memory mapped page
|
||||
12. ANON a memory mapped page that is not part of a file
|
||||
13. SWAPCACHE page is mapped to swap space, ie. has an associated swap entry
|
||||
14. SWAPBACKED page is backed by swap/RAM
|
||||
|
||||
The page-types tool in this directory can be used to query the above flags.
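For a single page, the same information can be fetched without the full tool. A minimal sketch, assuming the 8-bytes-per-PFN layout described above and read access to /proc/kpageflags (normally root only). The function names are illustrative and only loosely follow walk_pfn() in page-types.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define KPF_ENTRY_BYTES	8	/* one 64-bit flag word per page frame */

/* Byte offset of the flag word for a given PFN. */
off_t kpageflags_offset(unsigned long pfn)
{
	/* Same spirit as the overflow check in walk_pfn(). */
	assert(pfn <= (unsigned long)LONG_MAX / KPF_ENTRY_BYTES);
	return (off_t)(pfn * KPF_ENTRY_BYTES);
}

/* Test one flag, e.g. bit 10 (BUDDY) or bit 17 (HUGE). */
int kpf_test(uint64_t flags, unsigned int bit)
{
	assert(bit < 64);
	return (int)((flags >> bit) & 1);
}

/* Read the flag word for one PFN. Returns 0 on success, -1 on failure. */
int read_kpageflags(unsigned long pfn, uint64_t *flags)
{
	ssize_t n;
	int fd = open("/proc/kpageflags", O_RDONLY);

	if (fd < 0)
		return -1;
	if (lseek(fd, kpageflags_offset(pfn), SEEK_SET) == (off_t)-1) {
		close(fd);
		return -1;
	}
	n = read(fd, flags, sizeof(*flags));
	close(fd);
	return n == (ssize_t)sizeof(*flags) ? 0 : -1;
}
```

Opening the file once and reading many entries per call, as page-types does, is the better choice when scanning ranges; the per-call open here only keeps the sketch short.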
|
||||
|
||||
Using pagemap to do something useful:
|
||||
|
||||
|
95
Documentation/watchdog/hpwdt.txt
Normal file
@ -0,0 +1,95 @@
|
||||
Last reviewed: 06/02/2009
|
||||
|
||||
HP iLO2 NMI Watchdog Driver
|
||||
NMI sourcing for iLO2 based ProLiant Servers
|
||||
Documentation and Driver by
|
||||
Thomas Mingarelli <thomas.mingarelli@hp.com>
|
||||
|
||||
The HP iLO2 NMI Watchdog driver is a kernel module that provides basic
|
||||
watchdog functionality and the added benefit of NMI sourcing. Both the
|
||||
watchdog functionality and the NMI sourcing capability need to be enabled
|
||||
by the user. Remember that the two modes are not dependent on one another.
|
||||
A user can have the NMI sourcing without the watchdog timer and vice-versa.
|
||||
|
||||
Watchdog functionality is enabled like any other common watchdog driver. That
|
||||
is, an application needs to be started that kicks off the watchdog timer. A
|
||||
basic application exists in the Documentation/watchdog/src directory called
|
||||
watchdog-test.c. Simply compile the C file and kick it off. If the system
|
||||
gets into a bad state and hangs, the HP ProLiant iLO 2 timer register will
|
||||
not be updated in a timely fashion and a hardware system reset (also known as
|
||||
an Automatic Server Recovery (ASR)) event will occur.
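The keepalive loop that such an application runs can be sketched as below. This is not the watchdog-test.c source; watchdog_kick() and watchdog_feed() are hypothetical helper names, and the only driver interface assumed is the generic WDIOC_KEEPALIVE ioctl from <linux/watchdog.h>.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/watchdog.h>

/* Refresh the watchdog timer once. Returns 0 on success, -1 on error. */
int watchdog_kick(int fd)
{
	return ioctl(fd, WDIOC_KEEPALIVE, 0) < 0 ? -1 : 0;
}

/*
 * Kick the watchdog `count` times, sleeping `interval` seconds after
 * each kick. Stops and returns -1 on the first failed kick.
 */
int watchdog_feed(int fd, unsigned int interval, unsigned int count)
{
	assert(interval > 0);

	while (count--) {
		if (watchdog_kick(fd) < 0)
			return -1;
		sleep(interval);
	}
	return 0;
}
```

A caller would open /dev/watchdog with O_WRONLY and then call watchdog_feed() with an interval comfortably shorter than soft_margin. If the loop stops, for example because the system hangs, the timer is no longer refreshed and the ASR described above follows.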
|
||||
|
||||
The hpwdt driver also has four (4) module parameters. They are the following:
|
||||
|
||||
soft_margin - allows the user to set the watchdog timer value
|
||||
allow_kdump - allows the user to save off a kernel dump image after an NMI
|
||||
nowayout - basic watchdog parameter that does not allow the timer to
|
||||
be restarted or an impending ASR to be escaped.
|
||||
priority - determines whether or not the hpwdt driver is first on the
|
||||
die_notify list to handle NMIs or last. The default value
|
||||
for this module parameter is 0 or LAST. If the user wants to
|
||||
enable NMI sourcing then reload the hpwdt driver with
|
||||
priority=1 (and boot with nmi_watchdog=0).
|
||||
|
||||
NOTE: More information about watchdog drivers in general, including the ioctl
|
||||
interface to /dev/watchdog can be found in
|
||||
Documentation/watchdog/watchdog-api.txt and Documentation/IPMI.txt.
|
||||
|
||||
The priority parameter was introduced due to other kernel software that relied
|
||||
on handling NMIs (like oprofile). Keeping hpwdt's priority at 0 (or LAST)
|
||||
allows users of NMIs for non-critical events to work as expected.
|
||||
|
||||
The NMI sourcing capability is disabled by default due to the inability to
|
||||
distinguish between "NMI Watchdog Ticks" and "HW generated NMI events" in the
|
||||
Linux kernel. What this means is that the hpwdt NMI handler code is called
|
||||
each time the NMI signal fires off. This could amount to several thousands of
|
||||
NMIs in a matter of seconds. If a user sees the Linux kernel's "dazed and
|
||||
confused" message in the logs or if the system gets into a hung state, then
|
||||
the hpwdt driver can be reloaded with the "priority" module parameter set
|
||||
(priority=1).
|
||||
|
||||
1. If the kernel has not been booted with nmi_watchdog turned off then
|
||||
edit /boot/grub/menu.lst and place the nmi_watchdog=0 at the end of the
|
||||
currently booting kernel line.
|
||||
2. Reboot the server
|
||||
3. Once the system comes up, run rmmod hpwdt
|
||||
4. insmod /lib/modules/`uname -r`/kernel/drivers/char/watchdog/hpwdt.ko priority=1
|
||||
|
||||
Now, the hpwdt can successfully receive and source the NMI and provide a log
|
||||
message that details the reason for the NMI (as determined by the HP BIOS).
|
||||
|
||||
Below is a list of NMIs the HP BIOS understands along with the associated
|
||||
code (reason):
|
||||
|
||||
No source found 00h
|
||||
|
||||
Uncorrectable Memory Error 01h
|
||||
|
||||
ASR NMI 1Bh
|
||||
|
||||
PCI Parity Error 20h
|
||||
|
||||
NMI Button Press 27h
|
||||
|
||||
SB_BUS_NMI 28h
|
||||
|
||||
ILO Doorbell NMI 29h
|
||||
|
||||
ILO IOP NMI 2Ah
|
||||
|
||||
ILO Watchdog NMI 2Bh
|
||||
|
||||
Proc Throt NMI 2Ch
|
||||
|
||||
Front Side Bus NMI 2Dh
|
||||
|
||||
PCI Express Error 2Fh
|
||||
|
||||
DMA controller NMI 30h
|
||||
|
||||
Hypertransport/CSI Error 31h
|
||||
|
||||
|
||||
|
||||
-- Tom Mingarelli
|
||||
(thomas.mingarelli@hp.com)
|
Some files were not shown because too many files have changed in this diff.