Merge branch 'linus' into sched/core

Merge reason: we will merge a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
This commit is contained in:
Ingo Molnar 2009-06-29 09:16:13 +02:00
commit 348b346b23
5214 changed files with 520144 additions and 339238 deletions

View File

@ -1253,6 +1253,10 @@ S: 8124 Constitution Apt. 7
S: Sterling Heights, Michigan 48313
S: USA
N: Wolfgang Grandegger
E: wg@grandegger.com
D: Controller Area Network (device drivers)
N: William Greathouse
E: wgreathouse@smva.com
E: wgreathouse@myfavoritei.com

View File

@ -122,3 +122,10 @@ Description:
This symbolic link appears when a device is a Virtual Function.
The symbolic link points to the PCI device sysfs entry of the
Physical Function this device associates with.
What: /sys/bus/pci/slots/.../module
Date: June 2009
Contact: linux-pci@vger.kernel.org
Description:
This symbolic link points to the PCI hotplug controller driver
module that manages the hotplug slot.

View File

@ -0,0 +1,125 @@
What: /sys/class/mtd/
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
The mtd/ class subdirectory belongs to the MTD subsystem
(MTD core).
What: /sys/class/mtd/mtdX/
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
The /sys/class/mtd/mtd{0,1,2,3,...} directories correspond
to each /dev/mtdX character device. These may represent
physical/simulated flash devices, partitions on a flash
device, or concatenated flash devices. They exist regardless
of whether CONFIG_MTD_CHAR is actually enabled.
What: /sys/class/mtd/mtdXro/
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
These directories provide the corresponding read-only device
nodes for /sys/class/mtd/mtdX/ . They are only created
(for the benefit of udev) if CONFIG_MTD_CHAR is enabled.
What: /sys/class/mtd/mtdX/dev
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
Major and minor numbers of the character device corresponding
to this MTD device (in <major>:<minor> format). This is the
read-write device so <minor> will be even.
What: /sys/class/mtd/mtdXro/dev
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
Major and minor numbers of the character device corresponding
to the read-only variant of thie MTD device (in
<major>:<minor> format). In this case <minor> will be odd.
What: /sys/class/mtd/mtdX/erasesize
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
"Major" erase size for the device. If numeraseregions is
zero, this is the eraseblock size for the entire device.
Otherwise, the MEMGETREGIONCOUNT/MEMGETREGIONINFO ioctls
can be used to determine the actual eraseblock layout.
What: /sys/class/mtd/mtdX/flags
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
A hexadecimal value representing the device flags, ORed
together:
0x0400: MTD_WRITEABLE - device is writable
0x0800: MTD_BIT_WRITEABLE - single bits can be flipped
0x1000: MTD_NO_ERASE - no erase necessary
0x2000: MTD_POWERUP_LOCK - always locked after reset
What: /sys/class/mtd/mtdX/name
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
A human-readable ASCII name for the device or partition.
This will match the name in /proc/mtd .
What: /sys/class/mtd/mtdX/numeraseregions
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
For devices that have variable eraseblock sizes, this
provides the total number of erase regions. Otherwise,
it will read back as zero.
What: /sys/class/mtd/mtdX/oobsize
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
Number of OOB bytes per page.
What: /sys/class/mtd/mtdX/size
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
Total size of the device/partition, in bytes.
What: /sys/class/mtd/mtdX/type
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
One of the following ASCII strings, representing the device
type:
absent, ram, rom, nor, nand, dataflash, ubi, unknown
What: /sys/class/mtd/mtdX/writesize
Date: April 2009
KernelVersion: 2.6.29
Contact: linux-mtd@lists.infradead.org
Description:
Minimal writable flash unit size. This will always be
a positive integer.
In the case of NOR flash it is 1 (even though individual
bits can be cleared).
In the case of NAND flash it is one NAND page (or a
half page, or a quarter page).
In the case of ECC NOR, it is the ECC block size.

View File

@ -79,3 +79,13 @@ Description:
This file is read-only and shows the number of
kilobytes of data that have been written to this
filesystem since it was mounted.
What: /sys/fs/ext4/<disk>/inode_goal
Date: June 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which (if non-zero) controls the goal
inode used by the inode allocator in p0reference to
all other allocation hueristics. This is intended for
debugging use only, and should be 0 on production
systems.

View File

@ -0,0 +1,73 @@
What: /sys/class/pps/
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ directory will contain files and
directories that will provide a unified interface to
the PPS sources.
What: /sys/class/pps/ppsX/
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/ directory is related to X-th
PPS source into the system. Each directory will
contain files to manage and control its PPS source.
What: /sys/class/pps/ppsX/assert
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/assert file reports the assert events
and the assert sequence number of the X-th source in the form:
<secs>.<nsec>#<sequence>
If the source has no assert events the content of this file
is empty.
What: /sys/class/pps/ppsX/clear
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/clear file reports the clear events
and the clear sequence number of the X-th source in the form:
<secs>.<nsec>#<sequence>
If the source has no clear events the content of this file
is empty.
What: /sys/class/pps/ppsX/mode
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/mode file reports the functioning
mode of the X-th source in hexadecimal encoding.
Please, refer to linux/include/linux/pps.h for further
info.
What: /sys/class/pps/ppsX/echo
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/echo file reports if the X-th does
or does not support an "echo" function.
What: /sys/class/pps/ppsX/name
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/name file reports the name of the
X-th source.
What: /sys/class/pps/ppsX/path
Date: February 2008
Contact: Rodolfo Giometti <giometti@linux.it>
Description:
The /sys/class/pps/ppsX/path file reports the path name of
the device connected with the X-th source.
If the source is not connected with any device the content
of this file is empty.

View File

@ -72,6 +72,13 @@ assembling the 16-bit boot code, removing the need for as86 to compile
your kernel. This change does, however, mean that you need a recent
release of binutils.
Perl
----
You will need perl 5 and the following modules: Getopt::Long, Getopt::Std,
File::Basename, and File::Find to build the kernel.
System utilities
================

View File

@ -106,7 +106,7 @@
number of errors are printk'ed including a full stack trace.
</para>
<para>
The statistics are available via debugfs/debug_objects/stats.
The statistics are available via /sys/kernel/debug/debug_objects/stats.
They provide information about the number of warnings and the
number of successful fixups along with information about the
usage of the internal tracking objects and the state of the

View File

@ -145,7 +145,6 @@ usage should require reading the full document.
interface in STA mode at first!
</para>
!Finclude/net/mac80211.h ieee80211_if_init_conf
!Finclude/net/mac80211.h ieee80211_if_conf
</chapter>
<chapter id="rx-tx">

View File

@ -61,6 +61,10 @@ be initiated although firmwares have no _OSC support. To enable the
walkaround, pls. add aerdriver.forceload=y to kernel boot parameter line
when booting kernel. Note that forceload=n by default.
nosourceid, another parameter of type bool, can be used when broken
hardware (mostly chipsets) has root ports that cannot obtain the reporting
source ID. nosourceid=n by default.
2.3 AER error output
When a PCI-E AER error is captured, an error message will be outputed to
console. If it's a correctable error, it is outputed as a warning.
@ -246,3 +250,24 @@ with the PCI Express AER Root driver?
A: It could call the helper functions to enable AER in devices and
cleanup uncorrectable status register. Pls. refer to section 3.3.
4. Software error injection
Debugging PCIE AER error recovery code is quite difficult because it
is hard to trigger real hardware errors. Software based error
injection can be used to fake various kinds of PCIE errors.
First you should enable PCIE AER software error injection in kernel
configuration, that is, following item should be in your .config.
CONFIG_PCIEAER_INJECT=y or CONFIG_PCIEAER_INJECT=m
After reboot with new kernel or insert the module, a device file named
/dev/aer_inject should be created.
Then, you need a user space tool named aer-inject, which can be gotten
from:
http://www.kernel.org/pub/linux/utils/pci/aer-inject/
More information about aer-inject can be found in the document comes
with its source code.

View File

@ -54,7 +54,7 @@ kernel patches.
CONFIG_PREEMPT.
14: If the patch affects IO/Disk, etc: has been tested with and without
CONFIG_LBD.
CONFIG_LBDAF.
15: All codepaths have been exercised with all lockdep features enabled.

View File

@ -246,7 +246,8 @@ void print_ioacct(struct taskstats *t)
int main(int argc, char *argv[])
{
int c, rc, rep_len, aggr_len, len2, cmd_type;
int c, rc, rep_len, aggr_len, len2;
int cmd_type = TASKSTATS_CMD_ATTR_UNSPEC;
__u16 id;
__u32 mypid;

View File

@ -229,10 +229,10 @@ kernel. It is the use of atomic counters to implement reference
counting, and it works such that once the counter falls to zero it can
be guaranteed that no other entity can be accessing the object:
static void obj_list_add(struct obj *obj)
static void obj_list_add(struct obj *obj, struct list_head *head)
{
obj->active = 1;
list_add(&obj->list);
list_add(&obj->list, head);
}
static void obj_list_del(struct obj *obj)

View File

@ -117,7 +117,7 @@ Using the pktcdvd debugfs interface
To read pktcdvd device infos in human readable form, do:
# cat /debug/pktcdvd/pktcdvd[0-7]/info
# cat /sys/kernel/debug/pktcdvd/pktcdvd[0-7]/info
For a description of the debugfs interface look into the file:

View File

@ -152,14 +152,19 @@ When swap is accounted, following files are added.
usage of mem+swap is limited by memsw.limit_in_bytes.
Note: why 'mem+swap' rather than swap.
* why 'mem+swap' rather than swap.
The global LRU(kswapd) can swap out arbitrary pages. Swap-out means
to move account from memory to swap...there is no change in usage of
mem+swap.
mem+swap. In other words, when we want to limit the usage of swap without
affecting global LRU, mem+swap limit is better than just limiting swap from
OS point of view.
In other words, when we want to limit the usage of swap without affecting
global LRU, mem+swap limit is better than just limiting swap from OS point
of view.
* What happens when a cgroup hits memory.memsw.limit_in_bytes
When a cgroup his memory.memsw.limit_in_bytes, it's useless to do swap-out
in this cgroup. Then, swap-out will not be done by cgroup routine and file
caches are dropped. But as mentioned above, global LRU can do swapout memory
from it for sanity of the system's memory management state. You can't forbid
it by cgroup.
2.5 Reclaim
@ -204,6 +209,7 @@ We can alter the memory limit:
NOTE: We can use a suffix (k, K, m, M, g or G) to indicate values in kilo,
mega or gigabytes.
NOTE: We can write "-1" to reset the *.limit_in_bytes(unlimited).
# cat /cgroups/0/memory.limit_in_bytes
4194304

View File

@ -41,6 +41,12 @@ void cn_test_callback(void *data)
msg->seq, msg->ack, msg->len, (char *)msg->data);
}
/*
* Do not remove this function even if no one is using it as
* this is an example of how to get notifications about new
* connector user registration
*/
#if 0
static int cn_test_want_notify(void)
{
struct cn_ctl_msg *ctl;
@ -117,6 +123,7 @@ nlmsg_failure:
kfree_skb(skb);
return -EINVAL;
}
#endif
static u32 cn_test_timer_counter;
static void cn_test_timer_func(unsigned long __data)

View File

@ -155,7 +155,7 @@ actual frequency must be determined using the following rules:
- if relation==CPUFREQ_REL_H, try to select a new_freq lower than or equal
target_freq. ("H for highest, but no higher than")
Here again the frequency table helper might assist you - see section 3
Here again the frequency table helper might assist you - see section 2
for details.

View File

@ -119,10 +119,6 @@ want the kernel to look at the CPU usage and to make decisions on
what to do about the frequency. Typically this is set to values of
around '10000' or more. It's default value is (cmp. with users-guide.txt):
transition_latency * 1000
The lowest value you can set is:
transition_latency * 100 or it may get restricted to a value where it
makes not sense for the kernel anymore to poll that often which depends
on your HZ config variable (HZ=1000: max=20000us, HZ=250: max=5000).
Be aware that transition latency is in ns and sampling_rate is in us, so you
get the same sysfs value by default.
Sampling rate should always get adjusted considering the transition latency
@ -131,14 +127,20 @@ in the bash (as said, 1000 is default), do:
echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) \
>ondemand/sampling_rate
show_sampling_rate_(min|max): THIS INTERFACE IS DEPRECATED, DON'T USE IT.
You can use wider ranges now and the general
cpuinfo_transition_latency variable (cmp. with user-guide.txt) can be
used to obtain exactly the same info:
show_sampling_rate_min = transtition_latency * 500 / 1000
show_sampling_rate_max = transtition_latency * 500000 / 1000
(divided by 1000 is to illustrate that sampling rate is in us and
transition latency is exported ns).
show_sampling_rate_min:
The sampling rate is limited by the HW transition latency:
transition_latency * 100
Or by kernel restrictions:
If CONFIG_NO_HZ is set, the limit is 10ms fixed.
If CONFIG_NO_HZ is not set or no_hz=off boot parameter is used, the
limits depend on the CONFIG_HZ option:
HZ=1000: min=20000us (20ms)
HZ=250: min=80000us (80ms)
HZ=100: min=200000us (200ms)
The highest value of kernel and HW latency restrictions is shown and
used as the minimum sampling rate.
show_sampling_rate_max: THIS INTERFACE IS DEPRECATED, DON'T USE IT.
up_threshold: defines what the average CPU usage between the samplings
of 'sampling_rate' needs to be for the kernel to make a decision on

View File

@ -31,7 +31,6 @@ Contents:
3. How to change the CPU cpufreq policy and/or speed
3.1 Preferred interface: sysfs
3.2 Deprecated interfaces

View File

@ -0,0 +1,54 @@
Device-Mapper Logging
=====================
The device-mapper logging code is used by some of the device-mapper
RAID targets to track regions of the disk that are not consistent.
A region (or portion of the address space) of the disk may be
inconsistent because a RAID stripe is currently being operated on or
a machine died while the region was being altered. In the case of
mirrors, a region would be considered dirty/inconsistent while you
are writing to it because the writes need to be replicated for all
the legs of the mirror and may not reach the legs at the same time.
Once all writes are complete, the region is considered clean again.
There is a generic logging interface that the device-mapper RAID
implementations use to perform logging operations (see
dm_dirty_log_type in include/linux/dm-dirty-log.h). Various different
logging implementations are available and provide different
capabilities. The list includes:
Type Files
==== =====
disk drivers/md/dm-log.c
core drivers/md/dm-log.c
userspace drivers/md/dm-log-userspace* include/linux/dm-log-userspace.h
The "disk" log type
-------------------
This log implementation commits the log state to disk. This way, the
logging state survives reboots/crashes.
The "core" log type
-------------------
This log implementation keeps the log state in memory. The log state
will not survive a reboot or crash, but there may be a small boost in
performance. This method can also be used if no storage device is
available for storing log state.
The "userspace" log type
------------------------
This log type simply provides a way to export the log API to userspace,
so log implementations can be done there. This is done by forwarding most
logging requests to userspace, where a daemon receives and processes the
request.
The structure used for communication between kernel and userspace are
located in include/linux/dm-log-userspace.h. Due to the frequency,
diversity, and 2-way communication nature of the exchanges between
kernel and userspace, 'connector' is used as the interface for
communication.
There are currently two userspace log implementations that leverage this
framework - "clustered_disk" and "clustered_core". These implementations
provide a cluster-coherent log for shared-storage. Device-mapper mirroring
can be used in a shared-storage environment when the cluster log implementations
are employed.

View File

@ -0,0 +1,39 @@
dm-queue-length
===============
dm-queue-length is a path selector module for device-mapper targets,
which selects a path with the least number of in-flight I/Os.
The path selector name is 'queue-length'.
Table parameters for each path: [<repeat_count>]
<repeat_count>: The number of I/Os to dispatch using the selected
path before switching to the next path.
If not given, internal default is used. To check
the default value, see the activated table.
Status for each path: <status> <fail-count> <in-flight>
<status>: 'A' if the path is active, 'F' if the path is failed.
<fail-count>: The number of path failures.
<in-flight>: The number of in-flight I/Os on the path.
Algorithm
=========
dm-queue-length increments/decrements 'in-flight' when an I/O is
dispatched/completed respectively.
dm-queue-length selects a path with the minimum 'in-flight'.
Examples
========
In case that 2 paths (sda and sdb) are used with repeat_count == 128.
# echo "0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128" \
dmsetup create test
#
# dmsetup table
test: 0 10 multipath 0 0 1 1 queue-length 0 2 1 8:0 128 8:16 128
#
# dmsetup status
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 1 8:0 A 0 0 8:16 A 0 0

View File

@ -0,0 +1,91 @@
dm-service-time
===============
dm-service-time is a path selector module for device-mapper targets,
which selects a path with the shortest estimated service time for
the incoming I/O.
The service time for each path is estimated by dividing the total size
of in-flight I/Os on a path with the performance value of the path.
The performance value is a relative throughput value among all paths
in a path-group, and it can be specified as a table argument.
The path selector name is 'service-time'.
Table parameters for each path: [<repeat_count> [<relative_throughput>]]
<repeat_count>: The number of I/Os to dispatch using the selected
path before switching to the next path.
If not given, internal default is used. To check
the default value, see the activated table.
<relative_throughput>: The relative throughput value of the path
among all paths in the path-group.
The valid range is 0-100.
If not given, minimum value '1' is used.
If '0' is given, the path isn't selected while
other paths having a positive value are available.
Status for each path: <status> <fail-count> <in-flight-size> \
<relative_throughput>
<status>: 'A' if the path is active, 'F' if the path is failed.
<fail-count>: The number of path failures.
<in-flight-size>: The size of in-flight I/Os on the path.
<relative_throughput>: The relative throughput value of the path
among all paths in the path-group.
Algorithm
=========
dm-service-time adds the I/O size to 'in-flight-size' when the I/O is
dispatched and substracts when completed.
Basically, dm-service-time selects a path having minimum service time
which is calculated by:
('in-flight-size' + 'size-of-incoming-io') / 'relative_throughput'
However, some optimizations below are used to reduce the calculation
as much as possible.
1. If the paths have the same 'relative_throughput', skip
the division and just compare the 'in-flight-size'.
2. If the paths have the same 'in-flight-size', skip the division
and just compare the 'relative_throughput'.
3. If some paths have non-zero 'relative_throughput' and others
have zero 'relative_throughput', ignore those paths with zero
'relative_throughput'.
If such optimizations can't be applied, calculate service time, and
compare service time.
If calculated service time is equal, the path having maximum
'relative_throughput' may be better. So compare 'relative_throughput'
then.
Examples
========
In case that 2 paths (sda and sdb) are used with repeat_count == 128
and sda has an average throughput 1GB/s and sdb has 4GB/s,
'relative_throughput' value may be '1' for sda and '4' for sdb.
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4" \
dmsetup create test
#
# dmsetup table
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 1 8:16 128 4
#
# dmsetup status
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 1 8:16 A 0 0 4
Or '2' for sda and '8' for sdb would be also true.
# echo "0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8" \
dmsetup create test
#
# dmsetup table
test: 0 10 multipath 0 0 1 1 service-time 0 2 2 8:0 128 2 8:16 128 8
#
# dmsetup status
test: 0 10 multipath 2 0 0 0 1 1 E 0 2 2 8:0 A 0 0 2 8:16 A 0 0 8

View File

@ -162,3 +162,35 @@ device_remove_file(dev,&dev_attr_power);
The file name will be 'power' with a mode of 0644 (-rw-r--r--).
Word of warning: While the kernel allows device_create_file() and
device_remove_file() to be called on a device at any time, userspace has
strict expectations on when attributes get created. When a new device is
registered in the kernel, a uevent is generated to notify userspace (like
udev) that a new device is available. If attributes are added after the
device is registered, then userspace won't get notified and userspace will
not know about the new attributes.
This is important for device driver that need to publish additional
attributes for a device at driver probe time. If the device driver simply
calls device_create_file() on the device structure passed to it, then
userspace will never be notified of the new attributes. Instead, it should
probably use class_create() and class->dev_attrs to set up a list of
desired attributes in the modules_init function, and then in the .probe()
hook, and then use device_create() to create a new device as a child
of the probed device. The new device will generate a new uevent and
properly advertise the new attributes to userspace.
For example, if a driver wanted to add the following attributes:
struct device_attribute mydriver_attribs[] = {
__ATTR(port_count, 0444, port_count_show),
__ATTR(serial_number, 0444, serial_number_show),
NULL
};
Then in the module init function is would do:
mydriver_class = class_create(THIS_MODULE, "my_attrs");
mydriver_class.dev_attr = mydriver_attribs;
And assuming 'dev' is the struct device passed into the probe hook, the driver
probe function would do something like:
create_device(&mydriver_class, dev, chrdev, &private_data, "my_name");

View File

@ -112,7 +112,7 @@ sub tda10045 {
sub tda10046 {
my $sourcefile = "TT_PCI_2.19h_28_11_2006.zip";
my $url = "http://technotrend-online.com/download/software/219/$sourcefile";
my $url = "http://www.tt-download.com/download/updates/219/$sourcefile";
my $hash = "6a7e1e2f2644b162ff0502367553c72d";
my $outfile = "dvb-fe-tda10046.fw";
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
@ -129,8 +129,8 @@ sub tda10046 {
}
sub tda10046lifeview {
my $sourcefile = "Drv_2.11.02.zip";
my $url = "http://www.lifeview.com.tw/drivers/pci_card/FlyDVB-T/$sourcefile";
my $sourcefile = "7%5Cdrv_2.11.02.zip";
my $url = "http://www.lifeview.hk/dbimages/document/$sourcefile";
my $hash = "1ea24dee4eea8fe971686981f34fd2e0";
my $outfile = "dvb-fe-tda10046.fw";
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);
@ -317,7 +317,7 @@ sub nxt2002 {
sub nxt2004 {
my $sourcefile = "AVerTVHD_MCE_A180_Drv_v1.2.2.16.zip";
my $url = "http://www.aver.com/support/Drivers/$sourcefile";
my $url = "http://www.avermedia-usa.com/support/Drivers/$sourcefile";
my $hash = "111cb885b1e009188346d72acfed024c";
my $outfile = "dvb-fe-nxt2004.fw";
my $tmpdir = tempdir(DIR => "/tmp", CLEANUP => 1);

View File

@ -29,16 +29,16 @@ o debugfs entries
fault-inject-debugfs kernel module provides some debugfs entries for runtime
configuration of fault-injection capabilities.
- /debug/fail*/probability:
- /sys/kernel/debug/fail*/probability:
likelihood of failure injection, in percent.
Format: <percent>
Note that one-failure-per-hundred is a very high error rate
for some testcases. Consider setting probability=100 and configure
/debug/fail*/interval for such testcases.
/sys/kernel/debug/fail*/interval for such testcases.
- /debug/fail*/interval:
- /sys/kernel/debug/fail*/interval:
specifies the interval between failures, for calls to
should_fail() that pass all the other tests.
@ -46,18 +46,18 @@ configuration of fault-injection capabilities.
Note that if you enable this, by setting interval>1, you will
probably want to set probability=100.
- /debug/fail*/times:
- /sys/kernel/debug/fail*/times:
specifies how many times failures may happen at most.
A value of -1 means "no limit".
- /debug/fail*/space:
- /sys/kernel/debug/fail*/space:
specifies an initial resource "budget", decremented by "size"
on each call to should_fail(,size). Failure injection is
suppressed until "space" reaches zero.
- /debug/fail*/verbose
- /sys/kernel/debug/fail*/verbose
Format: { 0 | 1 | 2 }
specifies the verbosity of the messages when failure is
@ -65,17 +65,17 @@ configuration of fault-injection capabilities.
log line per failure; '2' will print a call trace too -- useful
to debug the problems revealed by fault injection.
- /debug/fail*/task-filter:
- /sys/kernel/debug/fail*/task-filter:
Format: { 'Y' | 'N' }
A value of 'N' disables filtering by process (default).
Any positive value limits failures to only processes indicated by
/proc/<pid>/make-it-fail==1.
- /debug/fail*/require-start:
- /debug/fail*/require-end:
- /debug/fail*/reject-start:
- /debug/fail*/reject-end:
- /sys/kernel/debug/fail*/require-start:
- /sys/kernel/debug/fail*/require-end:
- /sys/kernel/debug/fail*/reject-start:
- /sys/kernel/debug/fail*/reject-end:
specifies the range of virtual addresses tested during
stacktrace walking. Failure is injected only if some caller
@ -84,26 +84,26 @@ configuration of fault-injection capabilities.
Default required range is [0,ULONG_MAX) (whole of virtual address space).
Default rejected range is [0,0).
- /debug/fail*/stacktrace-depth:
- /sys/kernel/debug/fail*/stacktrace-depth:
specifies the maximum stacktrace depth walked during search
for a caller within [require-start,require-end) OR
[reject-start,reject-end).
- /debug/fail_page_alloc/ignore-gfp-highmem:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-highmem:
Format: { 'Y' | 'N' }
default is 'N', setting it to 'Y' won't inject failures into
highmem/user allocations.
- /debug/failslab/ignore-gfp-wait:
- /debug/fail_page_alloc/ignore-gfp-wait:
- /sys/kernel/debug/failslab/ignore-gfp-wait:
- /sys/kernel/debug/fail_page_alloc/ignore-gfp-wait:
Format: { 'Y' | 'N' }
default is 'N', setting it to 'Y' will inject failures
only into non-sleep allocations (GFP_ATOMIC allocations).
- /debug/fail_page_alloc/min-order:
- /sys/kernel/debug/fail_page_alloc/min-order:
specifies the minimum page allocation order to be injected
failures.
@ -166,13 +166,13 @@ o Inject slab allocation failures into module init/exit code
#!/bin/bash
FAILTYPE=failslab
echo Y > /debug/$FAILTYPE/task-filter
echo 10 > /debug/$FAILTYPE/probability
echo 100 > /debug/$FAILTYPE/interval
echo -1 > /debug/$FAILTYPE/times
echo 0 > /debug/$FAILTYPE/space
echo 2 > /debug/$FAILTYPE/verbose
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
echo Y > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
faulty_system()
{
@ -217,20 +217,20 @@ then
exit 1
fi
cat /sys/module/$module/sections/.text > /debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /debug/$FAILTYPE/require-end
cat /sys/module/$module/sections/.text > /sys/kernel/debug/$FAILTYPE/require-start
cat /sys/module/$module/sections/.data > /sys/kernel/debug/$FAILTYPE/require-end
echo N > /debug/$FAILTYPE/task-filter
echo 10 > /debug/$FAILTYPE/probability
echo 100 > /debug/$FAILTYPE/interval
echo -1 > /debug/$FAILTYPE/times
echo 0 > /debug/$FAILTYPE/space
echo 2 > /debug/$FAILTYPE/verbose
echo 1 > /debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /debug/$FAILTYPE/stacktrace-depth
echo N > /sys/kernel/debug/$FAILTYPE/task-filter
echo 10 > /sys/kernel/debug/$FAILTYPE/probability
echo 100 > /sys/kernel/debug/$FAILTYPE/interval
echo -1 > /sys/kernel/debug/$FAILTYPE/times
echo 0 > /sys/kernel/debug/$FAILTYPE/space
echo 2 > /sys/kernel/debug/$FAILTYPE/verbose
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-wait
echo 1 > /sys/kernel/debug/$FAILTYPE/ignore-gfp-highmem
echo 10 > /sys/kernel/debug/$FAILTYPE/stacktrace-depth
trap "echo 0 > /debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
trap "echo 0 > /sys/kernel/debug/$FAILTYPE/probability" SIGINT SIGTERM EXIT
echo "Injecting errors into the module $module... (interrupt to stop)"
sleep 1000000

View File

@ -95,7 +95,7 @@ There is no way to change the vesafb video mode and/or timings after
booting linux. If you are not happy with the 60 Hz refresh rate, you
have these options:
* configure and load the DOS-Tools for your the graphics board (if
* configure and load the DOS-Tools for the graphics board (if
available) and boot linux with loadlin.
* use a native driver (matroxfb/atyfb) instead if vesafb. If none
is available, write a new one!

View File

@ -6,6 +6,20 @@ be removed from this file.
---------------------------
What: IRQF_SAMPLE_RANDOM
Check: IRQF_SAMPLE_RANDOM
When: July 2009
Why: Many of IRQF_SAMPLE_RANDOM users are technically bogus as entropy
sources in the kernel's current entropy model. To resolve this, every
input point to the kernel's entropy pool needs to better document the
type of entropy source it actually is. This will be replaced with
additional add_*_randomness functions in drivers/char/random.c
Who: Robin Getz <rgetz@blackfin.uclinux.org> & Matt Mackall <mpm@selenic.com>
---------------------------
What: The ieee80211_regdom module parameter
When: March 2010 / desktop catchup
@ -354,16 +368,6 @@ Who: Krzysztof Piotr Oledzki <ole@ans.pl>
---------------------------
What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client(),
i2c_adapter->client_register(), i2c_adapter->client_unregister
When: 2.6.30
Check: i2c_attach_client i2c_detach_client
Why: Deprecated by the new (standard) device driver binding model. Use
i2c_driver->probe() and ->remove() instead.
Who: Jean Delvare <khali@linux-fr.org>
---------------------------
What: fscher and fscpos drivers
When: June 2009
Why: Deprecated by the new fschmd driver.
@ -438,6 +442,13 @@ Why: Superseded by tdfxfb. I2C/DDC support used to live in a separate
Who: Jean Delvare <khali@linux-fr.org>
Krzysztof Helt <krzysztof.h1@wp.pl>
---------------------------
What: CONFIG_RFKILL_INPUT
When: 2.6.33
Why: Should be implemented in userspace, policy daemon.
Who: Johannes Berg <johannes@sipsolutions.net>
----------------------------
What: CONFIG_X86_OLD_MCE

View File

@ -66,6 +66,10 @@ mandatory-locking.txt
- info on the Linux implementation of Sys V mandatory file locking.
ncpfs.txt
- info on Novell Netware(tm) filesystem using NCP protocol.
nfs41-server.txt
- info on the Linux server implementation of NFSv4 minor version 1.
nfs-rdma.txt
- how to install and setup the Linux NFS/RDMA client and server software.
nfsroot.txt
- short guide on setting up a diskless box with NFS root filesystem.
nilfs2.txt

View File

@ -109,27 +109,28 @@ prototypes:
locking rules:
All may block.
BKL s_lock s_umount
alloc_inode: no no no
destroy_inode: no
dirty_inode: no (must not sleep)
write_inode: no
drop_inode: no !!!inode_lock!!!
delete_inode: no
put_super: yes yes no
write_super: no yes read
sync_fs: no no read
freeze_fs: ?
unfreeze_fs: ?
statfs: no no no
remount_fs: yes yes maybe (see below)
clear_inode: no
umount_begin: yes no no
show_options: no (vfsmount->sem)
quota_read: no no no (see below)
quota_write: no no no (see below)
None have BKL
s_umount
alloc_inode:
destroy_inode:
dirty_inode: (must not sleep)
write_inode:
drop_inode: !!!inode_lock!!!
delete_inode:
put_super: write
write_super: read
sync_fs: read
freeze_fs: read
unfreeze_fs: read
statfs: no
remount_fs: maybe (see below)
clear_inode:
umount_begin: no
show_options: no (namespace_sem)
quota_read: no (see below)
quota_write: no (see below)
->remount_fs() will have the s_umount lock if it's already mounted.
->remount_fs() will have the s_umount exclusive lock if it's already mounted.
When called from get_sb_single, it does NOT have the s_umount lock.
->quota_read() and ->quota_write() functions are both guaranteed to
be the only ones operating on the quota file by the quota code (via
@ -187,7 +188,7 @@ readpages: no
write_begin: no locks the page yes
write_end: no yes, unlocks yes
perform_write: no n/a yes
bmap: yes
bmap: no
invalidatepage: no yes
releasepage: no yes
direct_IO: no

View File

@ -322,7 +322,7 @@ an upper limit on the block size imposed by the page size of the kernel,
so 8kB blocks are only allowed on Alpha systems (and other architectures
which support larger pages).
There is an upper limit of 32768 subdirectories in a single directory.
There is an upper limit of 32000 subdirectories in a single directory.
There is a "soft" upper limit of about 10-15k files in a single directory
with the current linear linked-list directory implementation. This limit

View File

@ -235,6 +235,10 @@ minixdf Make 'df' act like Minix.
debug Extra debugging information is sent to syslog.
abort Simulate the effects of calling ext4_abort() for
debugging purposes. This is normally used while
remounting a filesystem which is already mounted.
errors=remount-ro Remount the filesystem read-only on an error.
errors=continue Keep going on a filesystem error.
errors=panic Panic and halt the machine if an error occurs.

View File

@ -23,8 +23,13 @@ Mount options unique to the isofs filesystem.
map=off Do not map non-Rock Ridge filenames to lower case
map=normal Map non-Rock Ridge filenames to lower case
map=acorn As map=normal but also apply Acorn extensions if present
mode=xxx Sets the permissions on files to xxx
dmode=xxx Sets the permissions on directories to xxx
mode=xxx Sets the permissions on files to xxx unless Rock Ridge
extensions set the permissions otherwise
dmode=xxx Sets the permissions on directories to xxx unless Rock Ridge
extensions set the permissions otherwise
overriderockperm Set permissions on files and directories according to
'mode' and 'dmode' even though Rock Ridge extensions are
present.
nojoliet Ignore Joliet extensions if they are present.
norock Ignore Rock Ridge extensions if they are present.
hide Completely strip hidden files from the file system.

View File

@ -39,9 +39,8 @@ Features which NILFS2 does not support yet:
- extended attributes
- POSIX ACLs
- quotas
- writable snapshots
- remote backup (CDP)
- data integrity
- fsck
- resize
- defragmentation
Mount options

View File

@ -5,11 +5,12 @@
Bodo Bauer <bb@ricochet.net>
2.4.x update Jorge Nerin <comandante@zaralinux.com> November 14 2000
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
move /proc/sys Shen Feng <shen@cn.fujitsu.com> April 1 2009
------------------------------------------------------------------------------
Version 1.3 Kernel version 2.2.12
Kernel version 2.4.0-test11-pre4
------------------------------------------------------------------------------
fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009
Table of Contents
-----------------
@ -116,7 +117,7 @@ The link self points to the process reading the file system. Each process
subdirectory has the entries listed in Table 1-1.
Table 1-1: Process specific entries in /proc
Table 1-1: Process specific entries in /proc
..............................................................................
File Content
clear_refs Clears page referenced bits shown in smaps output
@ -134,46 +135,103 @@ Table 1-1: Process specific entries in /proc
status Process status in human readable form
wchan If CONFIG_KALLSYMS is set, a pre-decoded wchan
stack Report full stack trace, enable via CONFIG_STACKTRACE
smaps Extension based on maps, the rss size for each mapped file
smaps a extension based on maps, showing the memory consumption of
each mapping
..............................................................................
For example, to get the status information of a process, all you have to do is
read the file /proc/PID/status:
>cat /proc/self/status
Name: cat
State: R (running)
Pid: 5452
PPid: 743
>cat /proc/self/status
Name: cat
State: R (running)
Tgid: 5452
Pid: 5452
PPid: 743
TracerPid: 0 (2.4)
Uid: 501 501 501 501
Gid: 100 100 100 100
Groups: 100 14 16
VmSize: 1112 kB
VmLck: 0 kB
VmRSS: 348 kB
VmData: 24 kB
VmStk: 12 kB
VmExe: 8 kB
VmLib: 1044 kB
SigPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 00000000fffffeff
CapPrm: 0000000000000000
CapEff: 0000000000000000
Uid: 501 501 501 501
Gid: 100 100 100 100
FDSize: 256
Groups: 100 14 16
VmPeak: 5004 kB
VmSize: 5004 kB
VmLck: 0 kB
VmHWM: 476 kB
VmRSS: 476 kB
VmData: 156 kB
VmStk: 88 kB
VmExe: 68 kB
VmLib: 1412 kB
VmPTE: 20 kb
Threads: 1
SigQ: 0/28578
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000000000
SigCgt: 0000000000000000
CapInh: 00000000fffffeff
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
This shows you nearly the same information you would get if you viewed it with
the ps command. In fact, ps uses the proc file system to obtain its
information. The statm file contains more detailed information about the
process memory usage. Its seven fields are explained in Table 1-2. The stat
file contains details information about the process itself. Its fields are
explained in Table 1-3.
information. But you get a more detailed view of the process by reading the
file /proc/PID/status. It fields are described in table 1-2.
The statm file contains more detailed information about the process
memory usage. Its seven fields are explained in Table 1-3. The stat file
contains details information about the process itself. Its fields are
explained in Table 1-4.
Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
Table 1-2: Contents of the statm files (as of 2.6.30-rc7)
..............................................................................
Field Content
Name filename of the executable
State state (R is running, S is sleeping, D is sleeping
in an uninterruptible wait, Z is zombie,
T is traced or stopped)
Tgid thread group ID
Pid process id
PPid process id of the parent process
TracerPid PID of process tracing this process (0 if not)
Uid Real, effective, saved set, and file system UIDs
Gid Real, effective, saved set, and file system GIDs
FDSize number of file descriptor slots currently allocated
Groups supplementary group list
VmPeak peak virtual memory size
VmSize total program size
VmLck locked memory size
VmHWM peak resident set size ("high water mark")
VmRSS size of memory portions
VmData size of data, stack, and text segments
VmStk size of data, stack, and text segments
VmExe size of text segment
VmLib size of shared library code
VmPTE size of page table entries
Threads number of threads
SigQ number of signals queued/max. number for queue
SigPnd bitmap of pending signals for the thread
ShdPnd bitmap of shared pending signals for the process
SigBlk bitmap of blocked signals
SigIgn bitmap of ignored signals
SigCgt bitmap of catched signals
CapInh bitmap of inheritable capabilities
CapPrm bitmap of permitted capabilities
CapEff bitmap of effective capabilities
CapBnd bitmap of capabilities bounding set
Cpus_allowed mask of CPUs on which this process may run
Cpus_allowed_list Same as previous, but in "list format"
Mems_allowed mask of memory nodes allowed to this process
Mems_allowed_list Same as previous, but in "list format"
voluntary_ctxt_switches number of voluntary context switches
nonvoluntary_ctxt_switches number of non voluntary context switches
..............................................................................
Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
Field Content
size total program size (pages) (same as VmSize in status)
@ -188,7 +246,7 @@ Table 1-2: Contents of the statm files (as of 2.6.8-rc3)
..............................................................................
Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
Table 1-4: Contents of the stat files (as of 2.6.30-rc7)
..............................................................................
Field Content
pid process id
@ -222,10 +280,10 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
start_stack address of the start of the stack
esp current value of ESP
eip current value of EIP
pending bitmap of pending signals (obsolete)
blocked bitmap of blocked signals (obsolete)
sigign bitmap of ignored signals (obsolete)
sigcatch bitmap of catched signals (obsolete)
pending bitmap of pending signals
blocked bitmap of blocked signals
sigign bitmap of ignored signals
sigcatch bitmap of catched signals
wchan address where process went to sleep
0 (place holder)
0 (place holder)
@ -234,19 +292,99 @@ Table 1-3: Contents of the stat files (as of 2.6.22-rc3)
rt_priority realtime priority
policy scheduling policy (man sched_setscheduler)
blkio_ticks time spent waiting for block IO
gtime guest time of the task in jiffies
cgtime guest time of the task children in jiffies
..............................................................................
The /proc/PID/map file containing the currently mapped memory regions and
their access permissions.
The format is:
address perms offset dev inode pathname
08048000-08049000 r-xp 00000000 03:00 8312 /opt/test
08049000-0804a000 rw-p 00001000 03:00 8312 /opt/test
0804a000-0806b000 rw-p 00000000 00:00 0 [heap]
a7cb1000-a7cb2000 ---p 00000000 00:00 0
a7cb2000-a7eb2000 rw-p 00000000 00:00 0
a7eb2000-a7eb3000 ---p 00000000 00:00 0
a7eb3000-a7ed5000 rw-p 00000000 00:00 0
a7ed5000-a8008000 r-xp 00000000 03:00 4222 /lib/libc.so.6
a8008000-a800a000 r--p 00133000 03:00 4222 /lib/libc.so.6
a800a000-a800b000 rw-p 00135000 03:00 4222 /lib/libc.so.6
a800b000-a800e000 rw-p 00000000 00:00 0
a800e000-a8022000 r-xp 00000000 03:00 14462 /lib/libpthread.so.0
a8022000-a8023000 r--p 00013000 03:00 14462 /lib/libpthread.so.0
a8023000-a8024000 rw-p 00014000 03:00 14462 /lib/libpthread.so.0
a8024000-a8027000 rw-p 00000000 00:00 0
a8027000-a8043000 r-xp 00000000 03:00 8317 /lib/ld-linux.so.2
a8043000-a8044000 r--p 0001b000 03:00 8317 /lib/ld-linux.so.2
a8044000-a8045000 rw-p 0001c000 03:00 8317 /lib/ld-linux.so.2
aff35000-aff4a000 rw-p 00000000 00:00 0 [stack]
ffffe000-fffff000 r-xp 00000000 00:00 0 [vdso]
where "address" is the address space in the process that it occupies, "perms"
is a set of permissions:
r = read
w = write
x = execute
s = shared
p = private (copy on write)
"offset" is the offset into the mapping, "dev" is the device (major:minor), and
"inode" is the inode on that device. 0 indicates that no inode is associated
with the memory region, as the case would be with BSS (uninitialized data).
The "pathname" shows the name associated file for this mapping. If the mapping
is not associated with a file:
[heap] = the heap of the program
[stack] = the stack of the main process
[vdso] = the "virtual dynamic shared object",
the kernel system call handler
or if empty, the mapping is anonymous.
The /proc/PID/smaps is an extension based on maps, showing the memory
consumption for each of the process's mappings. For each of mappings there
is a series of lines such as the following:
08048000-080bc000 r-xp 00000000 03:02 13130 /bin/bash
Size: 1084 kB
Rss: 892 kB
Pss: 374 kB
Shared_Clean: 892 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
Referenced: 892 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
The first of these lines shows the same information as is displayed for the
mapping in /proc/PID/maps. The remaining lines show the size of the mapping,
the amount of the mapping that is currently resident in RAM, the "proportional
set size” (divide each shared page by the number of processes sharing it), the
number of clean and dirty shared pages in the mapping, and the number of clean
and dirty private pages in the mapping. The "Referenced" indicates the amount
of memory currently marked as referenced or accessed.
This file is only present if the CONFIG_MMU kernel configuration option is
enabled.
1.2 Kernel data
---------------
Similar to the process entries, the kernel data files give information about
the running kernel. The files used to obtain this information are contained in
/proc and are listed in Table 1-4. Not all of these will be present in your
/proc and are listed in Table 1-5. Not all of these will be present in your
system. It depends on the kernel configuration and the loaded modules, which
files are there, and which are missing.
Table 1-4: Kernel info in /proc
Table 1-5: Kernel info in /proc
..............................................................................
File Content
apm Advanced power management info
@ -283,6 +421,7 @@ Table 1-4: Kernel info in /proc
rtc Real time clock
scsi SCSI info (see text)
slabinfo Slab pool info
softirqs softirq usage
stat Overall statistics
swaps Swap space utilization
sys See chapter 2
@ -597,6 +736,25 @@ on the kind of area :
0xffffffffa0017000-0xffffffffa0022000 45056 sys_init_module+0xc27/0x1d00 ...
pages=10 vmalloc N0=10
..............................................................................
softirqs:
Provides counts of softirq handlers serviced since boot time, for each cpu.
> cat /proc/softirqs
CPU0 CPU1 CPU2 CPU3
HI: 0 0 0 0
TIMER: 27166 27120 27097 27034
NET_TX: 0 0 0 17
NET_RX: 42 0 0 39
BLOCK: 0 0 107 1121
TASKLET: 0 0 0 290
SCHED: 27035 26983 26971 26746
HRTIMER: 0 0 0 0
RCU: 1678 1769 2178 2250
1.3 IDE devices in /proc/ide
----------------------------
@ -614,10 +772,10 @@ IDE devices:
More detailed information can be found in the controller specific
subdirectories. These are named ide0, ide1 and so on. Each of these
directories contains the files shown in table 1-5.
directories contains the files shown in table 1-6.
Table 1-5: IDE controller info in /proc/ide/ide?
Table 1-6: IDE controller info in /proc/ide/ide?
..............................................................................
File Content
channel IDE channel (0 or 1)
@ -627,11 +785,11 @@ Table 1-5: IDE controller info in /proc/ide/ide?
..............................................................................
Each device connected to a controller has a separate subdirectory in the
controllers directory. The files listed in table 1-6 are contained in these
controllers directory. The files listed in table 1-7 are contained in these
directories.
Table 1-6: IDE device information
Table 1-7: IDE device information
..............................................................................
File Content
cache The cache
@ -673,12 +831,12 @@ the drive parameters:
1.4 Networking info in /proc/net
--------------------------------
The subdirectory /proc/net follows the usual pattern. Table 1-6 shows the
The subdirectory /proc/net follows the usual pattern. Table 1-8 shows the
additional values you get for IP version 6 if you configure the kernel to
support this. Table 1-7 lists the files and their meaning.
support this. Table 1-9 lists the files and their meaning.
Table 1-6: IPv6 info in /proc/net
Table 1-8: IPv6 info in /proc/net
..............................................................................
File Content
udp6 UDP sockets (IPv6)
@ -693,7 +851,7 @@ Table 1-6: IPv6 info in /proc/net
..............................................................................
Table 1-7: Network info in /proc/net
Table 1-9: Network info in /proc/net
..............................................................................
File Content
arp Kernel ARP table
@ -817,10 +975,10 @@ The directory /proc/parport contains information about the parallel ports of
your system. It has one subdirectory for each port, named after the port
number (0,1,2,...).
These directories contain the four files shown in Table 1-8.
These directories contain the four files shown in Table 1-10.
Table 1-8: Files in /proc/parport
Table 1-10: Files in /proc/parport
..............................................................................
File Content
autoprobe Any IEEE-1284 device ID information that has been acquired.
@ -838,10 +996,10 @@ Table 1-8: Files in /proc/parport
Information about the available and actually used tty's can be found in the
directory /proc/tty.You'll find entries for drivers and line disciplines in
this directory, as shown in Table 1-9.
this directory, as shown in Table 1-11.
Table 1-9: Files in /proc/tty
Table 1-11: Files in /proc/tty
..............................................................................
File Content
drivers list of drivers and their usage
@ -883,6 +1041,7 @@ since the system first booted. For a quick look, simply cat the file:
processes 2915
procs_running 1
procs_blocked 0
softirq 183433 0 21755 12 39 1137 231 21459 2263
The very first "cpu" line aggregates the numbers in all of the other "cpuN"
lines. These numbers identify the amount of time the CPU has spent performing
@ -918,6 +1077,11 @@ CPUs.
The "procs_blocked" line gives the number of processes currently blocked,
waiting for I/O to complete.
The "softirq" line gives counts of softirqs serviced since boot time, for each
of the possible system softirqs. The first column is the total of all
softirqs serviced; each subsequent column is the total for that particular
softirq.
1.9 Ext4 file system parameters
------------------------------
@ -926,9 +1090,9 @@ Information about mounted ext4 file systems can be found in
/proc/fs/ext4. Each mounted filesystem will have a directory in
/proc/fs/ext4 based on its device name (i.e., /proc/fs/ext4/hdc or
/proc/fs/ext4/dm-0). The files in each per-device directory are shown
in Table 1-10, below.
in Table 1-12, below.
Table 1-10: Files in /proc/fs/ext4/<devname>
Table 1-12: Files in /proc/fs/ext4/<devname>
..............................................................................
File Content
mb_groups details of multiblock allocator buddy cache of free blocks
@ -1003,11 +1167,13 @@ CHAPTER 3: PER-PROCESS PARAMETERS
3.1 /proc/<pid>/oom_adj - Adjust the oom-killer score
------------------------------------------------------
This file can be used to adjust the score used to select which processes
should be killed in an out-of-memory situation. Giving it a high score will
increase the likelihood of this process being killed by the oom-killer. Valid
values are in the range -16 to +15, plus the special value -17, which disables
oom-killing altogether for this process.
This file can be used to adjust the score used to select which processes should
be killed in an out-of-memory situation. The oom_adj value is a characteristic
of the task's mm, so all threads that share an mm with pid will have the same
oom_adj value. A high value will increase the likelihood of this process being
killed by the oom-killer. Valid values are in the range -16 to +15 as
explained below and a special value of -17, which disables oom-killing
altogether for threads sharing pid's mm.
The process to be killed in an out-of-memory situation is selected among all others
based on its badness score. This value equals the original memory size of the process
@ -1021,6 +1187,9 @@ the parent's score if they do not share the same memory. Thus forking servers
are the prime candidates to be killed. Having only one 'hungry' child will make
parent less preferable than the child.
/proc/<pid>/oom_adj cannot be changed for kthreads since they are immune from
oom-killing already.
/proc/<pid>/oom_score shows process' current badness score.
The following heuristics are then applied:

View File

@ -132,6 +132,11 @@ rodir -- FAT has the ATTR_RO (read-only) attribute. On Windows,
If you want to use ATTR_RO as read-only flag even for
the directory, set this option.
errors=panic|continue|remount-ro
-- specify FAT behavior on critical errors: panic, continue
without doing anything or remount the partition in
read-only mode (default behavior).
<bool>: 0,1,yes,no,true,false
TODO

View File

@ -77,7 +77,8 @@
seconds for the whole load operation.
- request_firmware_nowait() is also provided for convenience in
non-user contexts.
user contexts to request firmware asynchronously, but can't be called
in atomic contexts.
about in-kernel persistence:

246
Documentation/gcov.txt Normal file
View File

@ -0,0 +1,246 @@
Using gcov with the Linux kernel
================================
1. Introduction
2. Preparation
3. Customization
4. Files
5. Modules
6. Separated build and test machines
7. Troubleshooting
Appendix A: sample script: gather_on_build.sh
Appendix B: sample script: gather_on_test.sh
1. Introduction
===============
gcov profiling kernel support enables the use of GCC's coverage testing
tool gcov [1] with the Linux kernel. Coverage data of a running kernel
is exported in gcov-compatible format via the "gcov" debugfs directory.
To get coverage data for a specific file, change to the kernel build
directory and use gcov with the -o option as follows (requires root):
# cd /tmp/linux-out
# gcov -o /sys/kernel/debug/gcov/tmp/linux-out/kernel spinlock.c
This will create source code files annotated with execution counts
in the current directory. In addition, graphical gcov front-ends such
as lcov [2] can be used to automate the process of collecting data
for the entire kernel and provide coverage overviews in HTML format.
Possible uses:
* debugging (has this line been reached at all?)
* test improvement (how do I change my test to cover these lines?)
* minimizing kernel configurations (do I need this option if the
associated code is never run?)
--
[1] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html
[2] http://ltp.sourceforge.net/coverage/lcov.php
2. Preparation
==============
Configure the kernel with:
CONFIG_DEBUGFS=y
CONFIG_GCOV_KERNEL=y
and to get coverage data for the entire kernel:
CONFIG_GCOV_PROFILE_ALL=y
Note that kernels compiled with profiling flags will be significantly
larger and run slower. Also CONFIG_GCOV_PROFILE_ALL may not be supported
on all architectures.
Profiling data will only become accessible once debugfs has been
mounted:
mount -t debugfs none /sys/kernel/debug
3. Customization
================
To enable profiling for specific files or directories, add a line
similar to the following to the respective kernel Makefile:
For a single file (e.g. main.o):
GCOV_PROFILE_main.o := y
For all files in one directory:
GCOV_PROFILE := y
To exclude files from being profiled even when CONFIG_GCOV_PROFILE_ALL
is specified, use:
GCOV_PROFILE_main.o := n
and:
GCOV_PROFILE := n
Only files which are linked to the main kernel image or are compiled as
kernel modules are supported by this mechanism.
4. Files
========
The gcov kernel support creates the following files in debugfs:
/sys/kernel/debug/gcov
Parent directory for all gcov-related files.
/sys/kernel/debug/gcov/reset
Global reset file: resets all coverage data to zero when
written to.
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcda
The actual gcov data file as understood by the gcov
tool. Resets file coverage data to zero when written to.
/sys/kernel/debug/gcov/path/to/compile/dir/file.gcno
Symbolic link to a static data file required by the gcov
tool. This file is generated by gcc when compiling with
option -ftest-coverage.
5. Modules
==========
Kernel modules may contain cleanup code which is only run during
module unload time. The gcov mechanism provides a means to collect
coverage data for such code by keeping a copy of the data associated
with the unloaded module. This data remains available through debugfs.
Once the module is loaded again, the associated coverage counters are
initialized with the data from its previous instantiation.
This behavior can be deactivated by specifying the gcov_persist kernel
parameter:
gcov_persist=0
At run-time, a user can also choose to discard data for an unloaded
module by writing to its data file or the global reset file.
6. Separated build and test machines
====================================
The gcov kernel profiling infrastructure is designed to work out-of-the
box for setups where kernels are built and run on the same machine. In
cases where the kernel runs on a separate machine, special preparations
must be made, depending on where the gcov tool is used:
a) gcov is run on the TEST machine
The gcov tool version on the test machine must be compatible with the
gcc version used for kernel build. Also the following files need to be
copied from build to test machine:
from the source tree:
- all C source files + headers
from the build tree:
- all C source files + headers
- all .gcda and .gcno files
- all links to directories
It is important to note that these files need to be placed into the
exact same file system location on the test machine as on the build
machine. If any of the path components is symbolic link, the actual
directory needs to be used instead (due to make's CURDIR handling).
b) gcov is run on the BUILD machine
The following files need to be copied after each test case from test
to build machine:
from the gcov directory in sysfs:
- all .gcda files
- all links to .gcno files
These files can be copied to any location on the build machine. gcov
must then be called with the -o option pointing to that directory.
Example directory setup on the build machine:
/tmp/linux: kernel source tree
/tmp/out: kernel build directory as specified by make O=
/tmp/coverage: location of the files copied from the test machine
[user@build] cd /tmp/out
[user@build] gcov -o /tmp/coverage/tmp/out/init main.c
7. Troubleshooting
==================
Problem: Compilation aborts during linker step.
Cause: Profiling flags are specified for source files which are not
linked to the main kernel or which are linked by a custom
linker procedure.
Solution: Exclude affected source files from profiling by specifying
GCOV_PROFILE := n or GCOV_PROFILE_basename.o := n in the
corresponding Makefile.
Appendix A: gather_on_build.sh
==============================
Sample script to gather coverage meta files on the build machine
(see 6a):
#!/bin/bash
KSRC=$1
KOBJ=$2
DEST=$3
if [ -z "$KSRC" ] || [ -z "$KOBJ" ] || [ -z "$DEST" ]; then
echo "Usage: $0 <ksrc directory> <kobj directory> <output.tar.gz>" >&2
exit 1
fi
KSRC=$(cd $KSRC; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
KOBJ=$(cd $KOBJ; printf "all:\n\t@echo \${CURDIR}\n" | make -f -)
find $KSRC $KOBJ \( -name '*.gcno' -o -name '*.[ch]' -o -type l \) -a \
-perm /u+r,g+r | tar cfz $DEST -P -T -
if [ $? -eq 0 ] ; then
echo "$DEST successfully created, copy to test system and unpack with:"
echo " tar xfz $DEST -P"
else
echo "Could not create file $DEST"
fi
Appendix B: gather_on_test.sh
=============================
Sample script to gather coverage data files on the test machine
(see 6b):
#!/bin/bash
DEST=$1
GCDA=/sys/kernel/debug/gcov
if [ -z "$DEST" ] ; then
echo "Usage: $0 <output.tar.gz>" >&2
exit 1
fi
find $GCDA -name '*.gcno' -o -name '*.gcda' | tar cfz $DEST -T -
if [ $? -eq 0 ] ; then
echo "$DEST successfully created, copy to build system and unpack with:"
echo " tar xfz $DEST"
else
echo "Could not create file $DEST"
fi

View File

@ -2,14 +2,18 @@ Kernel driver f71882fg
======================
Supported chips:
* Fintek F71882FG and F71883FG
Prefix: 'f71882fg'
* Fintek F71858FG
Prefix: 'f71858fg'
Addresses scanned: none, address read from Super I/O config space
Datasheet: Available from the Fintek website
* Fintek F71862FG and F71863FG
Prefix: 'f71862fg'
Addresses scanned: none, address read from Super I/O config space
Datasheet: Available from the Fintek website
* Fintek F71882FG and F71883FG
Prefix: 'f71882fg'
Addresses scanned: none, address read from Super I/O config space
Datasheet: Available from the Fintek website
* Fintek F8000
Prefix: 'f8000'
Addresses scanned: none, address read from Super I/O config space
@ -66,13 +70,13 @@ printed when loading the driver.
Three different fan control modes are supported; the mode number is written
to the pwm#_enable file. Note that not all modes are supported on all
chips, and some modes may only be available in RPM / PWM mode on the F8000.
chips, and some modes may only be available in RPM / PWM mode.
Writing an unsupported mode will result in an invalid parameter error.
* 1: Manual mode
You ask for a specific PWM duty cycle / DC voltage or a specific % of
fan#_full_speed by writing to the pwm# file. This mode is only
available on the F8000 if the fan channel is in RPM mode.
available on the F71858FG / F8000 if the fan channel is in RPM mode.
* 2: Normal auto mode
You can define a number of temperature/fan speed trip points, which % the

View File

@ -7,7 +7,7 @@ henceforth as AEM.
Supported systems:
* Any recent IBM System X server with AEM support.
This includes the x3350, x3550, x3650, x3655, x3755, x3850 M2,
x3950 M2, and certain HS2x/LS2x/QS2x blades. The IPMI host interface
x3950 M2, and certain HC10/HS2x/LS2x/QS2x blades. The IPMI host interface
driver ("ipmi-si") needs to be loaded for this driver to do anything.
Prefix: 'ibmaem'
Datasheet: Not available

View File

@ -70,6 +70,7 @@ are interpreted as 0! For more on how written strings are interpreted see the
[0-*] denotes any positive number starting from 0
[1-*] denotes any positive number starting from 1
RO read only value
WO write only value
RW read/write value
Read/write values may be read-only for some chips, depending on the
@ -295,6 +296,24 @@ temp[1-*]_label Suggested temperature channel label.
user-space.
RO
temp[1-*]_lowest
Historical minimum temperature
Unit: millidegree Celsius
RO
temp[1-*]_highest
Historical maximum temperature
Unit: millidegree Celsius
RO
temp[1-*]_reset_history
Reset temp_lowest and temp_highest
WO
temp_reset_history
Reset temp_lowest and temp_highest for all sensors
WO
Some chips measure temperature using external thermistors and an ADC, and
report the temperature measurement as a voltage. Converting this voltage
back to a temperature (or the other way around for limits) requires

View File

@ -0,0 +1,42 @@
Kernel driver tmp401
====================
Supported chips:
* Texas Instruments TMP401
Prefix: 'tmp401'
Addresses scanned: I2C 0x4c
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp401.html
* Texas Instruments TMP411
Prefix: 'tmp411'
Addresses scanned: I2C 0x4c
Datasheet: http://focus.ti.com/docs/prod/folders/print/tmp411.html
Authors:
Hans de Goede <hdegoede@redhat.com>
Andre Prendel <andre.prendel@gmx.de>
Description
-----------
This driver implements support for Texas Instruments TMP401 and
TMP411 chips. These chips implements one remote and one local
temperature sensor. Temperature is measured in degrees
Celsius. Resolution of the remote sensor is 0.0625 degree. Local
sensor resolution can be set to 0.5, 0.25, 0.125 or 0.0625 degree (not
supported by the driver so far, so using the default resolution of 0.5
degree).
The driver provides the common sysfs-interface for temperatures (see
/Documentation/hwmon/sysfs-interface under Temperatures).
The TMP411 chip is compatible with TMP401. It provides some additional
features.
* Minimum and Maximum temperature measured since power-on, chip-reset
Exported via sysfs attributes tempX_lowest and tempX_highest.
* Reset of historical minimum/maximum temperature measurements
Exported via sysfs attribute temp_reset_history. Writing 1 to this
file triggers a reset.

View File

@ -12,6 +12,10 @@ Supported chips:
Addresses scanned: ISA address retrieved from Super I/O registers
Datasheet:
http://www.nuvoton.com.tw/NR/rdonlyres/7885623D-A487-4CF9-A47F-30C5F73D6FE6/0/W83627DHG.pdf
* Winbond W83627DHG-P
Prefix: 'w83627dhg'
Addresses scanned: ISA address retrieved from Super I/O registers
Datasheet: not available
* Winbond W83667HG
Prefix: 'w83667hg'
Addresses scanned: ISA address retrieved from Super I/O registers
@ -28,8 +32,8 @@ Description
-----------
This driver implements support for the Winbond W83627EHF, W83627EHG,
W83627DHG and W83667HG super I/O chips. We will refer to them collectively
as Winbond chips.
W83627DHG, W83627DHG-P and W83667HG super I/O chips. We will refer to them
collectively as Winbond chips.
The chips implement three temperature sensors, five fan rotation
speed sensors, ten analog voltage sensors (only nine for the 627DHG), one
@ -135,3 +139,6 @@ done in the driver for all register addresses.
The DHG also supports PECI, where the DHG queries Intel CPU temperatures, and
the ICH8 southbridge gets that data via PECI from the DHG, so that the
southbridge drives the fans. And the DHG supports SST, a one-wire serial bus.
The DHG-P has an additional automatic fan speed control mode named Smart Fan
(TM) III+. This mode is not yet supported by the driver.

View File

@ -19,6 +19,9 @@ Supported adapters:
* VIA Technologies, Inc. VX800/VX820
Datasheet: available on http://linux.via.com.tw
* VIA Technologies, Inc. VX855/VX875
Datasheet: Availability unknown
Authors:
Kyösti Mälkki <kmalkki@cc.hut.fi>,
Mark D. Studebaker <mdsxyz123@yahoo.com>,
@ -53,6 +56,7 @@ Your lspci -n listing must show one of these :
device 1106:3287 (VT8251)
device 1106:8324 (CX700)
device 1106:8353 (VX800/VX820)
device 1106:8409 (VX855/VX875)
If none of these show up, you should look in the BIOS for settings like
enable ACPI / SMBus or even USB.

View File

@ -165,3 +165,47 @@ was done there. Two significant differences are:
Once again, method 3 should be avoided wherever possible. Explicit device
instantiation (methods 1 and 2) is much preferred for it is safer and
faster.
Method 4: Instantiate from user-space
-------------------------------------
In general, the kernel should know which I2C devices are connected and
what addresses they live at. However, in certain cases, it does not, so a
sysfs interface was added to let the user provide the information. This
interface is made of 2 attribute files which are created in every I2C bus
directory: new_device and delete_device. Both files are write only and you
must write the right parameters to them in order to properly instantiate,
respectively delete, an I2C device.
File new_device takes 2 parameters: the name of the I2C device (a string)
and the address of the I2C device (a number, typically expressed in
hexadecimal starting with 0x, but can also be expressed in decimal.)
File delete_device takes a single parameter: the address of the I2C
device. As no two devices can live at the same address on a given I2C
segment, the address is sufficient to uniquely identify the device to be
deleted.
Example:
# echo eeprom 0x50 > /sys/class/i2c-adapter/i2c-3/new_device
While this interface should only be used when in-kernel device declaration
can't be done, there is a variety of cases where it can be helpful:
* The I2C driver usually detects devices (method 3 above) but the bus
segment your device lives on doesn't have the proper class bit set and
thus detection doesn't trigger.
* The I2C driver usually detects devices, but your device lives at an
unexpected address.
* The I2C driver usually detects devices, but your device is not detected,
either because the detection routine is too strict, or because your
device is not officially supported yet but you know it is compatible.
* You are developing a driver on a test board, where you soldered the I2C
device yourself.
This interface is a replacement for the force_* module parameters some I2C
drivers implement. Being implemented in i2c-core rather than in each
device driver individually, it is much more efficient, and also has the
advantage that you do not have to reload the driver to change a setting.
You can also instantiate the device before the driver is loaded or even
available, and you don't need to know what driver the device needs.

View File

@ -126,19 +126,9 @@ different) configuration information, as do drivers handling chip variants
that can't be distinguished by protocol probing, or which need some board
specific information to operate correctly.
Accordingly, the I2C stack now has two models for associating I2C devices
with their drivers: the original "legacy" model, and a newer one that's
fully compatible with the Linux 2.6 driver model. These models do not mix,
since the "legacy" model requires drivers to create "i2c_client" device
objects after SMBus style probing, while the Linux driver model expects
drivers to be given such device objects in their probe() routines.
The legacy model is deprecated now and will soon be removed, so we no
longer document it here.
Standard Driver Model Binding ("New Style")
-------------------------------------------
Device/Driver Binding
---------------------
System infrastructure, typically board-specific initialization code or
boot firmware, reports what I2C devices exist. For example, there may be
@ -201,7 +191,7 @@ a given I2C bus. This is for example the case of hardware monitoring
devices on a PC's SMBus. In that case, you may want to let your driver
detect supported devices automatically. This is how the legacy model
was working, and is now available as an extension to the standard
driver model (so that we can finally get rid of the legacy model.)
driver model.
You simply have to define a detect callback which will attempt to
identify supported devices (returning 0 for supported ones and -ENODEV

View File

@ -278,7 +278,7 @@ struct input_event {
};
'time' is the timestamp, it returns the time at which the event happened.
Type is for example EV_REL for relative moment, REL_KEY for a keypress or
Type is for example EV_REL for relative moment, EV_KEY for a keypress or
release. More types are defined in include/linux/input.h.
'code' is event code, for example REL_X or KEY_BACKSPACE, again a complete

View File

@ -67,7 +67,12 @@ data with it.
struct rotary_encoder_platform_data is declared in
include/linux/rotary-encoder.h and needs to be filled with the number of
steps the encoder has and can carry information about externally inverted
signals (because of used invertig buffer or other reasons).
signals (because of an inverting buffer or other reasons). The encoder
can be set up to deliver input information as either an absolute or relative
axes. For relative axes the input event returns +/-1 for each step. For
absolute axes the position of the encoder can either roll over between zero
and the number of steps or will clamp at the maximum and zero depending on
the configuration.
Because GPIO to IRQ mapping is platform specific, this information must
be given in seperately to the driver. See the example below.
@ -85,6 +90,8 @@ be given in seperately to the driver. See the example below.
static struct rotary_encoder_platform_data my_rotary_encoder_info = {
.steps = 24,
.axis = ABS_X,
.relative_axis = false,
.rollover = false,
.gpio_a = GPIO_ROTARY_A,
.gpio_b = GPIO_ROTARY_B,
.inverted_a = 0,

View File

@ -149,6 +149,8 @@ Code Seq# Include File Comments
'p' 40-7F linux/nvram.h
'p' 80-9F user-space parport
<mailto:tim@cyberelk.net>
'p' a1-a4 linux/pps.h LinuxPPS
<mailto:giometti@linux.it>
'q' 00-1F linux/serio.h
'q' 80-FF Internet PhoneJACK, Internet LineJACK
<http://www.quicknet.net>

View File

@ -14,39 +14,37 @@ README
- general info on what you need and what to do for Linux ISDN.
README.FAQ
- general info for FAQ.
README.audio
- info for running audio over ISDN.
README.fax
- info for using Fax over ISDN.
README.gigaset
- info on the drivers for Siemens Gigaset ISDN adapters.
README.icn
- info on the ICN-ISDN-card and its driver.
README.HiSax
- info on the HiSax driver which replaces the old teles.
README.hfc-pci
- info on hfc-pci based cards.
README.pcbit
- info on the PCBIT-D ISDN adapter and driver.
README.syncppp
- info on running Sync PPP over ISDN.
syncPPP.FAQ
- frequently asked questions about running PPP over ISDN.
README.avmb1
- info on driver for AVM-B1 ISDN card.
README.act2000
- info on driver for IBM ACT-2000 card.
README.eicon
- info on driver for Eicon active cards.
README.audio
- info for running audio over ISDN.
README.avmb1
- info on driver for AVM-B1 ISDN card.
README.concap
- info on "CONCAP" encapsulation protocol interface used for X.25.
README.diversion
- info on module for isdn diversion services.
README.fax
- info for using Fax over ISDN.
README.gigaset
- info on the drivers for Siemens Gigaset ISDN adapters
README.hfc-pci
- info on hfc-pci based cards.
README.hysdn
- info on driver for Hypercope active HYSDN cards
README.icn
- info on the ICN-ISDN-card and its driver.
README.mISDN
- info on the Modular ISDN subsystem (mISDN)
README.pcbit
- info on the PCBIT-D ISDN adapter and driver.
README.sc
- info on driver for Spellcaster cards.
README.syncppp
- info on running Sync PPP over ISDN.
README.x25
- info for running X.25 over ISDN.
README.hysdn
- info on driver for Hypercope active HYSDN cards
README.mISDN
- info on the Modular ISDN subsystem (mISDN).
syncPPP.FAQ
- frequently asked questions about running PPP over ISDN.

View File

@ -45,7 +45,7 @@ From then on, Kernel CAPI may call the registered callback functions for the
device.
If the device becomes unusable for any reason (shutdown, disconnect ...), the
driver has to call capi_ctr_reseted(). This will prevent further calls to the
driver has to call capi_ctr_down(). This will prevent further calls to the
callback functions by Kernel CAPI.
@ -114,20 +114,36 @@ char *driver_name
int (*load_firmware)(struct capi_ctr *ctrlr, capiloaddata *ldata)
(optional) pointer to a callback function for sending firmware and
configuration data to the device
Return value: 0 on success, error code on error
Called in process context.
void (*reset_ctr)(struct capi_ctr *ctrlr)
pointer to a callback function for performing a reset on the device,
releasing all registered applications
(optional) pointer to a callback function for performing a reset on
the device, releasing all registered applications
Called in process context.
void (*register_appl)(struct capi_ctr *ctrlr, u16 applid,
capi_register_params *rparam)
void (*release_appl)(struct capi_ctr *ctrlr, u16 applid)
pointers to callback functions for registration and deregistration of
applications with the device
Calls to these functions are serialized by Kernel CAPI so that only
one call to any of them is active at any time.
u16 (*send_message)(struct capi_ctr *ctrlr, struct sk_buff *skb)
pointer to a callback function for sending a CAPI message to the
device
Return value: CAPI error code
If the method returns 0 (CAPI_NOERROR) the driver has taken ownership
of the skb and the caller may no longer access it. If it returns a
non-zero (error) value then ownership of the skb returns to the caller
who may reuse or free it.
The return value should only be used to signal problems with respect
to accepting or queueing the message. Errors occurring during the
actual processing of the message should be signaled with an
appropriate reply message.
Calls to this function are not serialized by Kernel CAPI, ie. it must
be prepared to be re-entered.
char *(*procinfo)(struct capi_ctr *ctrlr)
pointer to a callback function returning the entry for the device in
@ -138,6 +154,8 @@ read_proc_t *ctr_read_proc
system entry, /proc/capi/controllers/<n>; will be called with a
pointer to the device's capi_ctr structure as the last (data) argument
Note: Callback functions are never called in interrupt context.
- to be filled in before calling capi_ctr_ready():
u8 manu[CAPI_MANUFACTURER_LEN]
@ -153,6 +171,45 @@ u8 serial[CAPI_SERIAL_LEN]
value to return for CAPI_GET_SERIAL
4.3 The _cmsg Structure
(declared in <linux/isdn/capiutil.h>)
The _cmsg structure stores the contents of a CAPI 2.0 message in an easily
accessible form. It contains members for all possible CAPI 2.0 parameters, of
which only those appearing in the message type currently being processed are
actually used. Unused members should be set to zero.
Members are named after the CAPI 2.0 standard names of the parameters they
represent. See <linux/isdn/capiutil.h> for the exact spelling. Member data
types are:
u8 for CAPI parameters of type 'byte'
u16 for CAPI parameters of type 'word'
u32 for CAPI parameters of type 'dword'
_cstruct for CAPI parameters of type 'struct' not containing any
variably-sized (struct) subparameters (eg. 'Called Party Number')
The member is a pointer to a buffer containing the parameter in
CAPI encoding (length + content). It may also be NULL, which will
be taken to represent an empty (zero length) parameter.
_cmstruct for CAPI parameters of type 'struct' containing 'struct'
subparameters ('Additional Info' and 'B Protocol')
The representation is a single byte containing one of the values:
CAPI_DEFAULT: the parameter is empty
CAPI_COMPOSE: the values of the subparameters are stored
individually in the corresponding _cmsg structure members
Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert
messages between their transport encoding described in the CAPI 2.0 standard
and their _cmsg structure representation. Note that capi_cmsg2message() does
not know or check the size of its destination buffer. The caller must make
sure it is big enough to accomodate the resulting CAPI message.
5. Lower Layer Interface Functions
(declared in <linux/isdn/capilli.h>)
@ -166,7 +223,7 @@ int detach_capi_ctr(struct capi_ctr *ctrlr)
register/unregister a device (controller) with Kernel CAPI
void capi_ctr_ready(struct capi_ctr *ctrlr)
void capi_ctr_reseted(struct capi_ctr *ctrlr)
void capi_ctr_down(struct capi_ctr *ctrlr)
signal controller ready/not ready
void capi_ctr_suspend_output(struct capi_ctr *ctrlr)
@ -211,3 +268,32 @@ CAPIMSG_CONTROL(m) CAPIMSG_SETCONTROL(m, contr) Controller/PLCI/NCCI
(u32)
CAPIMSG_DATALEN(m) CAPIMSG_SETDATALEN(m, len) Data Length (u16)
Library functions for working with _cmsg structures
(from <linux/isdn/capiutil.h>):
unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg)
Assembles a CAPI 2.0 message from the parameters in *cmsg, storing the
result in *msg.
unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg)
Disassembles the CAPI 2.0 message in *msg, storing the parameters in
*cmsg.
unsigned capi_cmsg_header(_cmsg *cmsg, u16 ApplId, u8 Command, u8 Subcommand,
u16 Messagenumber, u32 Controller)
Fills the header part and address field of the _cmsg structure *cmsg
with the given values, zeroing the remainder of the structure so only
parameters with non-default values need to be changed before sending
the message.
void capi_cmsg_answer(_cmsg *cmsg)
Sets the low bit of the Subcommand field in *cmsg, thereby converting
_REQ to _CONF and _IND to _RESP.
char *capi_cmd2str(u8 Command, u8 Subcommand)
Returns the CAPI 2.0 message name corresponding to the given command
and subcommand values, as a static ASCII string. The return value may
be NULL if the command/subcommand is not one of those defined in the
CAPI 2.0 standard.

View File

@ -149,10 +149,8 @@ GigaSet 307x Device Driver
configuration files and chat scripts in the gigaset-VERSION/ppp directory
in the driver packages from http://sourceforge.net/projects/gigaset307x/.
Please note that the USB drivers are not able to change the state of the
control lines (the M105 driver can be configured to use some undocumented
control requests, if you really need the control lines, though). This means
you must use "Stupid Mode" if you are using wvdial or you should use the
nocrtscts option of pppd.
control lines. This means you must use "Stupid Mode" if you are using
wvdial or you should use the nocrtscts option of pppd.
You must also assure that the ppp_async module is loaded with the parameter
flag_time=0. You can do this e.g. by adding a line like
@ -190,20 +188,19 @@ GigaSet 307x Device Driver
You can also use /sys/class/tty/ttyGxy/cidmode for changing the CID mode
setting (ttyGxy is ttyGU0 or ttyGB0).
2.6. M105 Undocumented USB Requests
------------------------------
The Gigaset M105 USB data box understands a couple of useful, but
undocumented USB commands. These requests are not used in normal
operation (for wireless access to the base), but are needed for access
to the M105's own configuration mode (registration to the base, baudrate
and line format settings, device status queries) via the gigacontr
utility. Their use is controlled by the kernel configuration option
"Support for undocumented USB requests" (CONFIG_GIGASET_UNDOCREQ). If you
encounter error code -ENOTTY when trying to use some features of the
M105, try setting that option to "y" via 'make {x,menu}config' and
recompiling the driver.
2.6. Unregistered Wireless Devices (M101/M105)
-----------------------------------------
The main purpose of the ser_gigaset and usb_gigaset drivers is to allow
the M101 and M105 wireless devices to be used as ISDN devices for ISDN
connections through a Gigaset base. Therefore they assume that the device
is registered to a DECT base.
If the M101/M105 device is not registered to a base, initialization of
the device fails, and a corresponding error message is logged by the
driver. In that situation, a restricted set of functions is available
which includes, in particular, those necessary for registering the device
to a base or for switching it between Fixed Part and Portable Part
modes.
3. Troubleshooting
---------------
@ -234,11 +231,12 @@ GigaSet 307x Device Driver
Select Unimodem mode for all DECT data adapters. (see section 2.4.)
Problem:
You want to configure your USB DECT data adapter (M105) but gigacontr
reports an error: "/dev/ttyGU0: Inappropriate ioctl for device".
Messages like this:
usb_gigaset 3-2:1.0: Could not initialize the device.
appear in your syslog.
Solution:
Recompile the usb_gigaset driver with the kernel configuration option
CONFIG_GIGASET_UNDOCREQ set to 'y'. (see section 2.6.)
Check whether your M10x wireless device is correctly registered to the
Gigaset base. (see section 2.6.)
3.2. Telling the driver to provide more information
----------------------------------------------

View File

@ -75,7 +75,7 @@ Linux カーネルパッチ投稿者向けチェックリスト
ビルドした上、動作確認を行ってください。
14: もしパッチがディスクのI/O性能などに影響を与えるようであれば、
'CONFIG_LBD'オプションを有効にした場合と無効にした場合の両方で
'CONFIG_LBDAF'オプションを有効にした場合と無効にした場合の両方で
テストを実施してみてください。
15: lockdepの機能を全て有効にした上で、全てのコードパスを評価してください。

View File

@ -48,6 +48,7 @@ parameter is applicable:
EFI EFI Partitioning (GPT) is enabled
EIDE EIDE/ATAPI support is enabled.
FB The frame buffer device is enabled.
GCOV GCOV profiling is enabled.
HW Appropriate hardware is enabled.
IA-64 IA-64 architecture is enabled.
IMA Integrity measurement architecture is enabled.
@ -228,14 +229,6 @@ and is between 256 and 4096 characters. It is defined in the file
to assume that this machine's pmtimer latches its value
and always returns good values.
acpi.power_nocheck= [HW,ACPI]
Format: 1/0 enable/disable the check of power state.
On some bogus BIOS the _PSC object/_STA object of
power resource can't return the correct device power
state. In such case it is unneccessary to check its
power state again in power transition.
1 : disable the power state check
acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
Format: { level | edge | high | low }
@ -491,6 +484,13 @@ and is between 256 and 4096 characters. It is defined in the file
Also note the kernel might malfunction if you disable
some critical bits.
cmo_free_hint= [PPC] Format: { yes | no }
Specify whether pages are marked as being inactive
when they are freed. This is used in CMO environments
to determine OS memory pressure for page stealing by
a hypervisor.
Default: yes
code_bytes [X86] How many bytes of object code to print
in an oops report.
Range: 0 - 8192
@ -539,6 +539,10 @@ and is between 256 and 4096 characters. It is defined in the file
console=brl,ttyS0
For now, only VisioBraille is supported.
consoleblank= [KNL] The console blank (screen saver) timeout in
seconds. Defaults to 10*60 = 10mins. A value of 0
disables the blank timer.
coredump_filter=
[KNL] Change the default value for
/proc/<pid>/coredump_filter.
@ -785,6 +789,12 @@ and is between 256 and 4096 characters. It is defined in the file
Format: off | on
default: on
gcov_persist= [GCOV] When non-zero (default), profiling data for
kernel modules is saved and remains accessible via
debugfs, even when the module is unloaded/reloaded.
When zero, profiling data is discarded and associated
debugfs files are removed at module unload time.
gdth= [HW,SCSI]
See header of drivers/scsi/gdth.c.
@ -988,6 +998,7 @@ and is between 256 and 4096 characters. It is defined in the file
nomerge
forcesac
soft
pt [x86, IA64]
io7= [HW] IO7 for Marvel based alpha systems
See comment before marvel_specify_io7 in
@ -1351,6 +1362,27 @@ and is between 256 and 4096 characters. It is defined in the file
min_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory below this
physical address is ignored.
mini2440= [ARM,HW,KNL]
Format:[0..2][b][c][t]
Default: "0tb"
MINI2440 configuration specification:
0 - The attached screen is the 3.5" TFT
1 - The attached screen is the 7" TFT
2 - The VGA Shield is attached (1024x768)
Leaving out the screen size parameter will not load
the TFT driver, and the framebuffer will be left
unconfigured.
b - Enable backlight. The TFT backlight pin will be
linked to the kernel VESA blanking code and a GPIO
LED. This parameter is not necessary when using the
VGA shield.
c - Enable the s3c camera interface.
t - Reserved for enabling touchscreen support. The
touchscreen support is not enabled in the mainstream
kernel as of 2.6.30, a preliminary port can be found
in the "bleeding edge" mini2440 support kernel at
http://repo.or.cz/w/linux-2.6/mini2440.git
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
@ -1392,6 +1424,16 @@ and is between 256 and 4096 characters. It is defined in the file
mtdparts= [MTD]
See drivers/mtd/cmdlinepart.c.
onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration
Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock]
boundary - index of last SLC block on Flex-OneNAND.
The remaining blocks are configured as MLC blocks.
lock - Configure if Flex-OneNAND boundary should be locked.
Once locked, the boundary cannot be changed.
1 indicates lock status, 0 indicates unlock status.
mtdset= [ARM]
ARM/S3C2412 JIVE boot control
@ -1758,6 +1800,9 @@ and is between 256 and 4096 characters. It is defined in the file
root domains (aka PCI segments, in ACPI-speak).
nommconf [X86] Disable use of MMCONFIG for PCI
Configuration
check_enable_amd_mmconf [X86] check for and enable
properly configured MMIO access to PCI
config space on AMD family 10h CPU
nomsi [MSI] If the PCI_MSI kernel config parameter is
enabled, this kernel boot option can be used to
disable the use of MSI interrupts system-wide.
@ -1847,6 +1892,12 @@ and is between 256 and 4096 characters. It is defined in the file
PAGE_SIZE is used as alignment.
PCI-PCI bridge can be specified, if resource
windows need to be expanded.
ecrc= Enable/disable PCIe ECRC (transaction layer
end-to-end CRC checking).
bios: Use BIOS/firmware settings. This is the
the default.
off: Turn ECRC off
on: Turn ECRC on.
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
Management.
@ -1864,6 +1915,12 @@ and is between 256 and 4096 characters. It is defined in the file
Format: { 0 | 1 }
See arch/parisc/kernel/pdc_chassis.c
percpu_alloc= [X86] Select which percpu first chunk allocator to use.
Allowed values are one of "lpage", "embed" and "4k".
See comments in arch/x86/kernel/setup_percpu.c for
details on each allocator. This parameter is primarily
for debugging and performance comparison.
pf. [PARIDE]
See Documentation/blockdev/paride.txt.
@ -2416,7 +2473,8 @@ and is between 256 and 4096 characters. It is defined in the file
tp720= [HW,PS2]
trace_buf_size=nn[KMG] [ftrace] will set tracing buffer size.
trace_buf_size=nn[KMG]
[FTRACE] will set tracing buffer size.
trix= [HW,OSS] MediaTrix AudioTrix Pro
Format:

773
Documentation/kmemcheck.txt Normal file
View File

@ -0,0 +1,773 @@
GETTING STARTED WITH KMEMCHECK
==============================
Vegard Nossum <vegardno@ifi.uio.no>
Contents
========
0. Introduction
1. Downloading
2. Configuring and compiling
3. How to use
3.1. Booting
3.2. Run-time enable/disable
3.3. Debugging
3.4. Annotating false positives
4. Reporting errors
5. Technical description
0. Introduction
===============
kmemcheck is a debugging feature for the Linux Kernel. More specifically, it
is a dynamic checker that detects and warns about some uses of uninitialized
memory.
Userspace programmers might be familiar with Valgrind's memcheck. The main
difference between memcheck and kmemcheck is that memcheck works for userspace
programs only, and kmemcheck works for the kernel only. The implementations
are of course vastly different. Because of this, kmemcheck is not as accurate
as memcheck, but it turns out to be good enough in practice to discover real
programmer errors that the compiler is not able to find through static
analysis.
Enabling kmemcheck on a kernel will probably slow it down to the extent that
the machine will not be usable for normal workloads such as e.g. an
interactive desktop. kmemcheck will also cause the kernel to use about twice
as much memory as normal. For this reason, kmemcheck is strictly a debugging
feature.
1. Downloading
==============
kmemcheck can only be downloaded using git. If you want to write patches
against the current code, you should use the kmemcheck development branch of
the tip tree. It is also possible to use the linux-next tree, which also
includes the latest version of kmemcheck.
Assuming that you've already cloned the linux-2.6.git repository, all you
have to do is add the -tip tree as a remote, like this:
$ git remote add tip git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip.git
To actually download the tree, fetch the remote:
$ git fetch tip
And to check out a new local branch with the kmemcheck code:
$ git checkout -b kmemcheck tip/kmemcheck
General instructions for the -tip tree can be found here:
http://people.redhat.com/mingo/tip.git/readme.txt
2. Configuring and compiling
============================
kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of
configuration variables must have specific settings in order for the kmemcheck
menu to even appear in "menuconfig". These are:
o CONFIG_CC_OPTIMIZE_FOR_SIZE=n
This option is located under "General setup" / "Optimize for size".
Without this, gcc will use certain optimizations that usually lead to
false positive warnings from kmemcheck. An example of this is a 16-bit
field in a struct, where gcc may load 32 bits, then discard the upper
16 bits. kmemcheck sees only the 32-bit load, and may trigger a
warning for the upper 16 bits (if they're uninitialized).
o CONFIG_SLAB=y or CONFIG_SLUB=y
This option is located under "General setup" / "Choose SLAB
allocator".
o CONFIG_FUNCTION_TRACER=n
This option is located under "Kernel hacking" / "Tracers" / "Kernel
Function Tracer"
When function tracing is compiled in, gcc emits a call to another
function at the beginning of every function. This means that when the
page fault handler is called, the ftrace framework will be called
before kmemcheck has had a chance to handle the fault. If ftrace then
modifies memory that was tracked by kmemcheck, the result is an
endless recursive page fault.
o CONFIG_DEBUG_PAGEALLOC=n
This option is located under "Kernel hacking" / "Debug page memory
allocations".
In addition, I highly recommend turning on CONFIG_DEBUG_INFO=y. This is also
located under "Kernel hacking". With this, you will be able to get line number
information from the kmemcheck warnings, which is extremely valuable in
debugging a problem. This option is not mandatory, however, because it slows
down the compilation process and produces a much bigger kernel image.
Now the kmemcheck menu should be visible (under "Kernel hacking" / "kmemcheck:
trap use of uninitialized memory"). Here follows a description of the
kmemcheck configuration variables:
o CONFIG_KMEMCHECK
This must be enabled in order to use kmemcheck at all...
o CONFIG_KMEMCHECK_[DISABLED | ENABLED | ONESHOT]_BY_DEFAULT
This option controls the status of kmemcheck at boot-time. "Enabled"
will enable kmemcheck right from the start, "disabled" will boot the
kernel as normal (but with the kmemcheck code compiled in, so it can
be enabled at run-time after the kernel has booted), and "one-shot" is
a special mode which will turn kmemcheck off automatically after
detecting the first use of uninitialized memory.
If you are using kmemcheck to actively debug a problem, then you
probably want to choose "enabled" here.
The one-shot mode is mostly useful in automated test setups because it
can prevent floods of warnings and increase the chances of the machine
surviving in case something is really wrong. In other cases, the one-
shot mode could actually be counter-productive because it would turn
itself off at the very first error -- in the case of a false positive
too -- and this would come in the way of debugging the specific
problem you were interested in.
If you would like to use your kernel as normal, but with a chance to
enable kmemcheck in case of some problem, it might be a good idea to
choose "disabled" here. When kmemcheck is disabled, most of the run-
time overhead is not incurred, and the kernel will be almost as fast
as normal.
o CONFIG_KMEMCHECK_QUEUE_SIZE
Select the maximum number of error reports to store in an internal
(fixed-size) buffer. Since errors can occur virtually anywhere and in
any context, we need a temporary storage area which is guaranteed not
to generate any other page faults when accessed. The queue will be
emptied as soon as a tasklet may be scheduled. If the queue is full,
new error reports will be lost.
The default value of 64 is probably fine. If some code produces more
than 64 errors within an irqs-off section, then the code is likely to
produce many, many more, too, and these additional reports seldom give
any more information (the first report is usually the most valuable
anyway).
This number might have to be adjusted if you are not using serial
console or similar to capture the kernel log. If you are using the
"dmesg" command to save the log, then getting a lot of kmemcheck
warnings might overflow the kernel log itself, and the earlier reports
will get lost in that way instead. Try setting this to 10 or so on
such a setup.
o CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT
Select the number of shadow bytes to save along with each entry of the
error-report queue. These bytes indicate what parts of an allocation
are initialized, uninitialized, etc. and will be displayed when an
error is detected to help the debugging of a particular problem.
The number entered here is actually the logarithm of the number of
bytes that will be saved. So if you pick for example 5 here, kmemcheck
will save 2^5 = 32 bytes.
The default value should be fine for debugging most problems. It also
fits nicely within 80 columns.
o CONFIG_KMEMCHECK_PARTIAL_OK
This option (when enabled) works around certain GCC optimizations that
produce 32-bit reads from 16-bit variables where the upper 16 bits are
thrown away afterwards.
The default value (enabled) is recommended. This may of course hide
some real errors, but disabling it would probably produce a lot of
false positives.
o CONFIG_KMEMCHECK_BITOPS_OK
This option silences warnings that would be generated for bit-field
accesses where not all the bits are initialized at the same time. This
may also hide some real bugs.
This option is probably obsolete, or it should be replaced with
the kmemcheck-/bitfield-annotations for the code in question. The
default value is therefore fine.
Now compile the kernel as usual.
3. How to use
=============
3.1. Booting
============
First some information about the command-line options. There is only one
option specific to kmemcheck, and this is called "kmemcheck". It can be used
to override the default mode as chosen by the CONFIG_KMEMCHECK_*_BY_DEFAULT
option. Its possible settings are:
o kmemcheck=0 (disabled)
o kmemcheck=1 (enabled)
o kmemcheck=2 (one-shot mode)
If SLUB debugging has been enabled in the kernel, it may take precedence over
kmemcheck in such a way that the slab caches which are under SLUB debugging
will not be tracked by kmemcheck. In order to ensure that this doesn't happen
(even though it shouldn't by default), use SLUB's boot option "slub_debug",
like this: slub_debug=-
In fact, this option may also be used for fine-grained control over SLUB vs.
kmemcheck. For example, if the command line includes "kmemcheck=1
slub_debug=,dentry", then SLUB debugging will be used only for the "dentry"
slab cache, and with kmemcheck tracking all the other caches. This is advanced
usage, however, and is not generally recommended.
3.2. Run-time enable/disable
============================
When the kernel has booted, it is possible to enable or disable kmemcheck at
run-time. WARNING: This feature is still experimental and may cause false
positive warnings to appear. Therefore, try not to use this. If you find that
it doesn't work properly (e.g. you see an unreasonable amount of warnings), I
will be happy to take bug reports.
Use the file /proc/sys/kernel/kmemcheck for this purpose, e.g.:
$ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck
The numbers are the same as for the kmemcheck= command-line option.
3.3. Debugging
==============
A typical report will look something like this:
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
80000000000000000000000000000000000000000088ffff0000000000000000
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
^
Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84
RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000
R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e
R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8
FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
[<ffffffff8100c7b5>] int_signal+0x12/0x17
[<ffffffffffffffff>] 0xffffffffffffffff
The single most valuable information in this report is the RIP (or EIP on 32-
bit) value. This will help us pinpoint exactly which instruction that caused
the warning.
If your kernel was compiled with CONFIG_DEBUG_INFO=y, then all we have to do
is give this address to the addr2line program, like this:
$ addr2line -e vmlinux -i ffffffff8104ede8
arch/x86/include/asm/string_64.h:12
include/asm-generic/siginfo.h:287
kernel/signal.c:380
kernel/signal.c:410
The "-e vmlinux" tells addr2line which file to look in. IMPORTANT: This must
be the vmlinux of the kernel that produced the warning in the first place! If
not, the line number information will almost certainly be wrong.
The "-i" tells addr2line to also print the line numbers of inlined functions.
In this case, the flag was very important, because otherwise, it would only
have printed the first line, which is just a call to memcpy(), which could be
called from a thousand places in the kernel, and is therefore not very useful.
These inlined functions would not show up in the stack trace above, simply
because the kernel doesn't load the extra debugging information. This
technique can of course be used with ordinary kernel oopses as well.
In this case, it's the caller of memcpy() that is interesting, and it can be
found in include/asm-generic/siginfo.h, line 287:
281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
282 {
283 if (from->si_code < 0)
284 memcpy(to, from, sizeof(*to));
285 else
286 /* _sigchld is currently the largest know union member */
287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
288 }
Since this was a read (kmemcheck usually warns about reads only, though it can
warn about writes to unallocated or freed memory as well), it was probably the
"from" argument which contained some uninitialized bytes. Following the chain
of calls, we move upwards to see where "from" was allocated or initialized,
kernel/signal.c, line 380:
359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info)
360 {
...
367 list_for_each_entry(q, &list->list, list) {
368 if (q->info.si_signo == sig) {
369 if (first)
370 goto still_pending;
371 first = q;
...
377 if (first) {
378 still_pending:
379 list_del_init(&first->list);
380 copy_siginfo(info, &first->info);
381 __sigqueue_free(first);
...
392 }
393 }
Here, it is &first->info that is being passed on to copy_siginfo(). The
variable "first" was found on a list -- passed in as the second argument to
collect_signal(). We continue our journey through the stack, to figure out
where the item on "list" was allocated or initialized. We move to line 410:
395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
396 siginfo_t *info)
397 {
...
410 collect_signal(sig, pending, info);
...
414 }
Now we need to follow the "pending" pointer, since that is being passed on to
collect_signal() as "list". At this point, we've run out of lines from the
"addr2line" output. Not to worry, we just paste the next addresses from the
kmemcheck stack dump, i.e.:
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
[<ffffffff8100c7b5>] int_signal+0x12/0x17
$ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \
ffffffff8100b87d ffffffff8100c7b5
kernel/signal.c:446
kernel/signal.c:1806
arch/x86/kernel/signal.c:805
arch/x86/kernel/signal.c:871
arch/x86/kernel/entry_64.S:694
Remember that since these addresses were found on the stack and not as the
RIP value, they actually point to the _next_ instruction (they are return
addresses). This becomes obvious when we look at the code for line 446:
422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
423 {
...
431 signr = __dequeue_signal(&tsk->signal->shared_pending,
432 mask, info);
433 /*
434 * itimer signal ?
435 *
436 * itimers are process shared and we restart periodic
437 * itimers in the signal delivery path to prevent DoS
438 * attacks in the high resolution timer case. This is
439 * compliant with the old way of self restarting
440 * itimers, as the SIGALRM is a legacy signal and only
441 * queued once. Changing the restart behaviour to
442 * restart the timer in the signal dequeue path is
443 * reducing the timer noise on heavy loaded !highres
444 * systems too.
445 */
446 if (unlikely(signr == SIGALRM)) {
...
489 }
So instead of looking at 446, we should be looking at 431, which is the line
that executes just before 446. Here we see that what we are looking for is
&tsk->signal->shared_pending.
Our next task is now to figure out which function that puts items on this
"shared_pending" list. A crude, but efficient tool, is git grep:
$ git grep -n 'shared_pending' kernel/
...
kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending;
kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending;
...
There were more results, but none of them were related to list operations,
and these were the only assignments. We inspect the line numbers more closely
and find that this is indeed where items are being added to the list:
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
817 int group)
818 {
...
828 pending = group ? &t->signal->shared_pending : &t->pending;
...
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
852 (is_si_special(info) ||
853 info->si_code >= 0)));
854 if (q) {
855 list_add_tail(&q->list, &pending->list);
...
890 }
and:
1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
1310 {
....
1339 pending = group ? &t->signal->shared_pending : &t->pending;
1340 list_add_tail(&q->list, &pending->list);
....
1347 }
In the first case, the list element we are looking for, "q", is being returned
from the function __sigqueue_alloc(), which looks like an allocation function.
Let's take a look at it:
187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags,
188 int override_rlimit)
189 {
190 struct sigqueue *q = NULL;
191 struct user_struct *user;
192
193 /*
194 * We won't get problems with the target's UID changing under us
195 * because changing it requires RCU be used, and if t != current, the
196 * caller must be holding the RCU readlock (by way of a spinlock) and
197 * we use RCU protection here
198 */
199 user = get_uid(__task_cred(t)->user);
200 atomic_inc(&user->sigpending);
201 if (override_rlimit ||
202 atomic_read(&user->sigpending) <=
203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur)
204 q = kmem_cache_alloc(sigqueue_cachep, flags);
205 if (unlikely(q == NULL)) {
206 atomic_dec(&user->sigpending);
207 free_uid(user);
208 } else {
209 INIT_LIST_HEAD(&q->list);
210 q->flags = 0;
211 q->user = user;
212 }
213
214 return q;
215 }
We see that this function initializes q->list, q->flags, and q->user. It seems
that now is the time to look at the definition of "struct sigqueue", e.g.:
14 struct sigqueue {
15 struct list_head list;
16 int flags;
17 siginfo_t info;
18 struct user_struct *user;
19 };
And, you might remember, it was a memcpy() on &first->info that caused the
warning, so this makes perfect sense. It also seems reasonable to assume that
it is the caller of __sigqueue_alloc() that has the responsibility of filling
out (initializing) this member.
But just which fields of the struct were uninitialized? Let's look at
kmemcheck's report again:
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
80000000000000000000000000000000000000000088ffff0000000000000000
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
^
These first two lines are the memory dump of the memory object itself, and the
shadow bytemap, respectively. The memory object itself is in this case
&first->info. Just beware that the start of this dump is NOT the start of the
object itself! The position of the caret (^) corresponds with the address of
the read (ffff88003e4a2024).
The shadow bytemap dump legend is as follows:
i - initialized
u - uninitialized
a - unallocated (memory has been allocated by the slab layer, but has not
yet been handed off to anybody)
f - freed (memory has been allocated by the slab layer, but has been freed
by the previous owner)
In order to figure out where (relative to the start of the object) the
uninitialized memory was located, we have to look at the disassembly. For
that, we'll need the RIP address again:
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
$ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8:
ffffffff8104edc8: mov %r8,0x8(%r8)
ffffffff8104edcc: test %r10d,%r10d
ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168>
ffffffff8104edd5: mov %rax,%rdx
ffffffff8104edd8: mov $0xc,%ecx
ffffffff8104eddd: mov %r13,%rdi
ffffffff8104ede0: mov $0x30,%eax
ffffffff8104ede5: mov %rdx,%rsi
ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi)
ffffffff8104edea: test $0x2,%al
ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0>
ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi)
ffffffff8104edf0: test $0x1,%al
ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5>
ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi)
ffffffff8104edf5: mov %r8,%rdi
ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free>
As expected, it's the "rep movsl" instruction from the memcpy() that causes
the warning. We know about REP MOVSL that it uses the register RCX to count
the number of remaining iterations. By taking a look at the register dump
again (from the kmemcheck report), we can figure out how many bytes were left
to copy:
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
By looking at the disassembly, we also see that %ecx is being loaded with the
value $0xc just before (ffffffff8104edd8), so we are very lucky. Keep in mind
that this is the number of iterations, not bytes. And since this is a "long"
operation, we need to multiply by 4 to get the number of bytes. So this means
that the uninitialized value was encountered at 4 * (0xc - 0x9) = 12 bytes
from the start of the object.
We can now try to figure out which field of the "struct siginfo" that was not
initialized. This is the beginning of the struct:
40 typedef struct siginfo {
41 int si_signo;
42 int si_errno;
43 int si_code;
44
45 union {
..
92 } _sifields;
93 } siginfo_t;
On 64-bit, the int is 4 bytes long, so it must the the union member that has
not been initialized. We can verify this using gdb:
$ gdb vmlinux
...
(gdb) p &((struct siginfo *) 0)->_sifields
$1 = (union {...} *) 0x10
Actually, it seems that the union member is located at offset 0x10 -- which
means that gcc has inserted 4 bytes of padding between the members si_code
and _sifields. We can now get a fuller picture of the memory dump:
_----------------------------=> si_code
/ _--------------------=> (padding)
| / _------------=> _sifields(._kill._pid)
| | / _----=> _sifields(._kill._uid)
| | | /
-------|-------|-------|-------|
80000000000000000000000000000000000000000088ffff0000000000000000
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
This allows us to realize another important fact: si_code contains the value
0x80. Remember that x86 is little endian, so the first 4 bytes "80000000" are
really the number 0x00000080. With a bit of research, we find that this is
actually the constant SI_KERNEL defined in include/asm-generic/siginfo.h:
144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */
This macro is used in exactly one place in the x86 kernel: In send_signal()
in kernel/signal.c:
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
817 int group)
818 {
...
828 pending = group ? &t->signal->shared_pending : &t->pending;
...
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
852 (is_si_special(info) ||
853 info->si_code >= 0)));
854 if (q) {
855 list_add_tail(&q->list, &pending->list);
856 switch ((unsigned long) info) {
...
865 case (unsigned long) SEND_SIG_PRIV:
866 q->info.si_signo = sig;
867 q->info.si_errno = 0;
868 q->info.si_code = SI_KERNEL;
869 q->info.si_pid = 0;
870 q->info.si_uid = 0;
871 break;
...
890 }
Not only does this match with the .si_code member, it also matches the place
we found earlier when looking for where siginfo_t objects are enqueued on the
"shared_pending" list.
So to sum up: It seems that it is the padding introduced by the compiler
between two struct fields that is uninitialized, and this gets reported when
we do a memcpy() on the struct. This means that we have identified a false
positive warning.
Normally, kmemcheck will not report uninitialized accesses in memcpy() calls
when both the source and destination addresses are tracked. (Instead, we copy
the shadow bytemap as well). In this case, the destination address clearly
was not tracked. We can dig a little deeper into the stack trace from above:
arch/x86/kernel/signal.c:805
arch/x86/kernel/signal.c:871
arch/x86/kernel/entry_64.S:694
And we clearly see that the destination siginfo object is located on the
stack:
782 static void do_signal(struct pt_regs *regs)
783 {
784 struct k_sigaction ka;
785 siginfo_t info;
...
804 signr = get_signal_to_deliver(&info, &ka, regs, NULL);
...
854 }
And this &info is what eventually gets passed to copy_siginfo() as the
destination argument.
Now, even though we didn't find an actual error here, the example is still a
good one, because it shows how one would go about to find out what the report
was all about.
3.4. Annotating false positives
===============================
There are a few different ways to make annotations in the source code that
will keep kmemcheck from checking and reporting certain allocations. Here
they are:
o __GFP_NOTRACK_FALSE_POSITIVE
This flag can be passed to kmalloc() or kmem_cache_alloc() (therefore
also to other functions that end up calling one of these) to indicate
that the allocation should not be tracked because it would lead to
a false positive report. This is a "big hammer" way of silencing
kmemcheck; after all, even if the false positive pertains to
particular field in a struct, for example, we will now lose the
ability to find (real) errors in other parts of the same struct.
Example:
/* No warnings will ever trigger on accessing any part of x */
x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE);
o kmemcheck_bitfield_begin(name)/kmemcheck_bitfield_end(name) and
kmemcheck_annotate_bitfield(ptr, name)
The first two of these three macros can be used inside struct
definitions to signal, respectively, the beginning and end of a
bitfield. Additionally, this will assign the bitfield a name, which
is given as an argument to the macros.
Having used these markers, one can later use
kmemcheck_annotate_bitfield() at the point of allocation, to indicate
which parts of the allocation is part of a bitfield.
Example:
struct foo {
int x;
kmemcheck_bitfield_begin(flags);
int flag_a:1;
int flag_b:1;
kmemcheck_bitfield_end(flags);
int y;
};
struct foo *x = kmalloc(sizeof *x);
/* No warnings will trigger on accessing the bitfield of x */
kmemcheck_annotate_bitfield(x, flags);
Note that kmemcheck_annotate_bitfield() can be used even before the
return value of kmalloc() is checked -- in other words, passing NULL
as the first argument is legal (and will do nothing).
4. Reporting errors
===================
As we have seen, kmemcheck will produce false positive reports. Therefore, it
is not very wise to blindly post kmemcheck warnings to mailing lists and
maintainers. Instead, I encourage maintainers and developers to find errors
in their own code. If you get a warning, you can try to work around it, try
to figure out if it's a real error or not, or simply ignore it. Most
developers know their own code and will quickly and efficiently determine the
root cause of a kmemcheck report. This is therefore also the most efficient
way to work with kmemcheck.
That said, we (the kmemcheck maintainers) will always be on the lookout for
false positives that we can annotate and silence. So whatever you find,
please drop us a note privately! Kernel configs and steps to reproduce (if
available) are of course a great help too.
Happy hacking!
5. Technical description
========================
kmemcheck works by marking memory pages non-present. This means that whenever
somebody attempts to access the page, a page fault is generated. The page
fault handler notices that the page was in fact only hidden, and so it calls
on the kmemcheck code to make further investigations.
When the investigations are completed, kmemcheck "shows" the page by marking
it present (as it would be under normal circumstances). This way, the
interrupted code can continue as usual.
But after the instruction has been executed, we should hide the page again, so
that we can catch the next access too! Now kmemcheck makes use of a debugging
feature of the processor, namely single-stepping. When the processor has
finished the one instruction that generated the memory access, a debug
exception is raised. From here, we simply hide the page again and continue
execution, this time with the single-stepping feature turned off.
kmemcheck requires some assistance from the memory allocator in order to work.
The memory allocator needs to
1. Tell kmemcheck about newly allocated pages and pages that are about to
be freed. This allows kmemcheck to set up and tear down the shadow memory
for the pages in question. The shadow memory stores the status of each
byte in the allocation proper, e.g. whether it is initialized or
uninitialized.
2. Tell kmemcheck which parts of memory should be marked uninitialized.
There are actually a few more states, such as "not yet allocated" and
"recently freed".
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
memory that can take page faults because of kmemcheck.
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags.
This does not prevent the page faults from occurring, however, but marks the
object in question as being initialized so that no warnings will ever be
produced for this object.
Currently, the SLAB and SLUB allocators are supported by kmemcheck.

View File

@ -507,9 +507,9 @@ http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf (pages 101-115)
Appendix A: The kprobes debugfs interface
With recent kernels (> 2.6.20) the list of registered kprobes is visible
under the /debug/kprobes/ directory (assuming debugfs is mounted at /debug).
under the /sys/kernel/debug/kprobes/ directory (assuming debugfs is mounted at //sys/kernel/debug).
/debug/kprobes/list: Lists all registered probes on the system
/sys/kernel/debug/kprobes/list: Lists all registered probes on the system
c015d71a k vfs_read+0x0
c011a316 j do_fork+0x0
@ -525,7 +525,7 @@ virtual addresses that correspond to modules that've been unloaded),
such probes are marked with [GONE]. If the probe is temporarily disabled,
such probes are marked with [DISABLED].
/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
/sys/kernel/debug/kprobes/enabled: Turn kprobes ON/OFF forcibly.
Provides a knob to globally and forcibly turn registered kprobes ON or OFF.
By default, all kprobes are enabled. By echoing "0" to this file, all

View File

@ -920,7 +920,7 @@ The available commands are:
echo '<LED number> off' >/proc/acpi/ibm/led
echo '<LED number> blink' >/proc/acpi/ibm/led
The <LED number> range is 0 to 7. The set of LEDs that can be
The <LED number> range is 0 to 15. The set of LEDs that can be
controlled varies from model to model. Here is the common ThinkPad
mapping:
@ -932,6 +932,11 @@ mapping:
5 - UltraBase battery slot
6 - (unknown)
7 - standby
8 - dock status 1
9 - dock status 2
10, 11 - (unknown)
12 - thinkvantage
13, 14, 15 - (unknown)
All of the above can be turned on and off and can be made to blink.
@ -940,10 +945,12 @@ sysfs notes:
The ThinkPad LED sysfs interface is described in detail by the LED class
documentation, in Documentation/leds-class.txt.
The leds are named (in LED ID order, from 0 to 7):
The LEDs are named (in LED ID order, from 0 to 12):
"tpacpi::power", "tpacpi:orange:batt", "tpacpi:green:batt",
"tpacpi::dock_active", "tpacpi::bay_active", "tpacpi::dock_batt",
"tpacpi::unknown_led", "tpacpi::standby".
"tpacpi::unknown_led", "tpacpi::standby", "tpacpi::dock_status1",
"tpacpi::dock_status2", "tpacpi::unknown_led2", "tpacpi::unknown_led3",
"tpacpi::thinkvantage".
Due to limitations in the sysfs LED class, if the status of the LED
indicators cannot be read due to an error, thinkpad-acpi will report it as
@ -958,6 +965,12 @@ ThinkPad indicator LED should blink in hardware accelerated mode, use the
"timer" trigger, and leave the delay_on and delay_off parameters set to
zero (to request hardware acceleration autodetection).
LEDs that are known not to exist in a given ThinkPad model are not
made available through the sysfs interface. If you have a dock and you
notice there are LEDs listed for your ThinkPad that do not exist (and
are not in the dock), or if you notice that there are missing LEDs,
a report to ibm-acpi-devel@lists.sourceforge.net is appreciated.
ACPI sounds -- /proc/acpi/ibm/beep
----------------------------------
@ -1156,17 +1169,19 @@ may not be distinct. Later Lenovo models that implement the ACPI
display backlight brightness control methods have 16 levels, ranging
from 0 to 15.
There are two interfaces to the firmware for direct brightness control,
EC and UCMS (or CMOS). To select which one should be used, use the
brightness_mode module parameter: brightness_mode=1 selects EC mode,
brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC
mode with NVRAM backing (so that brightness changes are remembered
across shutdown/reboot).
For IBM ThinkPads, there are two interfaces to the firmware for direct
brightness control, EC and UCMS (or CMOS). To select which one should be
used, use the brightness_mode module parameter: brightness_mode=1 selects
EC mode, brightness_mode=2 selects UCMS mode, brightness_mode=3 selects EC
mode with NVRAM backing (so that brightness changes are remembered across
shutdown/reboot).
The driver tries to select which interface to use from a table of
defaults for each ThinkPad model. If it makes a wrong choice, please
report this as a bug, so that we can fix it.
Lenovo ThinkPads only support brightness_mode=2 (UCMS).
When display backlight brightness controls are available through the
standard ACPI interface, it is best to use it instead of this direct
ThinkPad-specific interface. The driver will disable its native
@ -1254,7 +1269,7 @@ Fan control and monitoring: fan speed, fan enable/disable
procfs: /proc/acpi/ibm/fan
sysfs device attributes: (hwmon "thinkpad") fan1_input, pwm1,
pwm1_enable
pwm1_enable, fan2_input
sysfs hwmon driver attributes: fan_watchdog
NOTE NOTE NOTE: fan control operations are disabled by default for
@ -1267,6 +1282,9 @@ from the hardware registers of the embedded controller. This is known
to work on later R, T, X and Z series ThinkPads but may show a bogus
value on other models.
Some Lenovo ThinkPads support a secondary fan. This fan cannot be
controlled separately, it shares the main fan control.
Fan levels:
Most ThinkPad fans work in "levels" at the firmware interface. Level 0
@ -1397,6 +1415,11 @@ hwmon device attribute fan1_input:
which can take up to two minutes. May return rubbish on older
ThinkPads.
hwmon device attribute fan2_input:
Fan tachometer reading, in RPM, for the secondary fan.
Available only on some ThinkPads. If the secondary fan is
not installed, will always read 0.
hwmon driver attribute fan_watchdog:
Fan safety watchdog timer interval, in seconds. Minimum is
1 second, maximum is 120 seconds. 0 disables the watchdog.
@ -1555,3 +1578,7 @@ Sysfs interface changelog:
0x020300: hotkey enable/disable support removed, attributes
hotkey_bios_enabled and hotkey_enable deprecated and
marked for removal.
0x020400: Marker for 16 LEDs support. Also, LEDs that are known
to not exist in a given model are not registered with
the LED sysfs class anymore.

View File

@ -0,0 +1,50 @@
Kernel driver lp3944
====================
* National Semiconductor LP3944 Fun-light Chip
Prefix: 'lp3944'
Addresses scanned: None (see the Notes section below)
Datasheet: Publicly available at the National Semiconductor website
http://www.national.com/pf/LP/LP3944.html
Authors:
Antonio Ospite <ospite@studenti.unina.it>
Description
-----------
The LP3944 is a helper chip that can drive up to 8 leds, with two programmable
DIM modes; it could even be used as a gpio expander but this driver assumes it
is used as a led controller.
The DIM modes are used to set _blink_ patterns for leds, the pattern is
specified supplying two parameters:
- period: from 0s to 1.6s
- duty cycle: percentage of the period the led is on, from 0 to 100
Setting a led in DIM0 or DIM1 mode makes it blink according to the pattern.
See the datasheet for details.
LP3944 can be found on Motorola A910 smartphone, where it drives the rgb
leds, the camera flash light and the lcds power.
Notes
-----
The chip is used mainly in embedded contexts, so this driver expects it is
registered using the i2c_board_info mechanism.
To register the chip at address 0x60 on adapter 0, set the platform data
according to include/linux/leds-lp3944.h, set the i2c board info:
static struct i2c_board_info __initdata a910_i2c_board_info[] = {
{
I2C_BOARD_INFO("lp3944", 0x60),
.platform_data = &a910_lp3944_leds,
},
};
and register it in the platform init function
i2c_register_board_info(0, a910_i2c_board_info,
ARRAY_SIZE(a910_i2c_board_info));

View File

@ -36,10 +36,15 @@ This file contains
6.2 local loopback of sent frames
6.3 CAN controller hardware filters
6.4 The virtual CAN driver (vcan)
6.5 currently supported CAN hardware
6.6 todo
6.5 The CAN network device driver interface
6.5.1 Netlink interface to set/get devices properties
6.5.2 Setting the CAN bit-timing
6.5.3 Starting and stopping the CAN network device
6.6 supported CAN hardware
7 Credits
7 Socket CAN resources
8 Credits
============================================================================
@ -234,6 +239,8 @@ solution for a couple of reasons:
the user application using the common CAN filter mechanisms. Inside
this filter definition the (interested) type of errors may be
selected. The reception of error frames is disabled by default.
The format of the CAN error frame is briefly decribed in the Linux
header file "include/linux/can/error.h".
4. How to use Socket CAN
------------------------
@ -605,61 +612,213 @@ solution for a couple of reasons:
removal of vcan network devices can be managed with the ip(8) tool:
- Create a virtual CAN network interface:
ip link add type vcan
$ ip link add type vcan
- Create a virtual CAN network interface with a specific name 'vcan42':
ip link add dev vcan42 type vcan
$ ip link add dev vcan42 type vcan
- Remove a (virtual CAN) network interface 'vcan42':
ip link del vcan42
$ ip link del vcan42
The tool 'vcan' from the SocketCAN SVN repository on BerliOS is obsolete.
6.5 The CAN network device driver interface
Virtual CAN network device creation in older Kernels:
In Linux Kernel versions < 2.6.24 the vcan driver creates 4 vcan
netdevices at module load time by default. This value can be changed
with the module parameter 'numdev'. E.g. 'modprobe vcan numdev=8'
The CAN network device driver interface provides a generic interface
to setup, configure and monitor CAN network devices. The user can then
configure the CAN device, like setting the bit-timing parameters, via
the netlink interface using the program "ip" from the "IPROUTE2"
utility suite. The following chapter describes briefly how to use it.
Furthermore, the interface uses a common data structure and exports a
set of common functions, which all real CAN network device drivers
should use. Please have a look to the SJA1000 or MSCAN driver to
understand how to use them. The name of the module is can-dev.ko.
6.5 currently supported CAN hardware
6.5.1 Netlink interface to set/get devices properties
On the project website http://developer.berlios.de/projects/socketcan
there are different drivers available:
The CAN device must be configured via netlink interface. The supported
netlink message types are defined and briefly described in
"include/linux/can/netlink.h". CAN link support for the program "ip"
of the IPROUTE2 utility suite is avaiable and it can be used as shown
below:
vcan: Virtual CAN interface driver (if no real hardware is available)
sja1000: Philips SJA1000 CAN controller (recommended)
i82527: Intel i82527 CAN controller
mscan: Motorola/Freescale CAN controller (e.g. inside SOC MPC5200)
ccan: CCAN controller core (e.g. inside SOC h7202)
slcan: For a bunch of CAN adaptors that are attached via a
serial line ASCII protocol (for serial / USB adaptors)
- Setting CAN device properties:
Additionally the different CAN adaptors (ISA/PCI/PCMCIA/USB/Parport)
from PEAK Systemtechnik support the CAN netdevice driver model
since Linux driver v6.0: http://www.peak-system.com/linux/index.htm
$ ip link set can0 type can help
Usage: ip link set DEVICE type can
[ bitrate BITRATE [ sample-point SAMPLE-POINT] ] |
[ tq TQ prop-seg PROP_SEG phase-seg1 PHASE-SEG1
phase-seg2 PHASE-SEG2 [ sjw SJW ] ]
Please check the Mailing Lists on the berlios OSS project website.
[ loopback { on | off } ]
[ listen-only { on | off } ]
[ triple-sampling { on | off } ]
6.6 todo
[ restart-ms TIME-MS ]
[ restart ]
The configuration interface for CAN network drivers is still an open
issue that has not been finalized in the socketcan project. Also the
idea of having a library module (candev.ko) that holds functions
that are needed by all CAN netdevices is not ready to ship.
Your contribution is welcome.
Where: BITRATE := { 1..1000000 }
SAMPLE-POINT := { 0.000..0.999 }
TQ := { NUMBER }
PROP-SEG := { 1..8 }
PHASE-SEG1 := { 1..8 }
PHASE-SEG2 := { 1..8 }
SJW := { 1..4 }
RESTART-MS := { 0 | NUMBER }
7. Credits
- Display CAN device details and statistics:
$ ip -details -statistics link show can0
2: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UP qlen 10
link/can
can <TRIPLE-SAMPLING> state ERROR-ACTIVE restart-ms 100
bitrate 125000 sample_point 0.875
tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
clock 8000000
re-started bus-errors arbit-lost error-warn error-pass bus-off
41 17457 0 41 42 41
RX: bytes packets errors dropped overrun mcast
140859 17608 17457 0 0 0
TX: bytes packets errors dropped carrier collsns
861 112 0 41 0 0
More info to the above output:
"<TRIPLE-SAMPLING>"
Shows the list of selected CAN controller modes: LOOPBACK,
LISTEN-ONLY, or TRIPLE-SAMPLING.
"state ERROR-ACTIVE"
The current state of the CAN controller: "ERROR-ACTIVE",
"ERROR-WARNING", "ERROR-PASSIVE", "BUS-OFF" or "STOPPED"
"restart-ms 100"
Automatic restart delay time. If set to a non-zero value, a
restart of the CAN controller will be triggered automatically
in case of a bus-off condition after the specified delay time
in milliseconds. By default it's off.
"bitrate 125000 sample_point 0.875"
Shows the real bit-rate in bits/sec and the sample-point in the
range 0.000..0.999. If the calculation of bit-timing parameters
is enabled in the kernel (CONFIG_CAN_CALC_BITTIMING=y), the
bit-timing can be defined by setting the "bitrate" argument.
Optionally the "sample-point" can be specified. By default it's
0.000 assuming CIA-recommended sample-points.
"tq 125 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1"
Shows the time quanta in ns, propagation segment, phase buffer
segment 1 and 2 and the synchronisation jump width in units of
tq. They allow to define the CAN bit-timing in a hardware
independent format as proposed by the Bosch CAN 2.0 spec (see
chapter 8 of http://www.semiconductors.bosch.de/pdf/can2spec.pdf).
"sja1000: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
clock 8000000"
Shows the bit-timing constants of the CAN controller, here the
"sja1000". The minimum and maximum values of the time segment 1
and 2, the synchronisation jump width in units of tq, the
bitrate pre-scaler and the CAN system clock frequency in Hz.
These constants could be used for user-defined (non-standard)
bit-timing calculation algorithms in user-space.
"re-started bus-errors arbit-lost error-warn error-pass bus-off"
Shows the number of restarts, bus and arbitration lost errors,
and the state changes to the error-warning, error-passive and
bus-off state. RX overrun errors are listed in the "overrun"
field of the standard network statistics.
6.5.2 Setting the CAN bit-timing
The CAN bit-timing parameters can always be defined in a hardware
independent format as proposed in the Bosch CAN 2.0 specification
specifying the arguments "tq", "prop_seg", "phase_seg1", "phase_seg2"
and "sjw":
$ ip link set canX type can tq 125 prop-seg 6 \
phase-seg1 7 phase-seg2 2 sjw 1
If the kernel option CONFIG_CAN_CALC_BITTIMING is enabled, CIA
recommended CAN bit-timing parameters will be calculated if the bit-
rate is specified with the argument "bitrate":
$ ip link set canX type can bitrate 125000
Note that this works fine for the most common CAN controllers with
standard bit-rates but may *fail* for exotic bit-rates or CAN system
clock frequencies. Disabling CONFIG_CAN_CALC_BITTIMING saves some
space and allows user-space tools to solely determine and set the
bit-timing parameters. The CAN controller specific bit-timing
constants can be used for that purpose. They are listed by the
following command:
$ ip -details link show can0
...
sja1000: clock 8000000 tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
6.5.3 Starting and stopping the CAN network device
A CAN network device is started or stopped as usual with the command
"ifconfig canX up/down" or "ip link set canX up/down". Be aware that
you *must* define proper bit-timing parameters for real CAN devices
before you can start it to avoid error-prone default settings:
$ ip link set canX up type can bitrate 125000
A device may enter the "bus-off" state if too much errors occurred on
the CAN bus. Then no more messages are received or sent. An automatic
bus-off recovery can be enabled by setting the "restart-ms" to a
non-zero value, e.g.:
$ ip link set canX type can restart-ms 100
Alternatively, the application may realize the "bus-off" condition
by monitoring CAN error frames and do a restart when appropriate with
the command:
$ ip link set canX type can restart
Note that a restart will also create a CAN error frame (see also
chapter 3.4).
6.6 Supported CAN hardware
Please check the "Kconfig" file in "drivers/net/can" to get an actual
list of the support CAN hardware. On the Socket CAN project website
(see chapter 7) there might be further drivers available, also for
older kernel versions.
7. Socket CAN resources
-----------------------
You can find further resources for Socket CAN like user space tools,
support for old kernel versions, more drivers, mailing lists, etc.
at the BerliOS OSS project website for Socket CAN:
http://developer.berlios.de/projects/socketcan
If you have questions, bug fixes, etc., don't hesitate to post them to
the Socketcan-Users mailing list. But please search the archives first.
8. Credits
----------
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm)
Oliver Hartkopp (PF_CAN core, filters, drivers, bcm, SJA1000 driver)
Urs Thuermann (PF_CAN core, kernel integration, socket interfaces, raw, vcan)
Jan Kizka (RT-SocketCAN core, Socket-API reconciliation)
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews)
Wolfgang Grandegger (RT-SocketCAN core & drivers, Raw Socket-API reviews,
CAN device driver interface, MSCAN driver)
Robert Schwebel (design reviews, PTXdist integration)
Marc Kleine-Budde (design reviews, Kernel 2.6 cleanups, drivers)
Benedikt Spranger (reviews)
Thomas Gleixner (LKML reviews, coding style, posting hints)
Andrey Volkov (kernel subtree structure, ioctls, mscan driver)
Andrey Volkov (kernel subtree structure, ioctls, MSCAN driver)
Matthias Brukner (first SJA1000 CAN netdevice implementation Q2/2003)
Klaus Hitschler (PEAK driver integration)
Uwe Koppe (CAN netdevices with PF_PACKET approach)
Michael Schulze (driver layer loopback requirement, RT CAN drivers review)
Pavel Pisa (Bit-timing calculation)
Sascha Hauer (SJA1000 platform driver)
Sebastian Haas (SJA1000 EMS PCI driver)
Markus Plessing (SJA1000 EMS PCI driver)
Per Dalen (SJA1000 Kvaser PCI driver)
Sam Ravnborg (reviews, coding style, kbuild help)

View File

@ -0,0 +1,76 @@
Linux IEEE 802.15.4 implementation
Introduction
============
The Linux-ZigBee project goal is to provide complete implementation
of IEEE 802.15.4 / ZigBee / 6LoWPAN protocols. IEEE 802.15.4 is a stack
of protocols for organizing Low-Rate Wireless Personal Area Networks.
Currently only IEEE 802.15.4 layer is implemented. We have choosen
to use plain Berkeley socket API, the generic Linux networking stack
to transfer IEEE 802.15.4 messages and a special protocol over genetlink
for configuration/management
Socket API
==========
int sd = socket(PF_IEEE802154, SOCK_DGRAM, 0);
.....
The address family, socket addresses etc. are defined in the
include/net/ieee802154/af_ieee802154.h header or in the special header
in our userspace package (see either linux-zigbee sourceforge download page
or git tree at git://linux-zigbee.git.sourceforge.net/gitroot/linux-zigbee).
One can use SOCK_RAW for passing raw data towards device xmit function. YMMV.
MLME - MAC Level Management
============================
Most of IEEE 802.15.4 MLME interfaces are directly mapped on netlink commands.
See the include/net/ieee802154/nl802154.h header. Our userspace tools package
(see above) provides CLI configuration utility for radio interfaces and simple
coordinator for IEEE 802.15.4 networks as an example users of MLME protocol.
Kernel side
=============
Like with WiFi, there are several types of devices implementing IEEE 802.15.4.
1) 'HardMAC'. The MAC layer is implemented in the device itself, the device
exports MLME and data API.
2) 'SoftMAC' or just radio. These types of devices are just radio transceivers
possibly with some kinds of acceleration like automatic CRC computation and
comparation, automagic ACK handling, address matching, etc.
Those types of devices require different approach to be hooked into Linux kernel.
HardMAC
=======
See the header include/net/ieee802154/netdevice.h. You have to implement Linux
net_device, with .type = ARPHRD_IEEE802154. Data is exchanged with socket family
code via plain sk_buffs. The control block of sk_buffs will contain additional
info as described in the struct ieee802154_mac_cb.
To hook the MLME interface you have to populate the ml_priv field of your
net_device with a pointer to struct ieee802154_mlme_ops instance. All fields are
required.
We provide an example of simple HardMAC driver at drivers/ieee802154/fakehard.c
SoftMAC
=======
We are going to provide intermediate layer impelementing IEEE 802.15.4 MAC
in software. This is currently WIP.
See header include/net/ieee802154/mac802154.h and several drivers in
drivers/ieee802154/

View File

@ -168,7 +168,16 @@ tcp_dsack - BOOLEAN
Allows TCP to send "duplicate" SACKs.
tcp_ecn - BOOLEAN
Enable Explicit Congestion Notification in TCP.
Enable Explicit Congestion Notification (ECN) in TCP. ECN is only
used when both ends of the TCP flow support it. It is useful to
avoid losses due to congestion (when the bottleneck router supports
ECN).
Possible values are:
0 disable ECN
1 ECN enabled
2 Only server-side ECN enabled. If the other end does
not support ECN, behavior is like with ECN disabled.
Default: 2
tcp_fack - BOOLEAN
Enable FACK congestion avoidance and fast retransmission.
@ -1048,6 +1057,13 @@ disable_ipv6 - BOOLEAN
address.
Default: FALSE (enable IPv6 operation)
When this value is changed from 1 to 0 (IPv6 is being enabled),
it will dynamically create a link-local address on the given
interface and start Duplicate Address Detection, if necessary.
When this value is changed from 0 to 1 (IPv6 is being disabled),
it will dynamically delete all address on the given interface.
accept_dad - INTEGER
Whether to accept DAD (Duplicate Address Detection).
0: Disable DAD

View File

@ -33,3 +33,40 @@ disable
A reboot is required to enable IPv6.
autoconf
Specifies whether to enable IPv6 address autoconfiguration
on all interfaces. This might be used when one does not wish
for addresses to be automatically generated from prefixes
received in Router Advertisements.
The possible values and their effects are:
0
IPv6 address autoconfiguration is disabled on all interfaces.
Only the IPv6 loopback address (::1) and link-local addresses
will be added to interfaces.
1
IPv6 address autoconfiguration is enabled on all interfaces.
This is the default value.
disable_ipv6
Specifies whether to disable IPv6 on all interfaces.
This might be used when no IPv6 addresses are desired.
The possible values and their effects are:
0
IPv6 is enabled on all interfaces.
This is the default value.
1
IPv6 is disabled on all interfaces.
No IPv6 addresses will be added to interfaces.

View File

@ -12,38 +12,22 @@ following format:
The radiotap format is discussed in
./Documentation/networking/radiotap-headers.txt.
Despite 13 radiotap argument types are currently defined, most only make sense
Despite many radiotap parameters being currently defined, most only make sense
to appear on received packets. The following information is parsed from the
radiotap headers and used to control injection:
* IEEE80211_RADIOTAP_RATE
rate in 500kbps units, automatic if invalid or not present
* IEEE80211_RADIOTAP_ANTENNA
antenna to use, automatic if not present
* IEEE80211_RADIOTAP_DBM_TX_POWER
transmit power in dBm, automatic if not present
* IEEE80211_RADIOTAP_FLAGS
IEEE80211_RADIOTAP_F_FCS: FCS will be removed and recalculated
IEEE80211_RADIOTAP_F_WEP: frame will be encrypted if key available
IEEE80211_RADIOTAP_F_FRAG: frame will be fragmented if longer than the
current fragmentation threshold. Note that
this flag is only reliable when software
fragmentation is enabled)
current fragmentation threshold.
The injection code can also skip all other currently defined radiotap fields
facilitating replay of captured radiotap headers directly.
Here is an example valid radiotap header defining these three parameters
Here is an example valid radiotap header defining some parameters
0x00, 0x00, // <-- radiotap version
0x0b, 0x00, // <- radiotap header length
@ -72,8 +56,8 @@ interface), along the following lines:
...
r = pcap_inject(ppcap, u8aSendBuffer, nLength);
You can also find sources for a complete inject test applet here:
You can also find a link to a complete inject application here:
http://penumbra.warmcat.com/_twk/tiki-index.php?page=packetspammer
http://wireless.kernel.org/en/users/Documentation/packetspammer
Andy Green <andy@warmcat.com>

View File

@ -38,9 +38,6 @@ ifinfomsg::if_flags & IFF_LOWER_UP:
ifinfomsg::if_flags & IFF_DORMANT:
Driver has signaled netif_dormant_on()
These interface flags can also be queried without netlink using the
SIOCGIFFLAGS ioctl.
TLV IFLA_OPERSTATE
contains RFC2863 state of the interface in numeric representation:

View File

@ -4,16 +4,18 @@
This file documents the CONFIG_PACKET_MMAP option available with the PACKET
socket interface on 2.4 and 2.6 kernels. This type of sockets is used for
capture network traffic with utilities like tcpdump or any other that uses
the libpcap library.
You can find the latest version of this document at
capture network traffic with utilities like tcpdump or any other that needs
raw access to network interface.
You can find the latest version of this document at:
http://pusa.uv.es/~ulisses/packet_mmap/
Please send me your comments to
Howto can be found at:
http://wiki.gnu-log.net (packet_mmap)
Please send your comments to
Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
Johann Baudy <johann.baudy@gnu-log.net>
-------------------------------------------------------------------------------
+ Why use PACKET_MMAP
@ -25,19 +27,24 @@ to capture each packet, it requires two if you want to get packet's
timestamp (like libpcap always does).
In the other hand PACKET_MMAP is very efficient. PACKET_MMAP provides a size
configurable circular buffer mapped in user space. This way reading packets just
needs to wait for them, most of the time there is no need to issue a single
system call. By using a shared buffer between the kernel and the user
also has the benefit of minimizing packet copies.
configurable circular buffer mapped in user space that can be used to either
send or receive packets. This way reading packets just needs to wait for them,
most of the time there is no need to issue a single system call. Concerning
transmission, multiple packets can be sent through one system call to get the
highest bandwidth.
By using a shared buffer between the kernel and the user also has the benefit
of minimizing packet copies.
It's fine to use PACKET_MMAP to improve the performance of the capture process,
but it isn't everything. At least, if you are capturing at high speeds (this
is relative to the cpu speed), you should check if the device driver of your
network interface card supports some sort of interrupt load mitigation or
(even better) if it supports NAPI, also make sure it is enabled.
It's fine to use PACKET_MMAP to improve the performance of the capture and
transmission process, but it isn't everything. At least, if you are capturing
at high speeds (this is relative to the cpu speed), you should check if the
device driver of your network interface card supports some sort of interrupt
load mitigation or (even better) if it supports NAPI, also make sure it is
enabled. For transmission, check the MTU (Maximum Transmission Unit) used and
supported by devices of your network.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP
+ How to use CONFIG_PACKET_MMAP to improve capture process
--------------------------------------------------------------------------------
From the user standpoint, you should use the higher level libpcap library, which
@ -57,7 +64,7 @@ the low level details or want to improve libpcap by including PACKET_MMAP
support.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP directly
+ How to use CONFIG_PACKET_MMAP directly to improve capture process
--------------------------------------------------------------------------------
From the system calls stand point, the use of PACKET_MMAP involves
@ -66,6 +73,7 @@ the following process:
[setup] socket() -------> creation of the capture socket
setsockopt() ---> allocation of the circular buffer (ring)
option: PACKET_RX_RING
mmap() ---------> mapping of the allocated buffer to the
user process
@ -96,6 +104,65 @@ Next I will describe PACKET_MMAP settings and it's constraints,
also the mapping of the circular buffer in the user process and
the use of this buffer.
--------------------------------------------------------------------------------
+ How to use CONFIG_PACKET_MMAP directly to improve transmission process
--------------------------------------------------------------------------------
Transmission process is similar to capture as shown below.
[setup] socket() -------> creation of the transmission socket
setsockopt() ---> allocation of the circular buffer (ring)
option: PACKET_TX_RING
bind() ---------> bind transmission socket with a network interface
mmap() ---------> mapping of the allocated buffer to the
user process
[transmission] poll() ---------> wait for free packets (optional)
send() ---------> send all packets that are set as ready in
the ring
The flag MSG_DONTWAIT can be used to return
before end of transfer.
[shutdown] close() --------> destruction of the transmission socket and
deallocation of all associated resources.
Binding the socket to your network interface is mandatory (with zero copy) to
know the header size of frames used in the circular buffer.
As capture, each frame contains two parts:
--------------------
| struct tpacket_hdr | Header. It contains the status of
| | of this frame
|--------------------|
| data buffer |
. . Data that will be sent over the network interface.
. .
--------------------
bind() associates the socket to your network interface thanks to
sll_ifindex parameter of struct sockaddr_ll.
Initialization example:
struct sockaddr_ll my_addr;
struct ifreq s_ifr;
...
strncpy (s_ifr.ifr_name, "eth0", sizeof(s_ifr.ifr_name));
/* get interface index of eth0 */
ioctl(this->socket, SIOCGIFINDEX, &s_ifr);
/* fill sockaddr_ll struct to prepare binding */
my_addr.sll_family = AF_PACKET;
my_addr.sll_protocol = ETH_P_ALL;
my_addr.sll_ifindex = s_ifr.ifr_ifindex;
/* bind socket to eth0 */
bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));
A complete tutorial is available at: http://wiki.gnu-log.net/
--------------------------------------------------------------------------------
+ PACKET_MMAP settings
--------------------------------------------------------------------------------
@ -103,7 +170,10 @@ the use of this buffer.
To setup PACKET_MMAP from user level code is done with a call like
- Capture process
setsockopt(fd, SOL_PACKET, PACKET_RX_RING, (void *) &req, sizeof(req))
- Transmission process
setsockopt(fd, SOL_PACKET, PACKET_TX_RING, (void *) &req, sizeof(req))
The most significant argument in the previous call is the req parameter,
this parameter must to have the following structure:
@ -117,11 +187,11 @@ this parameter must to have the following structure:
};
This structure is defined in /usr/include/linux/if_packet.h and establishes a
circular buffer (ring) of unswappable memory mapped in the capture process.
circular buffer (ring) of unswappable memory.
Being mapped in the capture process allows reading the captured frames and
related meta-information like timestamps without requiring a system call.
Captured frames are grouped in blocks. Each block is a physically contiguous
Frames are grouped in blocks. Each block is a physically contiguous
region of memory and holds tp_block_size/tp_frame_size frames. The total number
of blocks is tp_block_nr. Note that tp_frame_nr is a redundant parameter because
@ -336,6 +406,7 @@ struct tpacket_hdr). If this field is 0 means that the frame is ready
to be used for the kernel, If not, there is a frame the user can read
and the following flags apply:
+++ Capture process:
from include/linux/if_packet.h
#define TP_STATUS_COPY 2
@ -391,6 +462,37 @@ packets are in the ring:
It doesn't incur in a race condition to first check the status value and
then poll for frames.
++ Transmission process
Those defines are also used for transmission:
#define TP_STATUS_AVAILABLE 0 // Frame is available
#define TP_STATUS_SEND_REQUEST 1 // Frame will be sent on next send()
#define TP_STATUS_SENDING 2 // Frame is currently in transmission
#define TP_STATUS_WRONG_FORMAT 4 // Frame format is not correct
First, the kernel initializes all frames to TP_STATUS_AVAILABLE. To send a
packet, the user fills a data buffer of an available frame, sets tp_len to
current data buffer size and sets its status field to TP_STATUS_SEND_REQUEST.
This can be done on multiple frames. Once the user is ready to transmit, it
calls send(). Then all buffers with status equal to TP_STATUS_SEND_REQUEST are
forwarded to the network device. The kernel updates each status of sent
frames with TP_STATUS_SENDING until the end of transfer.
At the end of each transfer, buffer status returns to TP_STATUS_AVAILABLE.
header->tp_len = in_i_size;
header->tp_status = TP_STATUS_SEND_REQUEST;
retval = send(this->socket, NULL, 0, 0);
The user can also use poll() to check if a buffer is available:
(status == TP_STATUS_SENDING)
struct pollfd pfd;
pfd.fd = fd;
pfd.revents = 0;
pfd.events = POLLOUT;
retval = poll(&pfd, 1, timeout);
--------------------------------------------------------------------------------
+ THANKS
--------------------------------------------------------------------------------

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,148 @@
4xx/Axon EMAC ethernet nodes
The EMAC ethernet controller in IBM and AMCC 4xx chips, and also
the Axon bridge. To operate this needs to interact with a ths
special McMAL DMA controller, and sometimes an RGMII or ZMII
interface. In addition to the nodes and properties described
below, the node for the OPB bus on which the EMAC sits must have a
correct clock-frequency property.
i) The EMAC node itself
Required properties:
- device_type : "network"
- compatible : compatible list, contains 2 entries, first is
"ibm,emac-CHIP" where CHIP is the host ASIC (440gx,
405gp, Axon) and second is either "ibm,emac" or
"ibm,emac4". For Axon, thus, we have: "ibm,emac-axon",
"ibm,emac4"
- interrupts : <interrupt mapping for EMAC IRQ and WOL IRQ>
- interrupt-parent : optional, if needed for interrupt mapping
- reg : <registers mapping>
- local-mac-address : 6 bytes, MAC address
- mal-device : phandle of the associated McMAL node
- mal-tx-channel : 1 cell, index of the tx channel on McMAL associated
with this EMAC
- mal-rx-channel : 1 cell, index of the rx channel on McMAL associated
with this EMAC
- cell-index : 1 cell, hardware index of the EMAC cell on a given
ASIC (typically 0x0 and 0x1 for EMAC0 and EMAC1 on
each Axon chip)
- max-frame-size : 1 cell, maximum frame size supported in bytes
- rx-fifo-size : 1 cell, Rx fifo size in bytes for 10 and 100 Mb/sec
operations.
For Axon, 2048
- tx-fifo-size : 1 cell, Tx fifo size in bytes for 10 and 100 Mb/sec
operations.
For Axon, 2048.
- fifo-entry-size : 1 cell, size of a fifo entry (used to calculate
thresholds).
For Axon, 0x00000010
- mal-burst-size : 1 cell, MAL burst size (used to calculate thresholds)
in bytes.
For Axon, 0x00000100 (I think ...)
- phy-mode : string, mode of operations of the PHY interface.
Supported values are: "mii", "rmii", "smii", "rgmii",
"tbi", "gmii", rtbi", "sgmii".
For Axon on CAB, it is "rgmii"
- mdio-device : 1 cell, required iff using shared MDIO registers
(440EP). phandle of the EMAC to use to drive the
MDIO lines for the PHY used by this EMAC.
- zmii-device : 1 cell, required iff connected to a ZMII. phandle of
the ZMII device node
- zmii-channel : 1 cell, required iff connected to a ZMII. Which ZMII
channel or 0xffffffff if ZMII is only used for MDIO.
- rgmii-device : 1 cell, required iff connected to an RGMII. phandle
of the RGMII device node.
For Axon: phandle of plb5/plb4/opb/rgmii
- rgmii-channel : 1 cell, required iff connected to an RGMII. Which
RGMII channel is used by this EMAC.
Fox Axon: present, whatever value is appropriate for each
EMAC, that is the content of the current (bogus) "phy-port"
property.
Optional properties:
- phy-address : 1 cell, optional, MDIO address of the PHY. If absent,
a search is performed.
- phy-map : 1 cell, optional, bitmap of addresses to probe the PHY
for, used if phy-address is absent. bit 0x00000001 is
MDIO address 0.
For Axon it can be absent, though my current driver
doesn't handle phy-address yet so for now, keep
0x00ffffff in it.
- rx-fifo-size-gige : 1 cell, Rx fifo size in bytes for 1000 Mb/sec
operations (if absent the value is the same as
rx-fifo-size). For Axon, either absent or 2048.
- tx-fifo-size-gige : 1 cell, Tx fifo size in bytes for 1000 Mb/sec
operations (if absent the value is the same as
tx-fifo-size). For Axon, either absent or 2048.
- tah-device : 1 cell, optional. If connected to a TAH engine for
offload, phandle of the TAH device node.
- tah-channel : 1 cell, optional. If appropriate, channel used on the
TAH engine.
Example:
EMAC0: ethernet@40000800 {
device_type = "network";
compatible = "ibm,emac-440gp", "ibm,emac";
interrupt-parent = <&UIC1>;
interrupts = <1c 4 1d 4>;
reg = <40000800 70>;
local-mac-address = [00 04 AC E3 1B 1E];
mal-device = <&MAL0>;
mal-tx-channel = <0 1>;
mal-rx-channel = <0>;
cell-index = <0>;
max-frame-size = <5dc>;
rx-fifo-size = <1000>;
tx-fifo-size = <800>;
phy-mode = "rmii";
phy-map = <00000001>;
zmii-device = <&ZMII0>;
zmii-channel = <0>;
};
ii) McMAL node
Required properties:
- device_type : "dma-controller"
- compatible : compatible list, containing 2 entries, first is
"ibm,mcmal-CHIP" where CHIP is the host ASIC (like
emac) and the second is either "ibm,mcmal" or
"ibm,mcmal2".
For Axon, "ibm,mcmal-axon","ibm,mcmal2"
- interrupts : <interrupt mapping for the MAL interrupts sources:
5 sources: tx_eob, rx_eob, serr, txde, rxde>.
For Axon: This is _different_ from the current
firmware. We use the "delayed" interrupts for txeob
and rxeob. Thus we end up with mapping those 5 MPIC
interrupts, all level positive sensitive: 10, 11, 32,
33, 34 (in decimal)
- dcr-reg : < DCR registers range >
- dcr-parent : if needed for dcr-reg
- num-tx-chans : 1 cell, number of Tx channels
- num-rx-chans : 1 cell, number of Rx channels
iii) ZMII node
Required properties:
- compatible : compatible list, containing 2 entries, first is
"ibm,zmii-CHIP" where CHIP is the host ASIC (like
EMAC) and the second is "ibm,zmii".
For Axon, there is no ZMII node.
- reg : <registers mapping>
iv) RGMII node
Required properties:
- compatible : compatible list, containing 2 entries, first is
"ibm,rgmii-CHIP" where CHIP is the host ASIC (like
EMAC) and the second is "ibm,rgmii".
For Axon, "ibm,rgmii-axon","ibm,rgmii"
- reg : <registers mapping>
- revision : as provided by the RGMII new version register if
available.
For Axon: 0x0000012a

View File

@ -0,0 +1,53 @@
Memory mapped SJA1000 CAN controller from NXP (formerly Philips)
Required properties:
- compatible : should be "nxp,sja1000".
- reg : should specify the chip select, address offset and size required
to map the registers of the SJA1000. The size is usually 0x80.
- interrupts: property with a value describing the interrupt source
(number and sensitivity) required for the SJA1000.
Optional properties:
- nxp,external-clock-frequency : Frequency of the external oscillator
clock in Hz. Note that the internal clock frequency used by the
SJA1000 is half of that value. If not specified, a default value
of 16000000 (16 MHz) is used.
- nxp,tx-output-mode : operation mode of the TX output control logic:
<0x0> : bi-phase output mode
<0x1> : normal output mode (default)
<0x2> : test output mode
<0x3> : clock output mode
- nxp,tx-output-config : TX output pin configuration:
<0x01> : TX0 invert
<0x02> : TX0 pull-down (default)
<0x04> : TX0 pull-up
<0x06> : TX0 push-pull
<0x08> : TX1 invert
<0x10> : TX1 pull-down
<0x20> : TX1 pull-up
<0x30> : TX1 push-pull
- nxp,clock-out-frequency : clock frequency in Hz on the CLKOUT pin.
If not specified or if the specified value is 0, the CLKOUT pin
will be disabled.
- nxp,no-comparator-bypass : Allows to disable the CAN input comperator.
For futher information, please have a look to the SJA1000 data sheet.
Examples:
can@3,100 {
compatible = "nxp,sja1000";
reg = <3 0x100 0x80>;
interrupts = <2 0>;
interrupt-parent = <&mpic>;
nxp,external-clock-frequency = <16000000>;
};

View File

@ -0,0 +1,64 @@
=====================================================================
E500 LAW & Coherency Module Device Tree Binding
Copyright (C) 2009 Freescale Semiconductor Inc.
=====================================================================
Local Access Window (LAW) Node
The LAW node represents the region of CCSR space where local access
windows are configured. For ECM based devices this is the first 4k
of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some
number of local access windows as specified by fsl,num-laws.
PROPERTIES
- compatible
Usage: required
Value type: <string>
Definition: Must include "fsl,ecm-law"
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. The value specifies the
physical address offset and length of the CCSR space
registers.
- fsl,num-laws
Usage: required
Value type: <u32>
Definition: The value specifies the number of local access
windows for this device.
=====================================================================
E500 Coherency Module Node
The E500 LAW node represents the region of CCSR space where ECM config
and error reporting registers exist, this is the second 4k (0x1000)
of CCSR space.
PROPERTIES
- compatible
Usage: required
Value type: <string>
Definition: Must include "fsl,CHIP-ecm", "fsl,ecm" where
CHIP is the processor (mpc8572, mpc8544, etc.)
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. The value specifies the
physical address offset and length of the CCSR space
registers.
- interrupts
Usage: required
Value type: <prop-encoded-array>
- interrupt-parent
Usage: required
Value type: <phandle>
=====================================================================

View File

@ -17,6 +17,9 @@ Required properties:
- model : precise model of the QE, Can be "QE", "CPM", or "CPM2"
- reg : offset and length of the device registers.
- bus-frequency : the clock frequency for QUICC Engine.
- fsl,qe-num-riscs: define how many RISC engines the QE has.
- fsl,qe-num-snums: define how many serial number(SNUM) the QE can use for the
threads.
Recommended properties
- brg-frequency : the internal clock source frequency for baud-rate

View File

@ -5,17 +5,18 @@ for MMC, SD, and SDIO types of memory cards.
Required properties:
- compatible : should be
"fsl,<chip>-esdhc", "fsl,mpc8379-esdhc" for MPC83xx processors.
"fsl,<chip>-esdhc", "fsl,mpc8536-esdhc" for MPC85xx processors.
"fsl,<chip>-esdhc", "fsl,esdhc"
- reg : should contain eSDHC registers location and length.
- interrupts : should contain eSDHC interrupt.
- interrupt-parent : interrupt source phandle.
- clock-frequency : specifies eSDHC base clock frequency.
- sdhci,1-bit-only : (optional) specifies that a controller can
only handle 1-bit data transfers.
Example:
sdhci@2e000 {
compatible = "fsl,mpc8378-esdhc", "fsl,mpc8379-esdhc";
compatible = "fsl,mpc8378-esdhc", "fsl,esdhc";
reg = <0x2e000 0x1000>;
interrupts = <42 0x8>;
interrupt-parent = <&ipic>;

View File

@ -0,0 +1,64 @@
=====================================================================
MPX LAW & Coherency Module Device Tree Binding
Copyright (C) 2009 Freescale Semiconductor Inc.
=====================================================================
Local Access Window (LAW) Node
The LAW node represents the region of CCSR space where local access
windows are configured. For MCM based devices this is the first 4k
of CCSR space that includes CCSRBAR, ALTCBAR, ALTCAR, BPTR, and some
number of local access windows as specified by fsl,num-laws.
PROPERTIES
- compatible
Usage: required
Value type: <string>
Definition: Must include "fsl,mcm-law"
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. The value specifies the
physical address offset and length of the CCSR space
registers.
- fsl,num-laws
Usage: required
Value type: <u32>
Definition: The value specifies the number of local access
windows for this device.
=====================================================================
MPX Coherency Module Node
The MPX LAW node represents the region of CCSR space where MCM config
and error reporting registers exist, this is the second 4k (0x1000)
of CCSR space.
PROPERTIES
- compatible
Usage: required
Value type: <string>
Definition: Must include "fsl,CHIP-mcm", "fsl,mcm" where
CHIP is the processor (mpc8641, mpc8610, etc.)
- reg
Usage: required
Value type: <prop-encoded-array>
Definition: A standard property. The value specifies the
physical address offset and length of the CCSR space
registers.
- interrupts
Usage: required
Value type: <prop-encoded-array>
- interrupt-parent
Usage: required
Value type: <phandle>
=====================================================================

View File

@ -0,0 +1,50 @@
Specifying GPIO information for devices
============================================
1) gpios property
-----------------
Nodes that makes use of GPIOs should define them using `gpios' property,
format of which is: <&gpio-controller1-phandle gpio1-specifier
&gpio-controller2-phandle gpio2-specifier
0 /* holes are permitted, means no GPIO 3 */
&gpio-controller4-phandle gpio4-specifier
...>;
Note that gpio-specifier length is controller dependent.
gpio-specifier may encode: bank, pin position inside the bank,
whether pin is open-drain and whether pin is logically inverted.
Example of the node using GPIOs:
node {
gpios = <&qe_pio_e 18 0>;
};
In this example gpio-specifier is "18 0" and encodes GPIO pin number,
and empty GPIO flags as accepted by the "qe_pio_e" gpio-controller.
2) gpio-controller nodes
------------------------
Every GPIO controller node must have #gpio-cells property defined,
this information will be used to translate gpio-specifiers.
Example of two SOC GPIO banks defined as gpio-controller nodes:
qe_pio_a: gpio-controller@1400 {
#gpio-cells = <2>;
compatible = "fsl,qe-pario-bank-a", "fsl,qe-pario-bank";
reg = <0x1400 0x18>;
gpio-controller;
};
qe_pio_e: gpio-controller@1460 {
#gpio-cells = <2>;
compatible = "fsl,qe-pario-bank-e", "fsl,qe-pario-bank";
reg = <0x1460 0x18>;
gpio-controller;
};

View File

@ -16,10 +16,17 @@ LED sub-node properties:
string defining the trigger assigned to the LED. Current triggers are:
"backlight" - LED will act as a back-light, controlled by the framebuffer
system
"default-on" - LED will turn on
"default-on" - LED will turn on, but see "default-state" below
"heartbeat" - LED "double" flashes at a load average based rate
"ide-disk" - LED indicates disk activity
"timer" - LED flashes at a fixed, configurable rate
- default-state: (optional) The initial state of the LED. Valid
values are "on", "off", and "keep". If the LED is already on or off
and the default-state property is set the to same value, then no
glitch should be produced where the LED momentarily turns off (or
on). The "keep" setting will keep the LED at whatever its current
state is, without producing a glitch. The default is off if this
property is not present.
Examples:
@ -30,14 +37,22 @@ leds {
gpios = <&mcu_pio 0 1>; /* Active low */
linux,default-trigger = "ide-disk";
};
fault {
gpios = <&mcu_pio 1 0>;
/* Keep LED on if BIOS detected hardware fault */
default-state = "keep";
};
};
run-control {
compatible = "gpio-leds";
red {
gpios = <&mpc8572 6 0>;
default-state = "off";
};
green {
gpios = <&mpc8572 7 0>;
default-state = "on";
};
}

View File

@ -0,0 +1,19 @@
MDIO on GPIOs
Currently defined compatibles:
- virtual,gpio-mdio
MDC and MDIO lines connected to GPIO controllers are listed in the
gpios property as described in section VIII.1 in the following order:
MDC, MDIO.
Example:
mdio {
compatible = "virtual,mdio-gpio";
#address-cells = <1>;
#size-cells = <0>;
gpios = <&qe_pio_a 11
&qe_pio_c 6>;
};

View File

@ -0,0 +1,521 @@
Marvell Discovery mv64[345]6x System Controller chips
===========================================================
The Marvell mv64[345]60 series of system controller chips contain
many of the peripherals needed to implement a complete computer
system. In this section, we define device tree nodes to describe
the system controller chip itself and each of the peripherals
which it contains. Compatible string values for each node are
prefixed with the string "marvell,", for Marvell Technology Group Ltd.
1) The /system-controller node
This node is used to represent the system-controller and must be
present when the system uses a system controller chip. The top-level
system-controller node contains information that is global to all
devices within the system controller chip. The node name begins
with "system-controller" followed by the unit address, which is
the base address of the memory-mapped register set for the system
controller chip.
Required properties:
- ranges : Describes the translation of system controller addresses
for memory mapped registers.
- clock-frequency: Contains the main clock frequency for the system
controller chip.
- reg : This property defines the address and size of the
memory-mapped registers contained within the system controller
chip. The address specified in the "reg" property should match
the unit address of the system-controller node.
- #address-cells : Address representation for system controller
devices. This field represents the number of cells needed to
represent the address of the memory-mapped registers of devices
within the system controller chip.
- #size-cells : Size representation for for the memory-mapped
registers within the system controller chip.
- #interrupt-cells : Defines the width of cells used to represent
interrupts.
Optional properties:
- model : The specific model of the system controller chip. Such
as, "mv64360", "mv64460", or "mv64560".
- compatible : A string identifying the compatibility identifiers
of the system controller chip.
The system-controller node contains child nodes for each system
controller device that the platform uses. Nodes should not be created
for devices which exist on the system controller chip but are not used
Example Marvell Discovery mv64360 system-controller node:
system-controller@f1000000 { /* Marvell Discovery mv64360 */
#address-cells = <1>;
#size-cells = <1>;
model = "mv64360"; /* Default */
compatible = "marvell,mv64360";
clock-frequency = <133333333>;
reg = <0xf1000000 0x10000>;
virtual-reg = <0xf1000000>;
ranges = <0x88000000 0x88000000 0x1000000 /* PCI 0 I/O Space */
0x80000000 0x80000000 0x8000000 /* PCI 0 MEM Space */
0xa0000000 0xa0000000 0x4000000 /* User FLASH */
0x00000000 0xf1000000 0x0010000 /* Bridge's regs */
0xf2000000 0xf2000000 0x0040000>;/* Integrated SRAM */
[ child node definitions... ]
}
2) Child nodes of /system-controller
a) Marvell Discovery MDIO bus
The MDIO is a bus to which the PHY devices are connected. For each
device that exists on this bus, a child node should be created. See
the definition of the PHY node below for an example of how to define
a PHY.
Required properties:
- #address-cells : Should be <1>
- #size-cells : Should be <0>
- device_type : Should be "mdio"
- compatible : Should be "marvell,mv64360-mdio"
Example:
mdio {
#address-cells = <1>;
#size-cells = <0>;
device_type = "mdio";
compatible = "marvell,mv64360-mdio";
ethernet-phy@0 {
......
};
};
b) Marvell Discovery ethernet controller
The Discover ethernet controller is described with two levels
of nodes. The first level describes an ethernet silicon block
and the second level describes up to 3 ethernet nodes within
that block. The reason for the multiple levels is that the
registers for the node are interleaved within a single set
of registers. The "ethernet-block" level describes the
shared register set, and the "ethernet" nodes describe ethernet
port-specific properties.
Ethernet block node
Required properties:
- #address-cells : <1>
- #size-cells : <0>
- compatible : "marvell,mv64360-eth-block"
- reg : Offset and length of the register set for this block
Example Discovery Ethernet block node:
ethernet-block@2000 {
#address-cells = <1>;
#size-cells = <0>;
compatible = "marvell,mv64360-eth-block";
reg = <0x2000 0x2000>;
ethernet@0 {
.......
};
};
Ethernet port node
Required properties:
- device_type : Should be "network".
- compatible : Should be "marvell,mv64360-eth".
- reg : Should be <0>, <1>, or <2>, according to which registers
within the silicon block the device uses.
- interrupts : <a> where a is the interrupt number for the port.
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
- phy : the phandle for the PHY connected to this ethernet
controller.
- local-mac-address : 6 bytes, MAC address
Example Discovery Ethernet port node:
ethernet@0 {
device_type = "network";
compatible = "marvell,mv64360-eth";
reg = <0>;
interrupts = <32>;
interrupt-parent = <&PIC>;
phy = <&PHY0>;
local-mac-address = [ 00 00 00 00 00 00 ];
};
c) Marvell Discovery PHY nodes
Required properties:
- device_type : Should be "ethernet-phy"
- interrupts : <a> where a is the interrupt number for this phy.
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
- reg : The ID number for the phy, usually a small integer
Example Discovery PHY node:
ethernet-phy@1 {
device_type = "ethernet-phy";
compatible = "broadcom,bcm5421";
interrupts = <76>; /* GPP 12 */
interrupt-parent = <&PIC>;
reg = <1>;
};
d) Marvell Discovery SDMA nodes
Represent DMA hardware associated with the MPSC (multiprotocol
serial controllers).
Required properties:
- compatible : "marvell,mv64360-sdma"
- reg : Offset and length of the register set for this device
- interrupts : <a> where a is the interrupt number for the DMA
device.
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery SDMA node:
sdma@4000 {
compatible = "marvell,mv64360-sdma";
reg = <0x4000 0xc18>;
virtual-reg = <0xf1004000>;
interrupts = <36>;
interrupt-parent = <&PIC>;
};
e) Marvell Discovery BRG nodes
Represent baud rate generator hardware associated with the MPSC
(multiprotocol serial controllers).
Required properties:
- compatible : "marvell,mv64360-brg"
- reg : Offset and length of the register set for this device
- clock-src : A value from 0 to 15 which selects the clock
source for the baud rate generator. This value corresponds
to the CLKS value in the BRGx configuration register. See
the mv64x60 User's Manual.
- clock-frequence : The frequency (in Hz) of the baud rate
generator's input clock.
- current-speed : The current speed setting (presumably by
firmware) of the baud rate generator.
Example Discovery BRG node:
brg@b200 {
compatible = "marvell,mv64360-brg";
reg = <0xb200 0x8>;
clock-src = <8>;
clock-frequency = <133333333>;
current-speed = <9600>;
};
f) Marvell Discovery CUNIT nodes
Represent the Serial Communications Unit device hardware.
Required properties:
- reg : Offset and length of the register set for this device
Example Discovery CUNIT node:
cunit@f200 {
reg = <0xf200 0x200>;
};
g) Marvell Discovery MPSCROUTING nodes
Represent the Discovery's MPSC routing hardware
Required properties:
- reg : Offset and length of the register set for this device
Example Discovery CUNIT node:
mpscrouting@b500 {
reg = <0xb400 0xc>;
};
h) Marvell Discovery MPSCINTR nodes
Represent the Discovery's MPSC DMA interrupt hardware registers
(SDMA cause and mask registers).
Required properties:
- reg : Offset and length of the register set for this device
Example Discovery MPSCINTR node:
mpsintr@b800 {
reg = <0xb800 0x100>;
};
i) Marvell Discovery MPSC nodes
Represent the Discovery's MPSC (Multiprotocol Serial Controller)
serial port.
Required properties:
- device_type : "serial"
- compatible : "marvell,mv64360-mpsc"
- reg : Offset and length of the register set for this device
- sdma : the phandle for the SDMA node used by this port
- brg : the phandle for the BRG node used by this port
- cunit : the phandle for the CUNIT node used by this port
- mpscrouting : the phandle for the MPSCROUTING node used by this port
- mpscintr : the phandle for the MPSCINTR node used by this port
- cell-index : the hardware index of this cell in the MPSC core
- max_idle : value needed for MPSC CHR3 (Maximum Frame Length)
register
- interrupts : <a> where a is the interrupt number for the MPSC.
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery MPSCINTR node:
mpsc@8000 {
device_type = "serial";
compatible = "marvell,mv64360-mpsc";
reg = <0x8000 0x38>;
virtual-reg = <0xf1008000>;
sdma = <&SDMA0>;
brg = <&BRG0>;
cunit = <&CUNIT>;
mpscrouting = <&MPSCROUTING>;
mpscintr = <&MPSCINTR>;
cell-index = <0>;
max_idle = <40>;
interrupts = <40>;
interrupt-parent = <&PIC>;
};
j) Marvell Discovery Watch Dog Timer nodes
Represent the Discovery's watchdog timer hardware
Required properties:
- compatible : "marvell,mv64360-wdt"
- reg : Offset and length of the register set for this device
Example Discovery Watch Dog Timer node:
wdt@b410 {
compatible = "marvell,mv64360-wdt";
reg = <0xb410 0x8>;
};
k) Marvell Discovery I2C nodes
Represent the Discovery's I2C hardware
Required properties:
- device_type : "i2c"
- compatible : "marvell,mv64360-i2c"
- reg : Offset and length of the register set for this device
- interrupts : <a> where a is the interrupt number for the I2C.
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery I2C node:
compatible = "marvell,mv64360-i2c";
reg = <0xc000 0x20>;
virtual-reg = <0xf100c000>;
interrupts = <37>;
interrupt-parent = <&PIC>;
};
l) Marvell Discovery PIC (Programmable Interrupt Controller) nodes
Represent the Discovery's PIC hardware
Required properties:
- #interrupt-cells : <1>
- #address-cells : <0>
- compatible : "marvell,mv64360-pic"
- reg : Offset and length of the register set for this device
- interrupt-controller
Example Discovery PIC node:
pic {
#interrupt-cells = <1>;
#address-cells = <0>;
compatible = "marvell,mv64360-pic";
reg = <0x0 0x88>;
interrupt-controller;
};
m) Marvell Discovery MPP (Multipurpose Pins) multiplexing nodes
Represent the Discovery's MPP hardware
Required properties:
- compatible : "marvell,mv64360-mpp"
- reg : Offset and length of the register set for this device
Example Discovery MPP node:
mpp@f000 {
compatible = "marvell,mv64360-mpp";
reg = <0xf000 0x10>;
};
n) Marvell Discovery GPP (General Purpose Pins) nodes
Represent the Discovery's GPP hardware
Required properties:
- compatible : "marvell,mv64360-gpp"
- reg : Offset and length of the register set for this device
Example Discovery GPP node:
gpp@f000 {
compatible = "marvell,mv64360-gpp";
reg = <0xf100 0x20>;
};
o) Marvell Discovery PCI host bridge node
Represents the Discovery's PCI host bridge device. The properties
for this node conform to Rev 2.1 of the PCI Bus Binding to IEEE
1275-1994. A typical value for the compatible property is
"marvell,mv64360-pci".
Example Discovery PCI host bridge node
pci@80000000 {
#address-cells = <3>;
#size-cells = <2>;
#interrupt-cells = <1>;
device_type = "pci";
compatible = "marvell,mv64360-pci";
reg = <0xcf8 0x8>;
ranges = <0x01000000 0x0 0x0
0x88000000 0x0 0x01000000
0x02000000 0x0 0x80000000
0x80000000 0x0 0x08000000>;
bus-range = <0 255>;
clock-frequency = <66000000>;
interrupt-parent = <&PIC>;
interrupt-map-mask = <0xf800 0x0 0x0 0x7>;
interrupt-map = <
/* IDSEL 0x0a */
0x5000 0 0 1 &PIC 80
0x5000 0 0 2 &PIC 81
0x5000 0 0 3 &PIC 91
0x5000 0 0 4 &PIC 93
/* IDSEL 0x0b */
0x5800 0 0 1 &PIC 91
0x5800 0 0 2 &PIC 93
0x5800 0 0 3 &PIC 80
0x5800 0 0 4 &PIC 81
/* IDSEL 0x0c */
0x6000 0 0 1 &PIC 91
0x6000 0 0 2 &PIC 93
0x6000 0 0 3 &PIC 80
0x6000 0 0 4 &PIC 81
/* IDSEL 0x0d */
0x6800 0 0 1 &PIC 93
0x6800 0 0 2 &PIC 80
0x6800 0 0 3 &PIC 81
0x6800 0 0 4 &PIC 91
>;
};
p) Marvell Discovery CPU Error nodes
Represent the Discovery's CPU error handler device.
Required properties:
- compatible : "marvell,mv64360-cpu-error"
- reg : Offset and length of the register set for this device
- interrupts : the interrupt number for this device
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery CPU Error node:
cpu-error@0070 {
compatible = "marvell,mv64360-cpu-error";
reg = <0x70 0x10 0x128 0x28>;
interrupts = <3>;
interrupt-parent = <&PIC>;
};
q) Marvell Discovery SRAM Controller nodes
Represent the Discovery's SRAM controller device.
Required properties:
- compatible : "marvell,mv64360-sram-ctrl"
- reg : Offset and length of the register set for this device
- interrupts : the interrupt number for this device
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery SRAM Controller node:
sram-ctrl@0380 {
compatible = "marvell,mv64360-sram-ctrl";
reg = <0x380 0x80>;
interrupts = <13>;
interrupt-parent = <&PIC>;
};
r) Marvell Discovery PCI Error Handler nodes
Represent the Discovery's PCI error handler device.
Required properties:
- compatible : "marvell,mv64360-pci-error"
- reg : Offset and length of the register set for this device
- interrupts : the interrupt number for this device
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery PCI Error Handler node:
pci-error@1d40 {
compatible = "marvell,mv64360-pci-error";
reg = <0x1d40 0x40 0xc28 0x4>;
interrupts = <12>;
interrupt-parent = <&PIC>;
};
s) Marvell Discovery Memory Controller nodes
Represent the Discovery's memory controller device.
Required properties:
- compatible : "marvell,mv64360-mem-ctrl"
- reg : Offset and length of the register set for this device
- interrupts : the interrupt number for this device
- interrupt-parent : the phandle for the interrupt controller
that services interrupts for this device.
Example Discovery Memory Controller node:
mem-ctrl@1400 {
compatible = "marvell,mv64360-mem-ctrl";
reg = <0x1400 0x60>;
interrupts = <17>;
interrupt-parent = <&PIC>;
};

View File

@ -0,0 +1,25 @@
PHY nodes
Required properties:
- device_type : Should be "ethernet-phy"
- interrupts : <a b> where a is the interrupt number and b is a
field that represents an encoding of the sense and level
information for the interrupt. This should be encoded based on
the information in section 2) depending on the type of interrupt
controller you have.
- interrupt-parent : the phandle for the interrupt controller that
services interrupts for this device.
- reg : The ID number for the phy, usually a small integer
- linux,phandle : phandle for this node; likely referenced by an
ethernet controller node.
Example:
ethernet-phy@0 {
linux,phandle = <2452000>
interrupt-parent = <40000>;
interrupts = <35 1>;
reg = <0>;
device_type = "ethernet-phy";
};

View File

@ -0,0 +1,57 @@
SPI (Serial Peripheral Interface) busses
SPI busses can be described with a node for the SPI master device
and a set of child nodes for each SPI slave on the bus. For this
discussion, it is assumed that the system's SPI controller is in
SPI master mode. This binding does not describe SPI controllers
in slave mode.
The SPI master node requires the following properties:
- #address-cells - number of cells required to define a chip select
address on the SPI bus.
- #size-cells - should be zero.
- compatible - name of SPI bus controller following generic names
recommended practice.
No other properties are required in the SPI bus node. It is assumed
that a driver for an SPI bus device will understand that it is an SPI bus.
However, the binding does not attempt to define the specific method for
assigning chip select numbers. Since SPI chip select configuration is
flexible and non-standardized, it is left out of this binding with the
assumption that board specific platform code will be used to manage
chip selects. Individual drivers can define additional properties to
support describing the chip select layout.
SPI slave nodes must be children of the SPI master node and can
contain the following properties.
- reg - (required) chip select address of device.
- compatible - (required) name of SPI device following generic names
recommended practice
- spi-max-frequency - (required) Maximum SPI clocking speed of device in Hz
- spi-cpol - (optional) Empty property indicating device requires
inverse clock polarity (CPOL) mode
- spi-cpha - (optional) Empty property indicating device requires
shifted clock phase (CPHA) mode
- spi-cs-high - (optional) Empty property indicating device requires
chip select active high
SPI example for an MPC5200 SPI bus:
spi@f00 {
#address-cells = <1>;
#size-cells = <0>;
compatible = "fsl,mpc5200b-spi","fsl,mpc5200-spi";
reg = <0xf00 0x20>;
interrupts = <2 13 0 2 14 0>;
interrupt-parent = <&mpc5200_pic>;
ethernet-switch@0 {
compatible = "micrel,ks8995m";
spi-max-frequency = <1000000>;
reg = <0>;
};
codec@1 {
compatible = "ti,tlv320aic26";
spi-max-frequency = <100000>;
reg = <1>;
};
};

View File

@ -0,0 +1,25 @@
USB EHCI controllers
Required properties:
- compatible : should be "usb-ehci".
- reg : should contain at least address and length of the standard EHCI
register set for the device. Optional platform-dependent registers
(debug-port or other) can be also specified here, but only after
definition of standard EHCI registers.
- interrupts : one EHCI interrupt should be described here.
If device registers are implemented in big endian mode, the device
node should have "big-endian-regs" property.
If controller implementation operates with big endian descriptors,
"big-endian-desc" property should be specified.
If both big endian registers and descriptors are used by the controller
implementation, "big-endian" property can be specified instead of having
both "big-endian-regs" and "big-endian-desc".
Example (Sequoia 440EPx):
ehci@e0000300 {
compatible = "ibm,usb-ehci-440epx", "usb-ehci";
interrupt-parent = <&UIC0>;
interrupts = <1a 4>;
reg = <0 e0000300 90 0 e0000390 70>;
big-endian;
};

View File

@ -0,0 +1,295 @@
d) Xilinx IP cores
The Xilinx EDK toolchain ships with a set of IP cores (devices) for use
in Xilinx Spartan and Virtex FPGAs. The devices cover the whole range
of standard device types (network, serial, etc.) and miscellaneous
devices (gpio, LCD, spi, etc). Also, since these devices are
implemented within the fpga fabric every instance of the device can be
synthesised with different options that change the behaviour.
Each IP-core has a set of parameters which the FPGA designer can use to
control how the core is synthesized. Historically, the EDK tool would
extract the device parameters relevant to device drivers and copy them
into an 'xparameters.h' in the form of #define symbols. This tells the
device drivers how the IP cores are configured, but it requres the kernel
to be recompiled every time the FPGA bitstream is resynthesized.
The new approach is to export the parameters into the device tree and
generate a new device tree each time the FPGA bitstream changes. The
parameters which used to be exported as #defines will now become
properties of the device node. In general, device nodes for IP-cores
will take the following form:
(name): (generic-name)@(base-address) {
compatible = "xlnx,(ip-core-name)-(HW_VER)"
[, (list of compatible devices), ...];
reg = <(baseaddr) (size)>;
interrupt-parent = <&interrupt-controller-phandle>;
interrupts = < ... >;
xlnx,(parameter1) = "(string-value)";
xlnx,(parameter2) = <(int-value)>;
};
(generic-name): an open firmware-style name that describes the
generic class of device. Preferably, this is one word, such
as 'serial' or 'ethernet'.
(ip-core-name): the name of the ip block (given after the BEGIN
directive in system.mhs). Should be in lowercase
and all underscores '_' converted to dashes '-'.
(name): is derived from the "PARAMETER INSTANCE" value.
(parameter#): C_* parameters from system.mhs. The C_ prefix is
dropped from the parameter name, the name is converted
to lowercase and all underscore '_' characters are
converted to dashes '-'.
(baseaddr): the baseaddr parameter value (often named C_BASEADDR).
(HW_VER): from the HW_VER parameter.
(size): the address range size (often C_HIGHADDR - C_BASEADDR + 1).
Typically, the compatible list will include the exact IP core version
followed by an older IP core version which implements the same
interface or any other device with the same interface.
'reg', 'interrupt-parent' and 'interrupts' are all optional properties.
For example, the following block from system.mhs:
BEGIN opb_uartlite
PARAMETER INSTANCE = opb_uartlite_0
PARAMETER HW_VER = 1.00.b
PARAMETER C_BAUDRATE = 115200
PARAMETER C_DATA_BITS = 8
PARAMETER C_ODD_PARITY = 0
PARAMETER C_USE_PARITY = 0
PARAMETER C_CLK_FREQ = 50000000
PARAMETER C_BASEADDR = 0xEC100000
PARAMETER C_HIGHADDR = 0xEC10FFFF
BUS_INTERFACE SOPB = opb_7
PORT OPB_Clk = CLK_50MHz
PORT Interrupt = opb_uartlite_0_Interrupt
PORT RX = opb_uartlite_0_RX
PORT TX = opb_uartlite_0_TX
PORT OPB_Rst = sys_bus_reset_0
END
becomes the following device tree node:
opb_uartlite_0: serial@ec100000 {
device_type = "serial";
compatible = "xlnx,opb-uartlite-1.00.b";
reg = <ec100000 10000>;
interrupt-parent = <&opb_intc_0>;
interrupts = <1 0>; // got this from the opb_intc parameters
current-speed = <d#115200>; // standard serial device prop
clock-frequency = <d#50000000>; // standard serial device prop
xlnx,data-bits = <8>;
xlnx,odd-parity = <0>;
xlnx,use-parity = <0>;
};
Some IP cores actually implement 2 or more logical devices. In
this case, the device should still describe the whole IP core with
a single node and add a child node for each logical device. The
ranges property can be used to translate from parent IP-core to the
registers of each device. In addition, the parent node should be
compatible with the bus type 'xlnx,compound', and should contain
#address-cells and #size-cells, as with any other bus. (Note: this
makes the assumption that both logical devices have the same bus
binding. If this is not true, then separate nodes should be used
for each logical device). The 'cell-index' property can be used to
enumerate logical devices within an IP core. For example, the
following is the system.mhs entry for the dual ps2 controller found
on the ml403 reference design.
BEGIN opb_ps2_dual_ref
PARAMETER INSTANCE = opb_ps2_dual_ref_0
PARAMETER HW_VER = 1.00.a
PARAMETER C_BASEADDR = 0xA9000000
PARAMETER C_HIGHADDR = 0xA9001FFF
BUS_INTERFACE SOPB = opb_v20_0
PORT Sys_Intr1 = ps2_1_intr
PORT Sys_Intr2 = ps2_2_intr
PORT Clkin1 = ps2_clk_rx_1
PORT Clkin2 = ps2_clk_rx_2
PORT Clkpd1 = ps2_clk_tx_1
PORT Clkpd2 = ps2_clk_tx_2
PORT Rx1 = ps2_d_rx_1
PORT Rx2 = ps2_d_rx_2
PORT Txpd1 = ps2_d_tx_1
PORT Txpd2 = ps2_d_tx_2
END
It would result in the following device tree nodes:
opb_ps2_dual_ref_0: opb-ps2-dual-ref@a9000000 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "xlnx,compound";
ranges = <0 a9000000 2000>;
// If this device had extra parameters, then they would
// go here.
ps2@0 {
compatible = "xlnx,opb-ps2-dual-ref-1.00.a";
reg = <0 40>;
interrupt-parent = <&opb_intc_0>;
interrupts = <3 0>;
cell-index = <0>;
};
ps2@1000 {
compatible = "xlnx,opb-ps2-dual-ref-1.00.a";
reg = <1000 40>;
interrupt-parent = <&opb_intc_0>;
interrupts = <3 0>;
cell-index = <0>;
};
};
Also, the system.mhs file defines bus attachments from the processor
to the devices. The device tree structure should reflect the bus
attachments. Again an example; this system.mhs fragment:
BEGIN ppc405_virtex4
PARAMETER INSTANCE = ppc405_0
PARAMETER HW_VER = 1.01.a
BUS_INTERFACE DPLB = plb_v34_0
BUS_INTERFACE IPLB = plb_v34_0
END
BEGIN opb_intc
PARAMETER INSTANCE = opb_intc_0
PARAMETER HW_VER = 1.00.c
PARAMETER C_BASEADDR = 0xD1000FC0
PARAMETER C_HIGHADDR = 0xD1000FDF
BUS_INTERFACE SOPB = opb_v20_0
END
BEGIN opb_uart16550
PARAMETER INSTANCE = opb_uart16550_0
PARAMETER HW_VER = 1.00.d
PARAMETER C_BASEADDR = 0xa0000000
PARAMETER C_HIGHADDR = 0xa0001FFF
BUS_INTERFACE SOPB = opb_v20_0
END
BEGIN plb_v34
PARAMETER INSTANCE = plb_v34_0
PARAMETER HW_VER = 1.02.a
END
BEGIN plb_bram_if_cntlr
PARAMETER INSTANCE = plb_bram_if_cntlr_0
PARAMETER HW_VER = 1.00.b
PARAMETER C_BASEADDR = 0xFFFF0000
PARAMETER C_HIGHADDR = 0xFFFFFFFF
BUS_INTERFACE SPLB = plb_v34_0
END
BEGIN plb2opb_bridge
PARAMETER INSTANCE = plb2opb_bridge_0
PARAMETER HW_VER = 1.01.a
PARAMETER C_RNG0_BASEADDR = 0x20000000
PARAMETER C_RNG0_HIGHADDR = 0x3FFFFFFF
PARAMETER C_RNG1_BASEADDR = 0x60000000
PARAMETER C_RNG1_HIGHADDR = 0x7FFFFFFF
PARAMETER C_RNG2_BASEADDR = 0x80000000
PARAMETER C_RNG2_HIGHADDR = 0xBFFFFFFF
PARAMETER C_RNG3_BASEADDR = 0xC0000000
PARAMETER C_RNG3_HIGHADDR = 0xDFFFFFFF
BUS_INTERFACE SPLB = plb_v34_0
BUS_INTERFACE MOPB = opb_v20_0
END
Gives this device tree (some properties removed for clarity):
plb@0 {
#address-cells = <1>;
#size-cells = <1>;
compatible = "xlnx,plb-v34-1.02.a";
device_type = "ibm,plb";
ranges; // 1:1 translation
plb_bram_if_cntrl_0: bram@ffff0000 {
reg = <ffff0000 10000>;
}
opb@20000000 {
#address-cells = <1>;
#size-cells = <1>;
ranges = <20000000 20000000 20000000
60000000 60000000 20000000
80000000 80000000 40000000
c0000000 c0000000 20000000>;
opb_uart16550_0: serial@a0000000 {
reg = <a00000000 2000>;
};
opb_intc_0: interrupt-controller@d1000fc0 {
reg = <d1000fc0 20>;
};
};
};
That covers the general approach to binding xilinx IP cores into the
device tree. The following are bindings for specific devices:
i) Xilinx ML300 Framebuffer
Simple framebuffer device from the ML300 reference design (also on the
ML403 reference design as well as others).
Optional properties:
- resolution = <xres yres> : pixel resolution of framebuffer. Some
implementations use a different resolution.
Default is <d#640 d#480>
- virt-resolution = <xvirt yvirt> : Size of framebuffer in memory.
Default is <d#1024 d#480>.
- rotate-display (empty) : rotate display 180 degrees.
ii) Xilinx SystemACE
The Xilinx SystemACE device is used to program FPGAs from an FPGA
bitstream stored on a CF card. It can also be used as a generic CF
interface device.
Optional properties:
- 8-bit (empty) : Set this property for SystemACE in 8 bit mode
iii) Xilinx EMAC and Xilinx TEMAC
Xilinx Ethernet devices. In addition to general xilinx properties
listed above, nodes for these devices should include a phy-handle
property, and may include other common network device properties
like local-mac-address.
iv) Xilinx Uartlite
Xilinx uartlite devices are simple fixed speed serial ports.
Required properties:
- current-speed : Baud rate of uartlite
v) Xilinx hwicap
Xilinx hwicap devices provide access to the configuration logic
of the FPGA through the Internal Configuration Access Port
(ICAP). The ICAP enables partial reconfiguration of the FPGA,
readback of the configuration information, and some control over
'warm boots' of the FPGA fabric.
Required properties:
- xlnx,family : The family of the FPGA, necessary since the
capabilities of the underlying ICAP hardware
differ between different families. May be
'virtex2p', 'virtex4', or 'virtex5'.
vi) Xilinx Uart 16550
Xilinx UART 16550 devices are very similar to the NS16550 but with
different register spacing and an offset from the base address.
Required properties:
- clock-frequency : Frequency of the clock input
- reg-offset : A value of 3 is required
- reg-shift : A value of 2 is required

172
Documentation/pps/pps.txt Normal file
View File

@ -0,0 +1,172 @@
PPS - Pulse Per Second
----------------------
(C) Copyright 2007 Rodolfo Giometti <giometti@enneenne.com>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
Overview
--------
LinuxPPS provides a programming interface (API) to define in the
system several PPS sources.
PPS means "pulse per second" and a PPS source is just a device which
provides a high precision signal each second so that an application
can use it to adjust system clock time.
A PPS source can be connected to a serial port (usually to the Data
Carrier Detect pin) or to a parallel port (ACK-pin) or to a special
CPU's GPIOs (this is the common case in embedded systems) but in each
case when a new pulse arrives the system must apply to it a timestamp
and record it for userland.
Common use is the combination of the NTPD as userland program, with a
GPS receiver as PPS source, to obtain a wallclock-time with
sub-millisecond synchronisation to UTC.
RFC considerations
------------------
While implementing a PPS API as RFC 2783 defines and using an embedded
CPU GPIO-Pin as physical link to the signal, I encountered a deeper
problem:
At startup it needs a file descriptor as argument for the function
time_pps_create().
This implies that the source has a /dev/... entry. This assumption is
ok for the serial and parallel port, where you can do something
useful besides(!) the gathering of timestamps as it is the central
task for a PPS-API. But this assumption does not work for a single
purpose GPIO line. In this case even basic file-related functionality
(like read() and write()) makes no sense at all and should not be a
precondition for the use of a PPS-API.
The problem can be simply solved if you consider that a PPS source is
not always connected with a GPS data source.
So your programs should check if the GPS data source (the serial port
for instance) is a PPS source too, and if not they should provide the
possibility to open another device as PPS source.
In LinuxPPS the PPS sources are simply char devices usually mapped
into files /dev/pps0, /dev/pps1, etc..
Coding example
--------------
To register a PPS source into the kernel you should define a struct
pps_source_info_s as follows:
static struct pps_source_info pps_ktimer_info = {
.name = "ktimer",
.path = "",
.mode = PPS_CAPTUREASSERT | PPS_OFFSETASSERT | \
PPS_ECHOASSERT | \
PPS_CANWAIT | PPS_TSFMT_TSPEC,
.echo = pps_ktimer_echo,
.owner = THIS_MODULE,
};
and then calling the function pps_register_source() in your
intialization routine as follows:
source = pps_register_source(&pps_ktimer_info,
PPS_CAPTUREASSERT | PPS_OFFSETASSERT);
The pps_register_source() prototype is:
int pps_register_source(struct pps_source_info_s *info, int default_params)
where "info" is a pointer to a structure that describes a particular
PPS source, "default_params" tells the system what the initial default
parameters for the device should be (it is obvious that these parameters
must be a subset of ones defined in the struct
pps_source_info_s which describe the capabilities of the driver).
Once you have registered a new PPS source into the system you can
signal an assert event (for example in the interrupt handler routine)
just using:
pps_event(source, &ts, PPS_CAPTUREASSERT, ptr)
where "ts" is the event's timestamp.
The same function may also run the defined echo function
(pps_ktimer_echo(), passing to it the "ptr" pointer) if the user
asked for that... etc..
Please see the file drivers/pps/clients/ktimer.c for example code.
SYSFS support
-------------
If the SYSFS filesystem is enabled in the kernel it provides a new class:
$ ls /sys/class/pps/
pps0/ pps1/ pps2/
Every directory is the ID of a PPS sources defined in the system and
inside you find several files:
$ ls /sys/class/pps/pps0/
assert clear echo mode name path subsystem@ uevent
Inside each "assert" and "clear" file you can find the timestamp and a
sequence number:
$ cat /sys/class/pps/pps0/assert
1170026870.983207967#8
Where before the "#" is the timestamp in seconds; after it is the
sequence number. Other files are:
* echo: reports if the PPS source has an echo function or not;
* mode: reports available PPS functioning modes;
* name: reports the PPS source's name;
* path: reports the PPS source's device path, that is the device the
PPS source is connected to (if it exists).
Testing the PPS support
-----------------------
In order to test the PPS support even without specific hardware you can use
the ktimer driver (see the client subsection in the PPS configuration menu)
and the userland tools provided into Documentaion/pps/ directory.
Once you have enabled the compilation of ktimer just modprobe it (if
not statically compiled):
# modprobe ktimer
and the run ppstest as follow:
$ ./ppstest /dev/pps0
trying PPS source "/dev/pps1"
found PPS source "/dev/pps1"
ok, found 1 source(s), now start fetching data...
source 0 - assert 1186592699.388832443, sequence: 364 - clear 0.000000000, sequence: 0
source 0 - assert 1186592700.388931295, sequence: 365 - clear 0.000000000, sequence: 0
source 0 - assert 1186592701.389032765, sequence: 366 - clear 0.000000000, sequence: 0
Please, note that to compile userland programs you need the file timepps.h
(see Documentation/pps/).

View File

@ -1,575 +1,139 @@
rfkill - RF switch subsystem support
====================================
rfkill - RF kill switch support
===============================
1 Introduction
2 Implementation details
3 Kernel driver guidelines
3.1 wireless device drivers
3.2 platform/switch drivers
3.3 input device drivers
4 Kernel API
5 Userspace support
1. Introduction
2. Implementation details
3. Kernel API
4. Userspace support
1. Introduction:
1. Introduction
The rfkill switch subsystem exists to add a generic interface to circuitry that
can enable or disable the signal output of a wireless *transmitter* of any
type. By far, the most common use is to disable radio-frequency transmitters.
The rfkill subsystem provides a generic interface to disabling any radio
transmitter in the system. When a transmitter is blocked, it shall not
radiate any power.
Note that disabling the signal output means that the the transmitter is to be
made to not emit any energy when "blocked". rfkill is not about blocking data
transmissions, it is about blocking energy emission.
The subsystem also provides the ability to react on button presses and
disable all transmitters of a certain type (or all). This is intended for
situations where transmitters need to be turned off, for example on
aircraft.
The rfkill subsystem offers support for keys and switches often found on
laptops to enable wireless devices like WiFi and Bluetooth, so that these keys
and switches actually perform an action in all wireless devices of a given type
attached to the system.
The rfkill subsystem has a concept of "hard" and "soft" block, which
differ little in their meaning (block == transmitters off) but rather in
whether they can be changed or not:
- hard block: read-only radio block that cannot be overriden by software
- soft block: writable radio block (need not be readable) that is set by
the system software.
The buttons to enable and disable the wireless transmitters are important in
situations where the user is for example using his laptop on a location where
radio-frequency transmitters _must_ be disabled (e.g. airplanes).
Because of this requirement, userspace support for the keys should not be made
mandatory. Because userspace might want to perform some additional smarter
tasks when the key is pressed, rfkill provides userspace the possibility to
take over the task to handle the key events.
2. Implementation details
The rfkill subsystem is composed of three main components:
* the rfkill core,
* the deprecated rfkill-input module (an input layer handler, being
replaced by userspace policy code) and
* the rfkill drivers.
The rfkill core provides API for kernel drivers to register their radio
transmitter with the kernel, methods for turning it on and off and, letting
the system know about hardware-disabled states that may be implemented on
the device.
The rfkill core code also notifies userspace of state changes, and provides
ways for userspace to query the current states. See the "Userspace support"
section below.
When the device is hard-blocked (either by a call to rfkill_set_hw_state()
or from query_hw_block) set_block() will be invoked for additional software
block, but drivers can ignore the method call since they can use the return
value of the function rfkill_set_hw_state() to sync the software state
instead of keeping track of calls to set_block(). In fact, drivers should
use the return value of rfkill_set_hw_state() unless the hardware actually
keeps track of soft and hard block separately.
3. Kernel API
Drivers for radio transmitters normally implement an rfkill driver.
Platform drivers might implement input devices if the rfkill button is just
that, a button. If that button influences the hardware then you need to
implement an rfkill driver instead. This also applies if the platform provides
a way to turn on/off the transmitter(s).
For some platforms, it is possible that the hardware state changes during
suspend/hibernation, in which case it will be necessary to update the rfkill
core with the current state is at resume time.
To create an rfkill driver, driver's Kconfig needs to have
depends on RFKILL || !RFKILL
to ensure the driver cannot be built-in when rfkill is modular. The !RFKILL
case allows the driver to be built when rfkill is not configured, which which
case all rfkill API can still be used but will be provided by static inlines
which compile to almost nothing.
Calling rfkill_set_hw_state() when a state change happens is required from
rfkill drivers that control devices that can be hard-blocked unless they also
assign the poll_hw_block() callback (then the rfkill core will poll the
device). Don't do this unless you cannot get the event in any other way.
5. Userspace support
The recommended userspace interface to use is /dev/rfkill, which is a misc
character device that allows userspace to obtain and set the state of rfkill
devices and sets of devices. It also notifies userspace about device addition
and removal. The API is a simple read/write API that is defined in
linux/rfkill.h, with one ioctl that allows turning off the deprecated input
handler in the kernel for the transition period.
Except for the one ioctl, communication with the kernel is done via read()
and write() of instances of 'struct rfkill_event'. In this structure, the
soft and hard block are properly separated (unlike sysfs, see below) and
userspace is able to get a consistent snapshot of all rfkill devices in the
system. Also, it is possible to switch all rfkill drivers (or all drivers of
a specified type) into a state which also updates the default state for
hotplugged devices.
After an application opens /dev/rfkill, it can read the current state of
all devices, and afterwards can poll the descriptor for hotplug or state
change events.
Applications must ignore operations (the "op" field) they do not handle,
this allows the API to be extended in the future.
===============================================================================
2: Implementation details
The rfkill subsystem is composed of various components: the rfkill class, the
rfkill-input module (an input layer handler), and some specific input layer
events.
The rfkill class provides kernel drivers with an interface that allows them to
know when they should enable or disable a wireless network device transmitter.
This is enabled by the CONFIG_RFKILL Kconfig option.
The rfkill class support makes sure userspace will be notified of all state
changes on rfkill devices through uevents. It provides a notification chain
for interested parties in the kernel to also get notified of rfkill state
changes in other drivers. It creates several sysfs entries which can be used
by userspace. See section "Userspace support".
The rfkill-input module provides the kernel with the ability to implement a
basic response when the user presses a key or button (or toggles a switch)
related to rfkill functionality. It is an in-kernel implementation of default
policy of reacting to rfkill-related input events and neither mandatory nor
required for wireless drivers to operate. It is enabled by the
CONFIG_RFKILL_INPUT Kconfig option.
rfkill-input is a rfkill-related events input layer handler. This handler will
listen to all rfkill key events and will change the rfkill state of the
wireless devices accordingly. With this option enabled userspace could either
do nothing or simply perform monitoring tasks.
The rfkill-input module also provides EPO (emergency power-off) functionality
for all wireless transmitters. This function cannot be overridden, and it is
always active. rfkill EPO is related to *_RFKILL_ALL input layer events.
Important terms for the rfkill subsystem:
In order to avoid confusion, we avoid the term "switch" in rfkill when it is
referring to an electronic control circuit that enables or disables a
transmitter. We reserve it for the physical device a human manipulates
(which is an input device, by the way):
rfkill switch:
A physical device a human manipulates. Its state can be perceived by
the kernel either directly (through a GPIO pin, ACPI GPE) or by its
effect on a rfkill line of a wireless device.
rfkill controller:
A hardware circuit that controls the state of a rfkill line, which a
kernel driver can interact with *to modify* that state (i.e. it has
either write-only or read/write access).
rfkill line:
An input channel (hardware or software) of a wireless device, which
causes a wireless transmitter to stop emitting energy (BLOCK) when it
is active. Point of view is extremely important here: rfkill lines are
always seen from the PoV of a wireless device (and its driver).
soft rfkill line/software rfkill line:
A rfkill line the wireless device driver can directly change the state
of. Related to rfkill_state RFKILL_STATE_SOFT_BLOCKED.
hard rfkill line/hardware rfkill line:
A rfkill line that works fully in hardware or firmware, and that cannot
be overridden by the kernel driver. The hardware device or the
firmware just exports its status to the driver, but it is read-only.
Related to rfkill_state RFKILL_STATE_HARD_BLOCKED.
The enum rfkill_state describes the rfkill state of a transmitter:
When a rfkill line or rfkill controller is in the RFKILL_STATE_UNBLOCKED state,
the wireless transmitter (radio TX circuit for example) is *enabled*. When the
it is in the RFKILL_STATE_SOFT_BLOCKED or RFKILL_STATE_HARD_BLOCKED, the
wireless transmitter is to be *blocked* from operating.
RFKILL_STATE_SOFT_BLOCKED indicates that a call to toggle_radio() can change
that state. RFKILL_STATE_HARD_BLOCKED indicates that a call to toggle_radio()
will not be able to change the state and will return with a suitable error if
attempts are made to set the state to RFKILL_STATE_UNBLOCKED.
RFKILL_STATE_HARD_BLOCKED is used by drivers to signal that the device is
locked in the BLOCKED state by a hardwire rfkill line (typically an input pin
that, when active, forces the transmitter to be disabled) which the driver
CANNOT override.
Full rfkill functionality requires two different subsystems to cooperate: the
input layer and the rfkill class. The input layer issues *commands* to the
entire system requesting that devices registered to the rfkill class change
state. The way this interaction happens is not complex, but it is not obvious
either:
Kernel Input layer:
* Generates KEY_WWAN, KEY_WLAN, KEY_BLUETOOTH, SW_RFKILL_ALL, and
other such events when the user presses certain keys, buttons, or
toggles certain physical switches.
THE INPUT LAYER IS NEVER USED TO PROPAGATE STATUS, NOTIFICATIONS OR THE
KIND OF STUFF AN ON-SCREEN-DISPLAY APPLICATION WOULD REPORT. It is
used to issue *commands* for the system to change behaviour, and these
commands may or may not be carried out by some kernel driver or
userspace application. It follows that doing user feedback based only
on input events is broken, as there is no guarantee that an input event
will be acted upon.
Most wireless communication device drivers implementing rfkill
functionality MUST NOT generate these events, and have no reason to
register themselves with the input layer. Doing otherwise is a common
misconception. There is an API to propagate rfkill status change
information, and it is NOT the input layer.
rfkill class:
* Calls a hook in a driver to effectively change the wireless
transmitter state;
* Keeps track of the wireless transmitter state (with help from
the driver);
* Generates userspace notifications (uevents) and a call to a
notification chain (kernel) when there is a wireless transmitter
state change;
* Connects a wireless communications driver with the common rfkill
control system, which, for example, allows actions such as
"switch all bluetooth devices offline" to be carried out by
userspace or by rfkill-input.
THE RFKILL CLASS NEVER ISSUES INPUT EVENTS. THE RFKILL CLASS DOES
NOT LISTEN TO INPUT EVENTS. NO DRIVER USING THE RFKILL CLASS SHALL
EVER LISTEN TO, OR ACT ON RFKILL INPUT EVENTS. Doing otherwise is
a layering violation.
Most wireless data communication drivers in the kernel have just to
implement the rfkill class API to work properly. Interfacing to the
input layer is not often required (and is very often a *bug*) on
wireless drivers.
Platform drivers often have to attach to the input layer to *issue*
(but never to listen to) rfkill events for rfkill switches, and also to
the rfkill class to export a control interface for the platform rfkill
controllers to the rfkill subsystem. This does NOT mean the rfkill
switch is attached to a rfkill class (doing so is almost always wrong).
It just means the same kernel module is the driver for different
devices (rfkill switches and rfkill controllers).
Userspace input handlers (uevents) or kernel input handlers (rfkill-input):
* Implements the policy of what should happen when one of the input
layer events related to rfkill operation is received.
* Uses the sysfs interface (userspace) or private rfkill API calls
to tell the devices registered with the rfkill class to change
their state (i.e. translates the input layer event into real
action).
* rfkill-input implements EPO by handling EV_SW SW_RFKILL_ALL 0
(power off all transmitters) in a special way: it ignores any
overrides and local state cache and forces all transmitters to the
RFKILL_STATE_SOFT_BLOCKED state (including those which are already
supposed to be BLOCKED).
* rfkill EPO will remain active until rfkill-input receives an
EV_SW SW_RFKILL_ALL 1 event. While the EPO is active, transmitters
are locked in the blocked state (rfkill will refuse to unblock them).
* rfkill-input implements different policies that the user can
select for handling EV_SW SW_RFKILL_ALL 1. It will unlock rfkill,
and either do nothing (leave transmitters blocked, but now unlocked),
restore the transmitters to their state before the EPO, or unblock
them all.
Userspace uevent handler or kernel platform-specific drivers hooked to the
rfkill notifier chain:
* Taps into the rfkill notifier chain or to KOBJ_CHANGE uevents,
in order to know when a device that is registered with the rfkill
class changes state;
* Issues feedback notifications to the user;
* In the rare platforms where this is required, synthesizes an input
event to command all *OTHER* rfkill devices to also change their
statues when a specific rfkill device changes state.
===============================================================================
3: Kernel driver guidelines
Remember: point-of-view is everything for a driver that connects to the rfkill
subsystem. All the details below must be measured/perceived from the point of
view of the specific driver being modified.
The first thing one needs to know is whether his driver should be talking to
the rfkill class or to the input layer. In rare cases (platform drivers), it
could happen that you need to do both, as platform drivers often handle a
variety of devices in the same driver.
Do not mistake input devices for rfkill controllers. The only type of "rfkill
switch" device that is to be registered with the rfkill class are those
directly controlling the circuits that cause a wireless transmitter to stop
working (or the software equivalent of them), i.e. what we call a rfkill
controller. Every other kind of "rfkill switch" is just an input device and
MUST NOT be registered with the rfkill class.
A driver should register a device with the rfkill class when ALL of the
following conditions are met (they define a rfkill controller):
1. The device is/controls a data communications wireless transmitter;
2. The kernel can interact with the hardware/firmware to CHANGE the wireless
transmitter state (block/unblock TX operation);
3. The transmitter can be made to not emit any energy when "blocked":
rfkill is not about blocking data transmissions, it is about blocking
energy emission;
A driver should register a device with the input subsystem to issue
rfkill-related events (KEY_WLAN, KEY_BLUETOOTH, KEY_WWAN, KEY_WIMAX,
SW_RFKILL_ALL, etc) when ALL of the folowing conditions are met:
1. It is directly related to some physical device the user interacts with, to
command the O.S./firmware/hardware to enable/disable a data communications
wireless transmitter.
Examples of the physical device are: buttons, keys and switches the user
will press/touch/slide/switch to enable or disable the wireless
communication device.
2. It is NOT slaved to another device, i.e. there is no other device that
issues rfkill-related input events in preference to this one.
Please refer to the corner cases and examples section for more details.
When in doubt, do not issue input events. For drivers that should generate
input events in some platforms, but not in others (e.g. b43), the best solution
is to NEVER generate input events in the first place. That work should be
deferred to a platform-specific kernel module (which will know when to generate
events through the rfkill notifier chain) or to userspace. This avoids the
usual maintenance problems with DMI whitelisting.
Corner cases and examples:
====================================
1. If the device is an input device that, because of hardware or firmware,
causes wireless transmitters to be blocked regardless of the kernel's will, it
is still just an input device, and NOT to be registered with the rfkill class.
2. If the wireless transmitter switch control is read-only, it is an input
device and not to be registered with the rfkill class (and maybe not to be made
an input layer event source either, see below).
3. If there is some other device driver *closer* to the actual hardware the
user interacted with (the button/switch/key) to issue an input event, THAT is
the device driver that should be issuing input events.
E.g:
[RFKILL slider switch] -- [GPIO hardware] -- [WLAN card rf-kill input]
(platform driver) (wireless card driver)
The user is closer to the RFKILL slide switch plaform driver, so the driver
which must issue input events is the platform driver looking at the GPIO
hardware, and NEVER the wireless card driver (which is just a slave). It is
very likely that there are other leaves than just the WLAN card rf-kill input
(e.g. a bluetooth card, etc)...
On the other hand, some embedded devices do this:
[RFKILL slider switch] -- [WLAN card rf-kill input]
(wireless card driver)
In this situation, the wireless card driver *could* register itself as an input
device and issue rf-kill related input events... but in order to AVOID the need
for DMI whitelisting, the wireless card driver does NOT do it. Userspace (HAL)
or a platform driver (that exists only on these embedded devices) will do the
dirty job of issuing the input events.
COMMON MISTAKES in kernel drivers, related to rfkill:
====================================
1. NEVER confuse input device keys and buttons with input device switches.
1a. Switches are always set or reset. They report the current state
(on position or off position).
1b. Keys and buttons are either in the pressed or not-pressed state, and
that's it. A "button" that latches down when you press it, and
unlatches when you press it again is in fact a switch as far as input
devices go.
Add the SW_* events you need for switches, do NOT try to emulate a button using
KEY_* events just because there is no such SW_* event yet. Do NOT try to use,
for example, KEY_BLUETOOTH when you should be using SW_BLUETOOTH instead.
2. Input device switches (sources of EV_SW events) DO store their current state
(so you *must* initialize it by issuing a gratuitous input layer event on
driver start-up and also when resuming from sleep), and that state CAN be
queried from userspace through IOCTLs. There is no sysfs interface for this,
but that doesn't mean you should break things trying to hook it to the rfkill
class to get a sysfs interface :-)
3. Do not issue *_RFKILL_ALL events by default, unless you are sure it is the
correct event for your switch/button. These events are emergency power-off
events when they are trying to turn the transmitters off. An example of an
input device which SHOULD generate *_RFKILL_ALL events is the wireless-kill
switch in a laptop which is NOT a hotkey, but a real sliding/rocker switch.
An example of an input device which SHOULD NOT generate *_RFKILL_ALL events by
default, is any sort of hot key that is type-specific (e.g. the one for WLAN).
3.1 Guidelines for wireless device drivers
------------------------------------------
(in this text, rfkill->foo means the foo field of struct rfkill).
1. Each independent transmitter in a wireless device (usually there is only one
transmitter per device) should have a SINGLE rfkill class attached to it.
2. If the device does not have any sort of hardware assistance to allow the
driver to rfkill the device, the driver should emulate it by taking all actions
required to silence the transmitter.
3. If it is impossible to silence the transmitter (i.e. it still emits energy,
even if it is just in brief pulses, when there is no data to transmit and there
is no hardware support to turn it off) do NOT lie to the users. Do not attach
it to a rfkill class. The rfkill subsystem does not deal with data
transmission, it deals with energy emission. If the transmitter is emitting
energy, it is not blocked in rfkill terms.
4. It doesn't matter if the device has multiple rfkill input lines affecting
the same transmitter, their combined state is to be exported as a single state
per transmitter (see rule 1).
This rule exists because users of the rfkill subsystem expect to get (and set,
when possible) the overall transmitter rfkill state, not of a particular rfkill
line.
5. The wireless device driver MUST NOT leave the transmitter enabled during
suspend and hibernation unless:
5.1. The transmitter has to be enabled for some sort of functionality
like wake-on-wireless-packet or autonomous packed forwarding in a mesh
network, and that functionality is enabled for this suspend/hibernation
cycle.
AND
5.2. The device was not on a user-requested BLOCKED state before
the suspend (i.e. the driver must NOT unblock a device, not even
to support wake-on-wireless-packet or remain in the mesh).
In other words, there is absolutely no allowed scenario where a driver can
automatically take action to unblock a rfkill controller (obviously, this deals
with scenarios where soft-blocking or both soft and hard blocking is happening.
Scenarios where hardware rfkill lines are the only ones blocking the
transmitter are outside of this rule, since the wireless device driver does not
control its input hardware rfkill lines in the first place).
6. During resume, rfkill will try to restore its previous state.
7. After a rfkill class is suspended, it will *not* call rfkill->toggle_radio
until it is resumed.
Example of a WLAN wireless driver connected to the rfkill subsystem:
--------------------------------------------------------------------
A certain WLAN card has one input pin that causes it to block the transmitter
and makes the status of that input pin available (only for reading!) to the
kernel driver. This is a hard rfkill input line (it cannot be overridden by
the kernel driver).
The card also has one PCI register that, if manipulated by the driver, causes
it to block the transmitter. This is a soft rfkill input line.
It has also a thermal protection circuitry that shuts down its transmitter if
the card overheats, and makes the status of that protection available (only for
reading!) to the kernel driver. This is also a hard rfkill input line.
If either one of these rfkill lines are active, the transmitter is blocked by
the hardware and forced offline.
The driver should allocate and attach to its struct device *ONE* instance of
the rfkill class (there is only one transmitter).
It can implement the get_state() hook, and return RFKILL_STATE_HARD_BLOCKED if
either one of its two hard rfkill input lines are active. If the two hard
rfkill lines are inactive, it must return RFKILL_STATE_SOFT_BLOCKED if its soft
rfkill input line is active. Only if none of the rfkill input lines are
active, will it return RFKILL_STATE_UNBLOCKED.
Since the device has a hardware rfkill line, it IS subject to state changes
external to rfkill. Therefore, the driver must make sure that it calls
rfkill_force_state() to keep the status always up-to-date, and it must do a
rfkill_force_state() on resume from sleep.
Every time the driver gets a notification from the card that one of its rfkill
lines changed state (polling might be needed on badly designed cards that don't
generate interrupts for such events), it recomputes the rfkill state as per
above, and calls rfkill_force_state() to update it.
The driver should implement the toggle_radio() hook, that:
1. Returns an error if one of the hardware rfkill lines are active, and the
caller asked for RFKILL_STATE_UNBLOCKED.
2. Activates the soft rfkill line if the caller asked for state
RFKILL_STATE_SOFT_BLOCKED. It should do this even if one of the hard rfkill
lines are active, effectively double-blocking the transmitter.
3. Deactivates the soft rfkill line if none of the hardware rfkill lines are
active and the caller asked for RFKILL_STATE_UNBLOCKED.
===============================================================================
4: Kernel API
To build a driver with rfkill subsystem support, the driver should depend on
(or select) the Kconfig symbol RFKILL; it should _not_ depend on RKFILL_INPUT.
The hardware the driver talks to may be write-only (where the current state
of the hardware is unknown), or read-write (where the hardware can be queried
about its current state).
The rfkill class will call the get_state hook of a device every time it needs
to know the *real* current state of the hardware. This can happen often, but
it does not do any polling, so it is not enough on hardware that is subject
to state changes outside of the rfkill subsystem.
Therefore, calling rfkill_force_state() when a state change happens is
mandatory when the device has a hardware rfkill line, or when something else
like the firmware could cause its state to be changed without going through the
rfkill class.
Some hardware provides events when its status changes. In these cases, it is
best for the driver to not provide a get_state hook, and instead register the
rfkill class *already* with the correct status, and keep it updated using
rfkill_force_state() when it gets an event from the hardware.
rfkill_force_state() must be used on the device resume handlers to update the
rfkill status, should there be any chance of the device status changing during
the sleep.
There is no provision for a statically-allocated rfkill struct. You must
use rfkill_allocate() to allocate one.
You should:
- rfkill_allocate()
- modify rfkill fields (flags, name)
- modify state to the current hardware state (THIS IS THE ONLY TIME
YOU CAN ACCESS state DIRECTLY)
- rfkill_register()
The only way to set a device to the RFKILL_STATE_HARD_BLOCKED state is through
a suitable return of get_state() or through rfkill_force_state().
When a device is in the RFKILL_STATE_HARD_BLOCKED state, the only way to switch
it to a different state is through a suitable return of get_state() or through
rfkill_force_state().
If toggle_radio() is called to set a device to state RFKILL_STATE_SOFT_BLOCKED
when that device is already at the RFKILL_STATE_HARD_BLOCKED state, it should
not return an error. Instead, it should try to double-block the transmitter,
so that its state will change from RFKILL_STATE_HARD_BLOCKED to
RFKILL_STATE_SOFT_BLOCKED should the hardware blocking cease.
Please refer to the source for more documentation.
===============================================================================
5: Userspace support
rfkill devices issue uevents (with an action of "change"), with the following
environment variables set:
Additionally, each rfkill device is registered in sysfs and there has the
following attributes:
name: Name assigned by driver to this key (interface or driver name).
type: Driver type string ("wlan", "bluetooth", etc).
persistent: Whether the soft blocked state is initialised from
non-volatile storage at startup.
state: Current state of the transmitter
0: RFKILL_STATE_SOFT_BLOCKED
transmitter is turned off by software
1: RFKILL_STATE_UNBLOCKED
transmitter is (potentially) active
2: RFKILL_STATE_HARD_BLOCKED
transmitter is forced off by something outside of
the driver's control.
This file is deprecated because it can only properly show
three of the four possible states, soft-and-hard-blocked is
missing.
claim: 0: Kernel handles events
This file is deprecated because there no longer is a way to
claim just control over a single rfkill instance.
rfkill devices also issue uevents (with an action of "change"), with the
following environment variables set:
RFKILL_NAME
RFKILL_STATE
RFKILL_TYPE
The ABI for these variables is defined by the sysfs attributes. It is best
to take a quick look at the source to make sure of the possible values.
It is expected that HAL will trap those, and bridge them to DBUS, etc. These
events CAN and SHOULD be used to give feedback to the user about the rfkill
status of the system.
Input devices may issue events that are related to rfkill. These are the
various KEY_* events and SW_* events supported by rfkill-input.c.
******IMPORTANT******
When rfkill-input is ACTIVE, userspace is NOT TO CHANGE THE STATE OF AN RFKILL
SWITCH IN RESPONSE TO AN INPUT EVENT also handled by rfkill-input, unless it
has set to true the user_claim attribute for that particular switch. This rule
is *absolute*; do NOT violate it.
******IMPORTANT******
Userspace must not assume it is the only source of control for rfkill switches.
Their state CAN and WILL change due to firmware actions, direct user actions,
and the rfkill-input EPO override for *_RFKILL_ALL.
When rfkill-input is not active, userspace must initiate a rfkill status
change by writing to the "state" attribute in order for anything to happen.
Take particular care to implement EV_SW SW_RFKILL_ALL properly. When that
switch is set to OFF, *every* rfkill device *MUST* be immediately put into the
RFKILL_STATE_SOFT_BLOCKED state, no questions asked.
The following sysfs entries will be created:
name: Name assigned by driver to this key (interface or driver name).
type: Name of the key type ("wlan", "bluetooth", etc).
state: Current state of the transmitter
0: RFKILL_STATE_SOFT_BLOCKED
transmitter is forced off, but one can override it
by a write to the state attribute;
1: RFKILL_STATE_UNBLOCKED
transmiter is NOT forced off, and may operate if
all other conditions for such operation are met
(such as interface is up and configured, etc);
2: RFKILL_STATE_HARD_BLOCKED
transmitter is forced off by something outside of
the driver's control. One cannot set a device to
this state through writes to the state attribute;
claim: 1: Userspace handles events, 0: Kernel handles events
Both the "state" and "claim" entries are also writable. For the "state" entry
this means that when 1 or 0 is written, the device rfkill state (if not yet in
the requested state), will be will be toggled accordingly.
For the "claim" entry writing 1 to it means that the kernel no longer handles
key events even though RFKILL_INPUT input was enabled. When "claim" has been
set to 0, userspace should make sure that it listens for the input events or
check the sysfs "state" entry regularly to correctly perform the required tasks
when the rkfill key is pressed.
A note about input devices and EV_SW events:
In order to know the current state of an input device switch (like
SW_RFKILL_ALL), you will need to use an IOCTL. That information is not
available through sysfs in a generic way at this time, and it is not available
through the rfkill class AT ALL.
The contents of these variables corresponds to the "name", "state" and
"type" sysfs files explained above.

View File

@ -135,7 +135,7 @@ manipulating this list), the user code must observe the following
protocol on 'lock entry' insertion and removal:
On insertion:
1) set the 'list_op_pending' word to the address of the 'lock word'
1) set the 'list_op_pending' word to the address of the 'lock entry'
to be inserted,
2) acquire the futex lock,
3) add the lock entry, with its thread id (TID) in the bottom 29 bits
@ -143,7 +143,7 @@ On insertion:
4) clear the 'list_op_pending' word.
On removal:
1) set the 'list_op_pending' word to the address of the 'lock word'
1) set the 'list_op_pending' word to the address of the 'lock entry'
to be removed,
2) remove the lock entry for this lock from the 'head' list,
2) release the futex lock, and

View File

@ -1,10 +1,11 @@
SCSI FC Tansport
=============================================
Date: 4/12/2007
Date: 11/18/2008
Kernel Revisions for features:
rports : <<TBS>>
vports : 2.6.22 (? TBD)
vports : 2.6.22
bsg support : 2.6.30 (?TBD?)
Introduction
@ -15,6 +16,7 @@ The FC transport can be found at:
drivers/scsi/scsi_transport_fc.c
include/scsi/scsi_transport_fc.h
include/scsi/scsi_netlink_fc.h
include/scsi/scsi_bsg_fc.h
This file is found at Documentation/scsi/scsi_fc_transport.txt
@ -472,6 +474,14 @@ int
fc_vport_terminate(struct fc_vport *vport)
FC BSG support (CT & ELS passthru, and more)
========================================================================
<< To Be Supplied >>
Credits
=======
The following people have contributed to this document:

View File

@ -1271,6 +1271,11 @@ of interest:
hostdata[0] - area reserved for LLD at end of struct Scsi_Host. Size
is set by the second argument (named 'xtr_bytes') to
scsi_host_alloc() or scsi_register().
vendor_id - a unique value that identifies the vendor supplying
the LLD for the Scsi_Host. Used most often in validating
vendor-specific message requests. Value consists of an
identifier type and a vendor-specific value.
See scsi_netlink.h for a description of valid formats.
The scsi_host structure is defined in include/scsi/scsi_host.h

View File

@ -139,6 +139,7 @@ ALC883/888
acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc)
acer-aspire Acer Aspire 9810
acer-aspire-4930g Acer Aspire 4930G
acer-aspire-6530g Acer Aspire 6530G
acer-aspire-8930g Acer Aspire 8930G
medion Medion Laptops
medion-md2 Medion MD2

View File

@ -233,8 +233,8 @@ These protections are added to score to judge whether this zone should be used
for page allocation or should be reclaimed.
In this example, if normal pages (index=2) are required to this DMA zone and
pages_high is used for watermark, the kernel judges this zone should not be
used because pages_free(1355) is smaller than watermark + protection[2]
watermark[WMARK_HIGH] is used for watermark, the kernel judges this zone should
not be used because pages_free(1355) is smaller than watermark + protection[2]
(4 + 2004 = 2008). If this protection value is 0, this zone would be used for
normal page requirement. If requirement is DMA zone(index=0), protection[0]
(=0) is used.
@ -280,9 +280,10 @@ The default value is 65536.
min_free_kbytes:
This is used to force the Linux VM to keep a minimum number
of kilobytes free. The VM uses this number to compute a pages_min
value for each lowmem zone in the system. Each lowmem zone gets
a number of reserved free pages based proportionally on its size.
of kilobytes free. The VM uses this number to compute a
watermark[WMARK_MIN] value for each lowmem zone in the system.
Each lowmem zone gets a number of reserved free pages based
proportionally on its size.
Some minimal amount of memory is needed to satisfy PF_MEMALLOC
allocations; if you set this to lower than 1024KB, your system will
@ -314,10 +315,14 @@ min_unmapped_ratio:
This is available only on NUMA kernels.
A percentage of the total pages in each zone. Zone reclaim will only
occur if more than this percentage of pages are file backed and unmapped.
This is to insure that a minimal amount of local pages is still available for
file I/O even if the node is overallocated.
This is a percentage of the total pages in each zone. Zone reclaim will
only occur if more than this percentage of pages are in a state that
zone_reclaim_mode allows to be reclaimed.
If zone_reclaim_mode has the value 4 OR'd, then the percentage is compared
against all file-backed unmapped pages including swapcache pages and tmpfs
files. Otherwise, only unmapped pages backed by normal files but not tmpfs
files and similar are considered.
The default is 1 percent.

View File

@ -7,7 +7,6 @@ Copyright 2008 Red Hat Inc.
(dual licensed under the GPL v2)
Reviewers: Elias Oltmanns, Randy Dunlap, Andrew Morton,
John Kacur, and David Teigland.
Written for: 2.6.28-rc2
Introduction
@ -33,13 +32,26 @@ The File System
Ftrace uses the debugfs file system to hold the control files as
well as the files to display output.
To mount the debugfs system:
When debugfs is configured into the kernel (which selecting any ftrace
option will do) the directory /sys/kernel/debug will be created. To mount
this directory, you can add to your /etc/fstab file:
# mkdir /debug
# mount -t debugfs nodev /debug
debugfs /sys/kernel/debug debugfs defaults 0 0
( Note: it is more common to mount at /sys/kernel/debug, but for
simplicity this document will use /debug)
Or you can mount it at run time with:
mount -t debugfs nodev /sys/kernel/debug
For quicker access to that directory you may want to make a soft link to
it:
ln -s /sys/kernel/debug /debug
Any selected ftrace option will also create a directory called tracing
within the debugfs. The rest of the document will assume that you are in
the ftrace directory (cd /sys/kernel/debug/tracing) and will only concentrate
on the files within that directory and not distract from the content with
the extended "/sys/kernel/debug/tracing" path name.
That's it! (assuming that you have ftrace configured into your kernel)
@ -389,18 +401,18 @@ trace_options
The trace_options file is used to control what gets printed in
the trace output. To see what is available, simply cat the file:
cat /debug/tracing/trace_options
cat trace_options
print-parent nosym-offset nosym-addr noverbose noraw nohex nobin \
noblock nostacktrace nosched-tree nouserstacktrace nosym-userobj
To disable one of the options, echo in the option prepended with
"no".
echo noprint-parent > /debug/tracing/trace_options
echo noprint-parent > trace_options
To enable an option, leave off the "no".
echo sym-offset > /debug/tracing/trace_options
echo sym-offset > trace_options
Here are the available options:
@ -476,11 +488,11 @@ sched_switch
This tracer simply records schedule switches. Here is an example
of how to use it.
# echo sched_switch > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
# echo sched_switch > current_tracer
# echo 1 > tracing_enabled
# sleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# echo 0 > tracing_enabled
# cat trace
# tracer: sched_switch
#
@ -583,13 +595,13 @@ new trace is saved.
To reset the maximum, echo 0 into tracing_max_latency. Here is
an example:
# echo irqsoff > /debug/tracing/current_tracer
# echo 0 > /debug/tracing/tracing_max_latency
# echo 1 > /debug/tracing/tracing_enabled
# echo irqsoff > current_tracer
# echo 0 > tracing_max_latency
# echo 1 > tracing_enabled
# ls -ltr
[...]
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/latency_trace
# echo 0 > tracing_enabled
# cat latency_trace
# tracer: irqsoff
#
irqsoff latency trace v1.1.5 on 2.6.26
@ -690,13 +702,13 @@ Like the irqsoff tracer, it records the maximum latency for
which preemption was disabled. The control of preemptoff tracer
is much like the irqsoff tracer.
# echo preemptoff > /debug/tracing/current_tracer
# echo 0 > /debug/tracing/tracing_max_latency
# echo 1 > /debug/tracing/tracing_enabled
# echo preemptoff > current_tracer
# echo 0 > tracing_max_latency
# echo 1 > tracing_enabled
# ls -ltr
[...]
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/latency_trace
# echo 0 > tracing_enabled
# cat latency_trace
# tracer: preemptoff
#
preemptoff latency trace v1.1.5 on 2.6.26-rc8
@ -837,13 +849,13 @@ tracer.
Again, using this trace is much like the irqsoff and preemptoff
tracers.
# echo preemptirqsoff > /debug/tracing/current_tracer
# echo 0 > /debug/tracing/tracing_max_latency
# echo 1 > /debug/tracing/tracing_enabled
# echo preemptirqsoff > current_tracer
# echo 0 > tracing_max_latency
# echo 1 > tracing_enabled
# ls -ltr
[...]
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/latency_trace
# echo 0 > tracing_enabled
# cat latency_trace
# tracer: preemptirqsoff
#
preemptirqsoff latency trace v1.1.5 on 2.6.26-rc8
@ -999,12 +1011,12 @@ slightly differently than we did with the previous tracers.
Instead of performing an 'ls', we will run 'sleep 1' under
'chrt' which changes the priority of the task.
# echo wakeup > /debug/tracing/current_tracer
# echo 0 > /debug/tracing/tracing_max_latency
# echo 1 > /debug/tracing/tracing_enabled
# echo wakeup > current_tracer
# echo 0 > tracing_max_latency
# echo 1 > tracing_enabled
# chrt -f 5 sleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/latency_trace
# echo 0 > tracing_enabled
# cat latency_trace
# tracer: wakeup
#
wakeup latency trace v1.1.5 on 2.6.26-rc8
@ -1114,11 +1126,11 @@ can be done from the debug file system. Make sure the
ftrace_enabled is set; otherwise this tracer is a nop.
# sysctl kernel.ftrace_enabled=1
# echo function > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
# echo function > current_tracer
# echo 1 > tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# echo 0 > tracing_enabled
# cat trace
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
@ -1155,7 +1167,7 @@ int trace_fd;
[...]
int main(int argc, char *argv[]) {
[...]
trace_fd = open("/debug/tracing/tracing_enabled", O_WRONLY);
trace_fd = open(tracing_file("tracing_enabled"), O_WRONLY);
[...]
if (condition_hit()) {
write(trace_fd, "0", 1);
@ -1163,26 +1175,20 @@ int main(int argc, char *argv[]) {
[...]
}
Note: Here we hard coded the path name. The debugfs mount is not
guaranteed to be at /debug (and is more commonly at
/sys/kernel/debug). For simple one time traces, the above is
sufficent. For anything else, a search through /proc/mounts may
be needed to find where the debugfs file-system is mounted.
Single thread tracing
---------------------
By writing into /debug/tracing/set_ftrace_pid you can trace a
By writing into set_ftrace_pid you can trace a
single thread. For example:
# cat /debug/tracing/set_ftrace_pid
# cat set_ftrace_pid
no pid
# echo 3111 > /debug/tracing/set_ftrace_pid
# cat /debug/tracing/set_ftrace_pid
# echo 3111 > set_ftrace_pid
# cat set_ftrace_pid
3111
# echo function > /debug/tracing/current_tracer
# cat /debug/tracing/trace | head
# echo function > current_tracer
# cat trace | head
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
@ -1193,8 +1199,8 @@ no pid
yum-updatesd-3111 [003] 1637.254683: lock_hrtimer_base <-hrtimer_try_to_cancel
yum-updatesd-3111 [003] 1637.254685: fget_light <-do_sys_poll
yum-updatesd-3111 [003] 1637.254686: pipe_poll <-do_sys_poll
# echo -1 > /debug/tracing/set_ftrace_pid
# cat /debug/tracing/trace |head
# echo -1 > set_ftrace_pid
# cat trace |head
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
@ -1216,6 +1222,51 @@ something like this simple program:
#include <fcntl.h>
#include <unistd.h>
#define _STR(x) #x
#define STR(x) _STR(x)
#define MAX_PATH 256
const char *find_debugfs(void)
{
static char debugfs[MAX_PATH+1];
static int debugfs_found;
char type[100];
FILE *fp;
if (debugfs_found)
return debugfs;
if ((fp = fopen("/proc/mounts","r")) == NULL) {
perror("/proc/mounts");
return NULL;
}
while (fscanf(fp, "%*s %"
STR(MAX_PATH)
"s %99s %*s %*d %*d\n",
debugfs, type) == 2) {
if (strcmp(type, "debugfs") == 0)
break;
}
fclose(fp);
if (strcmp(type, "debugfs") != 0) {
fprintf(stderr, "debugfs not mounted");
return NULL;
}
debugfs_found = 1;
return debugfs;
}
const char *tracing_file(const char *file_name)
{
static char trace_file[MAX_PATH+1];
snprintf(trace_file, MAX_PATH, "%s/%s", find_debugfs(), file_name);
return trace_file;
}
int main (int argc, char **argv)
{
if (argc < 1)
@ -1226,12 +1277,12 @@ int main (int argc, char **argv)
char line[64];
int s;
ffd = open("/debug/tracing/current_tracer", O_WRONLY);
ffd = open(tracing_file("current_tracer"), O_WRONLY);
if (ffd < 0)
exit(-1);
write(ffd, "nop", 3);
fd = open("/debug/tracing/set_ftrace_pid", O_WRONLY);
fd = open(tracing_file("set_ftrace_pid"), O_WRONLY);
s = sprintf(line, "%d\n", getpid());
write(fd, line, s);
@ -1383,22 +1434,22 @@ want, depending on your needs.
tracing_cpu_mask file) or you might sometimes see unordered
function calls while cpu tracing switch.
hide: echo nofuncgraph-cpu > /debug/tracing/trace_options
show: echo funcgraph-cpu > /debug/tracing/trace_options
hide: echo nofuncgraph-cpu > trace_options
show: echo funcgraph-cpu > trace_options
- The duration (function's time of execution) is displayed on
the closing bracket line of a function or on the same line
than the current function in case of a leaf one. It is default
enabled.
hide: echo nofuncgraph-duration > /debug/tracing/trace_options
show: echo funcgraph-duration > /debug/tracing/trace_options
hide: echo nofuncgraph-duration > trace_options
show: echo funcgraph-duration > trace_options
- The overhead field precedes the duration field in case of
reached duration thresholds.
hide: echo nofuncgraph-overhead > /debug/tracing/trace_options
show: echo funcgraph-overhead > /debug/tracing/trace_options
hide: echo nofuncgraph-overhead > trace_options
show: echo funcgraph-overhead > trace_options
depends on: funcgraph-duration
ie:
@ -1427,8 +1478,8 @@ want, depending on your needs.
- The task/pid field displays the thread cmdline and pid which
executed the function. It is default disabled.
hide: echo nofuncgraph-proc > /debug/tracing/trace_options
show: echo funcgraph-proc > /debug/tracing/trace_options
hide: echo nofuncgraph-proc > trace_options
show: echo funcgraph-proc > trace_options
ie:
@ -1451,8 +1502,8 @@ want, depending on your needs.
system clock since it started. A snapshot of this time is
given on each entry/exit of functions
hide: echo nofuncgraph-abstime > /debug/tracing/trace_options
show: echo funcgraph-abstime > /debug/tracing/trace_options
hide: echo nofuncgraph-abstime > trace_options
show: echo funcgraph-abstime > trace_options
ie:
@ -1549,7 +1600,7 @@ listed in:
available_filter_functions
# cat /debug/tracing/available_filter_functions
# cat available_filter_functions
put_prev_task_idle
kmem_cache_create
pick_next_task_rt
@ -1561,12 +1612,12 @@ mutex_lock
If I am only interested in sys_nanosleep and hrtimer_interrupt:
# echo sys_nanosleep hrtimer_interrupt \
> /debug/tracing/set_ftrace_filter
# echo ftrace > /debug/tracing/current_tracer
# echo 1 > /debug/tracing/tracing_enabled
> set_ftrace_filter
# echo ftrace > current_tracer
# echo 1 > tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# echo 0 > tracing_enabled
# cat trace
# tracer: ftrace
#
# TASK-PID CPU# TIMESTAMP FUNCTION
@ -1577,7 +1628,7 @@ If I am only interested in sys_nanosleep and hrtimer_interrupt:
To see which functions are being traced, you can cat the file:
# cat /debug/tracing/set_ftrace_filter
# cat set_ftrace_filter
hrtimer_interrupt
sys_nanosleep
@ -1597,7 +1648,7 @@ Note: It is better to use quotes to enclose the wild cards,
otherwise the shell may expand the parameters into names
of files in the local directory.
# echo 'hrtimer_*' > /debug/tracing/set_ftrace_filter
# echo 'hrtimer_*' > set_ftrace_filter
Produces:
@ -1618,7 +1669,7 @@ Produces:
Notice that we lost the sys_nanosleep.
# cat /debug/tracing/set_ftrace_filter
# cat set_ftrace_filter
hrtimer_run_queues
hrtimer_run_pending
hrtimer_init
@ -1644,17 +1695,17 @@ To append to the filters, use '>>'
To clear out a filter so that all functions will be recorded
again:
# echo > /debug/tracing/set_ftrace_filter
# cat /debug/tracing/set_ftrace_filter
# echo > set_ftrace_filter
# cat set_ftrace_filter
#
Again, now we want to append.
# echo sys_nanosleep > /debug/tracing/set_ftrace_filter
# cat /debug/tracing/set_ftrace_filter
# echo sys_nanosleep > set_ftrace_filter
# cat set_ftrace_filter
sys_nanosleep
# echo 'hrtimer_*' >> /debug/tracing/set_ftrace_filter
# cat /debug/tracing/set_ftrace_filter
# echo 'hrtimer_*' >> set_ftrace_filter
# cat set_ftrace_filter
hrtimer_run_queues
hrtimer_run_pending
hrtimer_init
@ -1677,7 +1728,7 @@ hrtimer_init_sleeper
The set_ftrace_notrace prevents those functions from being
traced.
# echo '*preempt*' '*lock*' > /debug/tracing/set_ftrace_notrace
# echo '*preempt*' '*lock*' > set_ftrace_notrace
Produces:
@ -1767,13 +1818,13 @@ the effect on the tracing is different. Every read from
trace_pipe is consumed. This means that subsequent reads will be
different. The trace is live.
# echo function > /debug/tracing/current_tracer
# cat /debug/tracing/trace_pipe > /tmp/trace.out &
# echo function > current_tracer
# cat trace_pipe > /tmp/trace.out &
[1] 4153
# echo 1 > /debug/tracing/tracing_enabled
# echo 1 > tracing_enabled
# usleep 1
# echo 0 > /debug/tracing/tracing_enabled
# cat /debug/tracing/trace
# echo 0 > tracing_enabled
# cat trace
# tracer: function
#
# TASK-PID CPU# TIMESTAMP FUNCTION
@ -1809,7 +1860,7 @@ number listed is the number of entries that can be recorded per
CPU. To know the full size, multiply the number of possible CPUS
with the number of entries.
# cat /debug/tracing/buffer_size_kb
# cat buffer_size_kb
1408 (units kilobytes)
Note, to modify this, you must have tracing completely disabled.
@ -1817,18 +1868,18 @@ To do that, echo "nop" into the current_tracer. If the
current_tracer is not set to "nop", an EINVAL error will be
returned.
# echo nop > /debug/tracing/current_tracer
# echo 10000 > /debug/tracing/buffer_size_kb
# cat /debug/tracing/buffer_size_kb
# echo nop > current_tracer
# echo 10000 > buffer_size_kb
# cat buffer_size_kb
10000 (units kilobytes)
The number of pages which will be allocated is limited to a
percentage of available memory. Allocating too much will produce
an error.
# echo 1000000000000 > /debug/tracing/buffer_size_kb
# echo 1000000000000 > buffer_size_kb
-bash: echo: write error: Cannot allocate memory
# cat /debug/tracing/buffer_size_kb
# cat buffer_size_kb
85
-----------

View File

@ -32,41 +32,41 @@ is no way to automatically detect if you are losing events due to CPUs racing.
Usage Quick Reference
---------------------
$ mount -t debugfs debugfs /debug
$ echo mmiotrace > /debug/tracing/current_tracer
$ cat /debug/tracing/trace_pipe > mydump.txt &
$ mount -t debugfs debugfs /sys/kernel/debug
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
Start X or whatever.
$ echo "X is up" > /debug/tracing/trace_marker
$ echo nop > /debug/tracing/current_tracer
$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
$ echo nop > /sys/kernel/debug/tracing/current_tracer
Check for lost events.
Usage
-----
Make sure debugfs is mounted to /debug. If not, (requires root privileges)
$ mount -t debugfs debugfs /debug
Make sure debugfs is mounted to /sys/kernel/debug. If not, (requires root privileges)
$ mount -t debugfs debugfs /sys/kernel/debug
Check that the driver you are about to trace is not loaded.
Activate mmiotrace (requires root privileges):
$ echo mmiotrace > /debug/tracing/current_tracer
$ echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
Start storing the trace:
$ cat /debug/tracing/trace_pipe > mydump.txt &
$ cat /sys/kernel/debug/tracing/trace_pipe > mydump.txt &
The 'cat' process should stay running (sleeping) in the background.
Load the driver you want to trace and use it. Mmiotrace will only catch MMIO
accesses to areas that are ioremapped while mmiotrace is active.
During tracing you can place comments (markers) into the trace by
$ echo "X is up" > /debug/tracing/trace_marker
$ echo "X is up" > /sys/kernel/debug/tracing/trace_marker
This makes it easier to see which part of the (huge) trace corresponds to
which action. It is recommended to place descriptive markers about what you
do.
Shut down mmiotrace (requires root privileges):
$ echo nop > /debug/tracing/current_tracer
$ echo nop > /sys/kernel/debug/tracing/current_tracer
The 'cat' process exits. If it does not, kill it by issuing 'fg' command and
pressing ctrl+c.
@ -78,10 +78,10 @@ to view your kernel log and look for "mmiotrace has lost events" warning. If
events were lost, the trace is incomplete. You should enlarge the buffers and
try again. Buffers are enlarged by first seeing how large the current buffers
are:
$ cat /debug/tracing/buffer_size_kb
$ cat /sys/kernel/debug/tracing/buffer_size_kb
gives you a number. Approximately double this number and write it back, for
instance:
$ echo 128000 > /debug/tracing/buffer_size_kb
$ echo 128000 > /sys/kernel/debug/tracing/buffer_size_kb
Then start again from the top.
If you are doing a trace for a driver project, e.g. Nouveau, you should also

View File

@ -16,3 +16,8 @@
15 -> TeVii S470 [d470:9022]
16 -> DVBWorld DVB-S2 2005 [0001:2005]
17 -> NetUP Dual DVB-S2 CI [1b55:2a2c]
18 -> Hauppauge WinTV-HVR1270 [0070:2211]
19 -> Hauppauge WinTV-HVR1275 [0070:2215]
20 -> Hauppauge WinTV-HVR1255 [0070:2251]
21 -> Hauppauge WinTV-HVR1210 [0070:2291,0070:2295]
22 -> Mygica X8506 DMB-TH [14f1:8651]

View File

@ -6,8 +6,8 @@
5 -> Leadtek Winfast 2000XP Expert [107d:6611,107d:6613]
6 -> AverTV Studio 303 (M126) [1461:000b]
7 -> MSI TV-@nywhere Master [1462:8606]
8 -> Leadtek Winfast DV2000 [107d:6620]
9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632]
8 -> Leadtek Winfast DV2000 [107d:6620,107d:6621]
9 -> Leadtek PVR 2000 [107d:663b,107d:663c,107d:6632,107d:6630,107d:6638,107d:6631,107d:6637,107d:663d]
10 -> IODATA GV-VCP3/PCI [10fc:d003]
11 -> Prolink PlayTV PVR
12 -> ASUS PVR-416 [1043:4823,1461:c111]
@ -59,7 +59,7 @@
58 -> Pinnacle PCTV HD 800i [11bd:0051]
59 -> DViCO FusionHDTV 5 PCI nano [18ac:d530]
60 -> Pinnacle Hybrid PCTV [12ab:1788]
61 -> Winfast TV2000 XP Global [107d:6f18]
61 -> Leadtek TV2000 XP Global [107d:6f18,107d:6618]
62 -> PowerColor RA330 [14f1:ea3d]
63 -> Geniatech X8000-MT DVBT [14f1:8852]
64 -> DViCO FusionHDTV DVB-T PRO [18ac:db30]
@ -78,3 +78,5 @@
77 -> TBS 8910 DVB-S [8910:8888]
78 -> Prof 6200 DVB-S [b022:3022]
79 -> Terratec Cinergy HT PCI MKII [153b:1177]
80 -> Hauppauge WinTV-IR Only [0070:9290]
81 -> Leadtek WinFast DTV1800 Hybrid [107d:6654]

View File

@ -17,7 +17,7 @@
16 -> Hauppauge WinTV HVR 950 (em2883) [2040:6513,2040:6517,2040:651b]
17 -> Pinnacle PCTV HD Pro Stick (em2880) [2304:0227]
18 -> Hauppauge WinTV HVR 900 (R2) (em2880) [2040:6502]
19 -> PointNix Intra-Oral Camera (em2860)
19 -> EM2860/SAA711X Reference Design (em2860)
20 -> AMD ATI TV Wonder HD 600 (em2880) [0438:b002]
21 -> eMPIA Technology, Inc. GrabBeeX+ Video Encoder (em2800) [eb1a:2801]
22 -> Unknown EM2750/EM2751 webcam grabber (em2750) [eb1a:2750,eb1a:2751]
@ -61,3 +61,8 @@
63 -> Kaiomy TVnPC U2 (em2860) [eb1a:e303]
64 -> Easy Cap Capture DC-60 (em2860)
65 -> IO-DATA GV-MVP/SZ (em2820/em2840) [04bb:0515]
66 -> Empire dual TV (em2880)
67 -> Terratec Grabby (em2860) [0ccd:0096]
68 -> Terratec AV350 (em2860) [0ccd:0084]
69 -> KWorld ATSC 315U HDTV TV Box (em2882) [eb1a:a313]
70 -> Evga inDtube (em2882)

View File

@ -124,10 +124,10 @@
123 -> Beholder BeholdTV 407 [0000:4070]
124 -> Beholder BeholdTV 407 FM [0000:4071]
125 -> Beholder BeholdTV 409 [0000:4090]
126 -> Beholder BeholdTV 505 FM/RDS [0000:5051,0000:505B,5ace:5050]
127 -> Beholder BeholdTV 507 FM/RDS / BeholdTV 509 FM [0000:5071,0000:507B,5ace:5070,5ace:5090]
126 -> Beholder BeholdTV 505 FM [5ace:5050]
127 -> Beholder BeholdTV 507 FM / BeholdTV 509 FM [5ace:5070,5ace:5090]
128 -> Beholder BeholdTV Columbus TVFM [0000:5201]
129 -> Beholder BeholdTV 607 / BeholdTV 609 [5ace:6070,5ace:6071,5ace:6072,5ace:6073,5ace:6090,5ace:6091,5ace:6092,5ace:6093]
129 -> Beholder BeholdTV 607 FM [5ace:6070]
130 -> Beholder BeholdTV M6 [5ace:6190]
131 -> Twinhan Hybrid DTV-DVB 3056 PCI [1822:0022]
132 -> Genius TVGO AM11MCE
@ -143,7 +143,7 @@
142 -> Beholder BeholdTV H6 [5ace:6290]
143 -> Beholder BeholdTV M63 [5ace:6191]
144 -> Beholder BeholdTV M6 Extra [5ace:6193]
145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636]
145 -> AVerMedia MiniPCI DVB-T Hybrid M103 [1461:f636,1461:f736]
146 -> ASUSTeK P7131 Analog
147 -> Asus Tiger 3in1 [1043:4878]
148 -> Encore ENLTV-FM v5.3 [1a7f:2008]
@ -154,4 +154,16 @@
153 -> Kworld Plus TV Analog Lite PCI [17de:7128]
154 -> Avermedia AVerTV GO 007 FM Plus [1461:f31d]
155 -> Hauppauge WinTV-HVR1120 ATSC/QAM-Hybrid [0070:6706,0070:6708]
156 -> Hauppauge WinTV-HVR1110r3 [0070:6707,0070:6709,0070:670a]
156 -> Hauppauge WinTV-HVR1110r3 DVB-T/Hybrid [0070:6707,0070:6709,0070:670a]
157 -> Avermedia AVerTV Studio 507UA [1461:a11b]
158 -> AVerMedia Cardbus TV/Radio (E501R) [1461:b7e9]
159 -> Beholder BeholdTV 505 RDS [0000:505B]
160 -> Beholder BeholdTV 507 RDS [0000:5071]
161 -> Beholder BeholdTV 507 RDS [0000:507B]
162 -> Beholder BeholdTV 607 FM [5ace:6071]
163 -> Beholder BeholdTV 609 FM [5ace:6090]
164 -> Beholder BeholdTV 609 FM [5ace:6091]
165 -> Beholder BeholdTV 607 RDS [5ace:6072]
166 -> Beholder BeholdTV 607 RDS [5ace:6073]
167 -> Beholder BeholdTV 609 RDS [5ace:6092]
168 -> Beholder BeholdTV 609 RDS [5ace:6093]

View File

@ -76,3 +76,5 @@ tuner=75 - Philips TEA5761 FM Radio
tuner=76 - Xceive 5000 tuner
tuner=77 - TCL tuner MF02GIP-5N-E
tuner=78 - Philips FMD1216MEX MK3 Hybrid Tuner
tuner=79 - Philips PAL/SECAM multi (FM1216 MK5)
tuner=80 - Philips FQ1216LME MK3 PAL/SECAM w/active loopthrough

View File

@ -163,10 +163,11 @@ sunplus 055f:c650 Mustek MDC5500Z
zc3xx 055f:d003 Mustek WCam300A
zc3xx 055f:d004 Mustek WCam300 AN
conex 0572:0041 Creative Notebook cx11646
ov519 05a9:0519 OmniVision
ov519 05a9:0519 OV519 Microphone
ov519 05a9:0530 OmniVision
ov519 05a9:4519 OmniVision
ov519 05a9:4519 Webcam Classic
ov519 05a9:8519 OmniVision
ov519 05a9:a518 D-Link DSB-C310 Webcam
sunplus 05da:1018 Digital Dream Enigma 1.3
stk014 05e1:0893 Syntek DV4000
spca561 060b:a001 Maxell Compact Pc PM3
@ -178,6 +179,7 @@ spca506 06e1:a190 ADS Instant VCD
ov534 06f8:3002 Hercules Blog Webcam
ov534 06f8:3003 Hercules Dualpix HD Weblog
sonixj 06f8:3004 Hercules Classic Silver
sonixj 06f8:3008 Hercules Deluxe Optical Glass
spca508 0733:0110 ViewQuest VQ110
spca508 0130:0130 Clone Digital Webcam 11043
spca501 0733:0401 Intel Create and Share
@ -209,6 +211,7 @@ sunplus 08ca:2050 Medion MD 41437
sunplus 08ca:2060 Aiptek PocketDV5300
tv8532 0923:010f ICM532 cams
mars 093a:050f Mars-Semi Pc-Camera
mr97310a 093a:010f Sakar Digital no. 77379
pac207 093a:2460 Qtec Webcam 100
pac207 093a:2461 HP Webcam
pac207 093a:2463 Philips SPC 220 NC
@ -265,6 +268,11 @@ sonixj 0c45:60ec SN9C105+MO4000
sonixj 0c45:60fb Surfer NoName
sonixj 0c45:60fc LG-LIC300
sonixj 0c45:60fe Microdia Audio
sonixj 0c45:6100 PC Camera (SN9C128)
sonixj 0c45:610a PC Camera (SN9C128)
sonixj 0c45:610b PC Camera (SN9C128)
sonixj 0c45:610c PC Camera (SN9C128)
sonixj 0c45:610e PC Camera (SN9C128)
sonixj 0c45:6128 Microdia/Sonix SNP325
sonixj 0c45:612a Avant Camera
sonixj 0c45:612c Typhoon Rasy Cam 1.3MPix

View File

@ -26,6 +26,55 @@ Global video workflow
Once the last buffer is filled in, the QCI interface stops.
c) Capture global finite state machine schema
+----+ +---+ +----+
| DQ | | Q | | DQ |
| v | v | v
+-----------+ +------------------------+
| STOP | | Wait for capture start |
+-----------+ Q +------------------------+
+-> | QCI: stop | ------------------> | QCI: run | <------------+
| | DMA: stop | | DMA: stop | |
| +-----------+ +-----> +------------------------+ |
| / | |
| / +---+ +----+ | |
|capture list empty / | Q | | DQ | | QCI Irq EOF |
| / | v | v v |
| +--------------------+ +----------------------+ |
| | DMA hotlink missed | | Capture running | |
| +--------------------+ +----------------------+ |
| | QCI: run | +-----> | QCI: run | <-+ |
| | DMA: stop | / | DMA: run | | |
| +--------------------+ / +----------------------+ | Other |
| ^ /DMA still | | channels |
| | capture list / running | DMA Irq End | not |
| | not empty / | | finished |
| | / v | yet |
| +----------------------+ +----------------------+ | |
| | Videobuf released | | Channel completed | | |
| +----------------------+ +----------------------+ | |
+-- | QCI: run | | QCI: run | --+ |
| DMA: run | | DMA: run | |
+----------------------+ +----------------------+ |
^ / | |
| no overrun / | overrun |
| / v |
+--------------------+ / +----------------------+ |
| Frame completed | / | Frame overran | |
+--------------------+ <-----+ +----------------------+ restart frame |
| QCI: run | | QCI: stop | --------------+
| DMA: run | | DMA: stop |
+--------------------+ +----------------------+
Legend: - each box is a FSM state
- each arrow is the condition to transition to another state
- an arrow with a comment is a mandatory transition (no condition)
- arrow "Q" means : a buffer was enqueued
- arrow "DQ" means : a buffer was dequeued
- "QCI: stop" means the QCI interface is not enabled
- "DMA: stop" means all 3 DMA channels are stopped
- "DMA: run" means at least 1 DMA channel is still running
DMA usage
---------

View File

@ -89,6 +89,11 @@ from dev (driver name followed by the bus_id, to be precise). If you set it
up before calling v4l2_device_register then it will be untouched. If dev is
NULL, then you *must* setup v4l2_dev->name before calling v4l2_device_register.
You can use v4l2_device_set_name() to set the name based on a driver name and
a driver-global atomic_t instance. This will generate names like ivtv0, ivtv1,
etc. If the name ends with a digit, then it will insert a dash: cx18-0,
cx18-1, etc. This function returns the instance number.
The first 'dev' argument is normally the struct device pointer of a pci_dev,
usb_interface or platform_device. It is rare for dev to be NULL, but it happens
with ISA devices or when one device creates multiple PCI devices, thus making
@ -385,6 +390,30 @@ later date. It differs between i2c drivers and as such can be confusing.
To see which chip variants are supported you can look in the i2c driver code
for the i2c_device_id table. This lists all the possibilities.
There are two more helper functions:
v4l2_i2c_new_subdev_cfg: this function adds new irq and platform_data
arguments and has both 'addr' and 'probed_addrs' arguments: if addr is not
0 then that will be used (non-probing variant), otherwise the probed_addrs
are probed.
For example: this will probe for address 0x10:
struct v4l2_subdev *sd = v4l2_i2c_new_subdev_cfg(v4l2_dev, adapter,
"module_foo", "chipid", 0, NULL, 0, I2C_ADDRS(0x10));
v4l2_i2c_new_subdev_board uses an i2c_board_info struct which is passed
to the i2c driver and replaces the irq, platform_data and addr arguments.
If the subdev supports the s_config core ops, then that op is called with
the irq and platform_data arguments after the subdev was setup. The older
v4l2_i2c_new_(probed_)subdev functions will call s_config as well, but with
irq set to 0 and platform_data set to NULL.
Note that in the next kernel release the functions v4l2_i2c_new_subdev,
v4l2_i2c_new_probed_subdev and v4l2_i2c_new_probed_subdev_addr will all be
replaced by a single v4l2_i2c_new_subdev that is identical to
v4l2_i2c_new_subdev_cfg but without the irq and platform_data arguments.
struct video_device
-------------------

View File

@ -2,7 +2,7 @@
obj- := dummy.o
# List of programs to build
hostprogs-y := slabinfo
hostprogs-y := slabinfo page-types
# Tell kbuild to always build the programs
always := $(hostprogs-y)

View File

@ -75,15 +75,15 @@ Page stealing from process memory and shm is done if stealing the page would
alleviate memory pressure on any zone in the page's node that has fallen below
its watermark.
pages_min/pages_low/pages_high/low_on_memory/zone_wake_kswapd: These are
per-zone fields, used to determine when a zone needs to be balanced. When
the number of pages falls below pages_min, the hysteric field low_on_memory
gets set. This stays set till the number of free pages becomes pages_high.
When low_on_memory is set, page allocation requests will try to free some
pages in the zone (providing GFP_WAIT is set in the request). Orthogonal
to this, is the decision to poke kswapd to free some zone pages. That
decision is not hysteresis based, and is done when the number of free
pages is below pages_low; in which case zone_wake_kswapd is also set.
watemark[WMARK_MIN/WMARK_LOW/WMARK_HIGH]/low_on_memory/zone_wake_kswapd: These
are per-zone fields, used to determine when a zone needs to be balanced. When
the number of pages falls below watermark[WMARK_MIN], the hysteric field
low_on_memory gets set. This stays set till the number of free pages becomes
watermark[WMARK_HIGH]. When low_on_memory is set, page allocation requests will
try to free some pages in the zone (providing GFP_WAIT is set in the request).
Orthogonal to this, is the decision to poke kswapd to free some zone pages.
That decision is not hysteresis based, and is done when the number of free
pages is below watermark[WMARK_LOW]; in which case zone_wake_kswapd is also set.
(Good) Ideas that I have heard:

View File

@ -0,0 +1,698 @@
/*
* page-types: Tool for querying page flags
*
* Copyright (C) 2009 Intel corporation
* Copyright (C) 2009 Wu Fengguang <fengguang.wu@intel.com>
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>
#include <stdarg.h>
#include <string.h>
#include <getopt.h>
#include <limits.h>
#include <sys/types.h>
#include <sys/errno.h>
#include <sys/fcntl.h>
/*
* kernel page flags
*/
#define KPF_BYTES 8
#define PROC_KPAGEFLAGS "/proc/kpageflags"
/* copied from kpageflags_read() */
#define KPF_LOCKED 0
#define KPF_ERROR 1
#define KPF_REFERENCED 2
#define KPF_UPTODATE 3
#define KPF_DIRTY 4
#define KPF_LRU 5
#define KPF_ACTIVE 6
#define KPF_SLAB 7
#define KPF_WRITEBACK 8
#define KPF_RECLAIM 9
#define KPF_BUDDY 10
/* [11-20] new additions in 2.6.31 */
#define KPF_MMAP 11
#define KPF_ANON 12
#define KPF_SWAPCACHE 13
#define KPF_SWAPBACKED 14
#define KPF_COMPOUND_HEAD 15
#define KPF_COMPOUND_TAIL 16
#define KPF_HUGE 17
#define KPF_UNEVICTABLE 18
#define KPF_NOPAGE 20
/* [32-] kernel hacking assistances */
#define KPF_RESERVED 32
#define KPF_MLOCKED 33
#define KPF_MAPPEDTODISK 34
#define KPF_PRIVATE 35
#define KPF_PRIVATE_2 36
#define KPF_OWNER_PRIVATE 37
#define KPF_ARCH 38
#define KPF_UNCACHED 39
/* [48-] take some arbitrary free slots for expanding overloaded flags
* not part of kernel API
*/
#define KPF_READAHEAD 48
#define KPF_SLOB_FREE 49
#define KPF_SLUB_FROZEN 50
#define KPF_SLUB_DEBUG 51
#define KPF_ALL_BITS ((uint64_t)~0ULL)
#define KPF_HACKERS_BITS (0xffffULL << 32)
#define KPF_OVERLOADED_BITS (0xffffULL << 48)
#define BIT(name) (1ULL << KPF_##name)
#define BITS_COMPOUND (BIT(COMPOUND_HEAD) | BIT(COMPOUND_TAIL))
static char *page_flag_names[] = {
[KPF_LOCKED] = "L:locked",
[KPF_ERROR] = "E:error",
[KPF_REFERENCED] = "R:referenced",
[KPF_UPTODATE] = "U:uptodate",
[KPF_DIRTY] = "D:dirty",
[KPF_LRU] = "l:lru",
[KPF_ACTIVE] = "A:active",
[KPF_SLAB] = "S:slab",
[KPF_WRITEBACK] = "W:writeback",
[KPF_RECLAIM] = "I:reclaim",
[KPF_BUDDY] = "B:buddy",
[KPF_MMAP] = "M:mmap",
[KPF_ANON] = "a:anonymous",
[KPF_SWAPCACHE] = "s:swapcache",
[KPF_SWAPBACKED] = "b:swapbacked",
[KPF_COMPOUND_HEAD] = "H:compound_head",
[KPF_COMPOUND_TAIL] = "T:compound_tail",
[KPF_HUGE] = "G:huge",
[KPF_UNEVICTABLE] = "u:unevictable",
[KPF_NOPAGE] = "n:nopage",
[KPF_RESERVED] = "r:reserved",
[KPF_MLOCKED] = "m:mlocked",
[KPF_MAPPEDTODISK] = "d:mappedtodisk",
[KPF_PRIVATE] = "P:private",
[KPF_PRIVATE_2] = "p:private_2",
[KPF_OWNER_PRIVATE] = "O:owner_private",
[KPF_ARCH] = "h:arch",
[KPF_UNCACHED] = "c:uncached",
[KPF_READAHEAD] = "I:readahead",
[KPF_SLOB_FREE] = "P:slob_free",
[KPF_SLUB_FROZEN] = "A:slub_frozen",
[KPF_SLUB_DEBUG] = "E:slub_debug",
};
/*
* data structures
*/
static int opt_raw; /* for kernel developers */
static int opt_list; /* list pages (in ranges) */
static int opt_no_summary; /* don't show summary */
static pid_t opt_pid; /* process to walk */
#define MAX_ADDR_RANGES 1024
static int nr_addr_ranges;
static unsigned long opt_offset[MAX_ADDR_RANGES];
static unsigned long opt_size[MAX_ADDR_RANGES];
#define MAX_BIT_FILTERS 64
static int nr_bit_filters;
static uint64_t opt_mask[MAX_BIT_FILTERS];
static uint64_t opt_bits[MAX_BIT_FILTERS];
static int page_size;
#define PAGES_BATCH (64 << 10) /* 64k pages */
static int kpageflags_fd;
static uint64_t kpageflags_buf[KPF_BYTES * PAGES_BATCH];
#define HASH_SHIFT 13
#define HASH_SIZE (1 << HASH_SHIFT)
#define HASH_MASK (HASH_SIZE - 1)
#define HASH_KEY(flags) (flags & HASH_MASK)
static unsigned long total_pages;
static unsigned long nr_pages[HASH_SIZE];
static uint64_t page_flags[HASH_SIZE];
/*
* helper functions
*/
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define min_t(type, x, y) ({ \
type __min1 = (x); \
type __min2 = (y); \
__min1 < __min2 ? __min1 : __min2; })
unsigned long pages2mb(unsigned long pages)
{
return (pages * page_size) >> 20;
}
void fatal(const char *x, ...)
{
va_list ap;
va_start(ap, x);
vfprintf(stderr, x, ap);
va_end(ap);
exit(EXIT_FAILURE);
}
/*
* page flag names
*/
char *page_flag_name(uint64_t flags)
{
static char buf[65];
int present;
int i, j;
for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
present = (flags >> i) & 1;
if (!page_flag_names[i]) {
if (present)
fatal("unkown flag bit %d\n", i);
continue;
}
buf[j++] = present ? page_flag_names[i][0] : '_';
}
return buf;
}
char *page_flag_longname(uint64_t flags)
{
static char buf[1024];
int i, n;
for (i = 0, n = 0; i < ARRAY_SIZE(page_flag_names); i++) {
if (!page_flag_names[i])
continue;
if ((flags >> i) & 1)
n += snprintf(buf + n, sizeof(buf) - n, "%s,",
page_flag_names[i] + 2);
}
if (n)
n--;
buf[n] = '\0';
return buf;
}
/*
* page list and summary
*/
void show_page_range(unsigned long offset, uint64_t flags)
{
static uint64_t flags0;
static unsigned long index;
static unsigned long count;
if (flags == flags0 && offset == index + count) {
count++;
return;
}
if (count)
printf("%lu\t%lu\t%s\n",
index, count, page_flag_name(flags0));
flags0 = flags;
index = offset;
count = 1;
}
void show_page(unsigned long offset, uint64_t flags)
{
printf("%lu\t%s\n", offset, page_flag_name(flags));
}
void show_summary(void)
{
int i;
printf(" flags\tpage-count MB"
" symbolic-flags\t\t\tlong-symbolic-flags\n");
for (i = 0; i < ARRAY_SIZE(nr_pages); i++) {
if (nr_pages[i])
printf("0x%016llx\t%10lu %8lu %s\t%s\n",
(unsigned long long)page_flags[i],
nr_pages[i],
pages2mb(nr_pages[i]),
page_flag_name(page_flags[i]),
page_flag_longname(page_flags[i]));
}
printf(" total\t%10lu %8lu\n",
total_pages, pages2mb(total_pages));
}
/*
* page flag filters
*/
int bit_mask_ok(uint64_t flags)
{
int i;
for (i = 0; i < nr_bit_filters; i++) {
if (opt_bits[i] == KPF_ALL_BITS) {
if ((flags & opt_mask[i]) == 0)
return 0;
} else {
if ((flags & opt_mask[i]) != opt_bits[i])
return 0;
}
}
return 1;
}
uint64_t expand_overloaded_flags(uint64_t flags)
{
/* SLOB/SLUB overload several page flags */
if (flags & BIT(SLAB)) {
if (flags & BIT(PRIVATE))
flags ^= BIT(PRIVATE) | BIT(SLOB_FREE);
if (flags & BIT(ACTIVE))
flags ^= BIT(ACTIVE) | BIT(SLUB_FROZEN);
if (flags & BIT(ERROR))
flags ^= BIT(ERROR) | BIT(SLUB_DEBUG);
}
/* PG_reclaim is overloaded as PG_readahead in the read path */
if ((flags & (BIT(RECLAIM) | BIT(WRITEBACK))) == BIT(RECLAIM))
flags ^= BIT(RECLAIM) | BIT(READAHEAD);
return flags;
}
uint64_t well_known_flags(uint64_t flags)
{
/* hide flags intended only for kernel hacker */
flags &= ~KPF_HACKERS_BITS;
/* hide non-hugeTLB compound pages */
if ((flags & BITS_COMPOUND) && !(flags & BIT(HUGE)))
flags &= ~BITS_COMPOUND;
return flags;
}
/*
* page frame walker
*/
int hash_slot(uint64_t flags)
{
int k = HASH_KEY(flags);
int i;
/* Explicitly reserve slot 0 for flags 0: the following logic
* cannot distinguish an unoccupied slot from slot (flags==0).
*/
if (flags == 0)
return 0;
/* search through the remaining (HASH_SIZE-1) slots */
for (i = 1; i < ARRAY_SIZE(page_flags); i++, k++) {
if (!k || k >= ARRAY_SIZE(page_flags))
k = 1;
if (page_flags[k] == 0) {
page_flags[k] = flags;
return k;
}
if (page_flags[k] == flags)
return k;
}
fatal("hash table full: bump up HASH_SHIFT?\n");
exit(EXIT_FAILURE);
}
void add_page(unsigned long offset, uint64_t flags)
{
flags = expand_overloaded_flags(flags);
if (!opt_raw)
flags = well_known_flags(flags);
if (!bit_mask_ok(flags))
return;
if (opt_list == 1)
show_page_range(offset, flags);
else if (opt_list == 2)
show_page(offset, flags);
nr_pages[hash_slot(flags)]++;
total_pages++;
}
void walk_pfn(unsigned long index, unsigned long count)
{
unsigned long batch;
unsigned long n;
unsigned long i;
if (index > ULONG_MAX / KPF_BYTES)
fatal("index overflow: %lu\n", index);
lseek(kpageflags_fd, index * KPF_BYTES, SEEK_SET);
while (count) {
batch = min_t(unsigned long, count, PAGES_BATCH);
n = read(kpageflags_fd, kpageflags_buf, batch * KPF_BYTES);
if (n == 0)
break;
if (n < 0) {
perror(PROC_KPAGEFLAGS);
exit(EXIT_FAILURE);
}
if (n % KPF_BYTES != 0)
fatal("partial read: %lu bytes\n", n);
n = n / KPF_BYTES;
for (i = 0; i < n; i++)
add_page(index + i, kpageflags_buf[i]);
index += batch;
count -= batch;
}
}
void walk_addr_ranges(void)
{
int i;
kpageflags_fd = open(PROC_KPAGEFLAGS, O_RDONLY);
if (kpageflags_fd < 0) {
perror(PROC_KPAGEFLAGS);
exit(EXIT_FAILURE);
}
if (!nr_addr_ranges)
walk_pfn(0, ULONG_MAX);
for (i = 0; i < nr_addr_ranges; i++)
walk_pfn(opt_offset[i], opt_size[i]);
close(kpageflags_fd);
}
/*
* user interface
*/
const char *page_flag_type(uint64_t flag)
{
if (flag & KPF_HACKERS_BITS)
return "(r)";
if (flag & KPF_OVERLOADED_BITS)
return "(o)";
return " ";
}
void usage(void)
{
int i, j;
printf(
"page-types [options]\n"
" -r|--raw Raw mode, for kernel developers\n"
" -a|--addr addr-spec Walk a range of pages\n"
" -b|--bits bits-spec Walk pages with specified bits\n"
#if 0 /* planned features */
" -p|--pid pid Walk process address space\n"
" -f|--file filename Walk file address space\n"
#endif
" -l|--list Show page details in ranges\n"
" -L|--list-each Show page details one by one\n"
" -N|--no-summary Don't show summay info\n"
" -h|--help Show this usage message\n"
"addr-spec:\n"
" N one page at offset N (unit: pages)\n"
" N+M pages range from N to N+M-1\n"
" N,M pages range from N to M-1\n"
" N, pages range from N to end\n"
" ,M pages range from 0 to M\n"
"bits-spec:\n"
" bit1,bit2 (flags & (bit1|bit2)) != 0\n"
" bit1,bit2=bit1 (flags & (bit1|bit2)) == bit1\n"
" bit1,~bit2 (flags & (bit1|bit2)) == bit1\n"
" =bit1,bit2 flags == (bit1|bit2)\n"
"bit-names:\n"
);
for (i = 0, j = 0; i < ARRAY_SIZE(page_flag_names); i++) {
if (!page_flag_names[i])
continue;
printf("%16s%s", page_flag_names[i] + 2,
page_flag_type(1ULL << i));
if (++j > 3) {
j = 0;
putchar('\n');
}
}
printf("\n "
"(r) raw mode bits (o) overloaded bits\n");
}
unsigned long long parse_number(const char *str)
{
unsigned long long n;
n = strtoll(str, NULL, 0);
if (n == 0 && str[0] != '0')
fatal("invalid name or number: %s\n", str);
return n;
}
void parse_pid(const char *str)
{
opt_pid = parse_number(str);
}
void parse_file(const char *name)
{
}
void add_addr_range(unsigned long offset, unsigned long size)
{
if (nr_addr_ranges >= MAX_ADDR_RANGES)
fatal("too much addr ranges\n");
opt_offset[nr_addr_ranges] = offset;
opt_size[nr_addr_ranges] = size;
nr_addr_ranges++;
}
void parse_addr_range(const char *optarg)
{
unsigned long offset;
unsigned long size;
char *p;
p = strchr(optarg, ',');
if (!p)
p = strchr(optarg, '+');
if (p == optarg) {
offset = 0;
size = parse_number(p + 1);
} else if (p) {
offset = parse_number(optarg);
if (p[1] == '\0')
size = ULONG_MAX;
else {
size = parse_number(p + 1);
if (*p == ',') {
if (size < offset)
fatal("invalid range: %lu,%lu\n",
offset, size);
size -= offset;
}
}
} else {
offset = parse_number(optarg);
size = 1;
}
add_addr_range(offset, size);
}
void add_bits_filter(uint64_t mask, uint64_t bits)
{
if (nr_bit_filters >= MAX_BIT_FILTERS)
fatal("too much bit filters\n");
opt_mask[nr_bit_filters] = mask;
opt_bits[nr_bit_filters] = bits;
nr_bit_filters++;
}
uint64_t parse_flag_name(const char *str, int len)
{
int i;
if (!*str || !len)
return 0;
if (len <= 8 && !strncmp(str, "compound", len))
return BITS_COMPOUND;
for (i = 0; i < ARRAY_SIZE(page_flag_names); i++) {
if (!page_flag_names[i])
continue;
if (!strncmp(str, page_flag_names[i] + 2, len))
return 1ULL << i;
}
return parse_number(str);
}
uint64_t parse_flag_names(const char *str, int all)
{
const char *p = str;
uint64_t flags = 0;
while (1) {
if (*p == ',' || *p == '=' || *p == '\0') {
if ((*str != '~') || (*str == '~' && all && *++str))
flags |= parse_flag_name(str, p - str);
if (*p != ',')
break;
str = p + 1;
}
p++;
}
return flags;
}
void parse_bits_mask(const char *optarg)
{
uint64_t mask;
uint64_t bits;
const char *p;
p = strchr(optarg, '=');
if (p == optarg) {
mask = KPF_ALL_BITS;
bits = parse_flag_names(p + 1, 0);
} else if (p) {
mask = parse_flag_names(optarg, 0);
bits = parse_flag_names(p + 1, 0);
} else if (strchr(optarg, '~')) {
mask = parse_flag_names(optarg, 1);
bits = parse_flag_names(optarg, 0);
} else {
mask = parse_flag_names(optarg, 0);
bits = KPF_ALL_BITS;
}
add_bits_filter(mask, bits);
}
struct option opts[] = {
{ "raw" , 0, NULL, 'r' },
{ "pid" , 1, NULL, 'p' },
{ "file" , 1, NULL, 'f' },
{ "addr" , 1, NULL, 'a' },
{ "bits" , 1, NULL, 'b' },
{ "list" , 0, NULL, 'l' },
{ "list-each" , 0, NULL, 'L' },
{ "no-summary", 0, NULL, 'N' },
{ "help" , 0, NULL, 'h' },
{ NULL , 0, NULL, 0 }
};
int main(int argc, char *argv[])
{
int c;
page_size = getpagesize();
while ((c = getopt_long(argc, argv,
"rp:f:a:b:lLNh", opts, NULL)) != -1) {
switch (c) {
case 'r':
opt_raw = 1;
break;
case 'p':
parse_pid(optarg);
break;
case 'f':
parse_file(optarg);
break;
case 'a':
parse_addr_range(optarg);
break;
case 'b':
parse_bits_mask(optarg);
break;
case 'l':
opt_list = 1;
break;
case 'L':
opt_list = 2;
break;
case 'N':
opt_no_summary = 1;
break;
case 'h':
usage();
exit(0);
default:
usage();
exit(1);
}
}
if (opt_list == 1)
printf("offset\tcount\tflags\n");
if (opt_list == 2)
printf("offset\tflags\n");
walk_addr_ranges();
if (opt_list == 1)
show_page_range(0, 0); /* drain the buffer */
if (opt_no_summary)
return 0;
if (opt_list)
printf("\n\n");
show_summary();
return 0;
}

View File

@ -12,9 +12,9 @@ There are three components to pagemap:
value for each virtual page, containing the following data (from
fs/proc/task_mmu.c, above pagemap_read):
* Bits 0-55 page frame number (PFN) if present
* Bits 0-54 page frame number (PFN) if present
* Bits 0-4 swap type if swapped
* Bits 5-55 swap offset if swapped
* Bits 5-54 swap offset if swapped
* Bits 55-60 page shift (page size = 1<<page shift)
* Bit 61 reserved for future use
* Bit 62 page swapped
@ -36,7 +36,7 @@ There are three components to pagemap:
* /proc/kpageflags. This file contains a 64-bit set of flags for each
page, indexed by PFN.
The flags are (from fs/proc/proc_misc, above kpageflags_read):
The flags are (from fs/proc/page.c, above kpageflags_read):
0. LOCKED
1. ERROR
@ -49,6 +49,68 @@ There are three components to pagemap:
8. WRITEBACK
9. RECLAIM
10. BUDDY
11. MMAP
12. ANON
13. SWAPCACHE
14. SWAPBACKED
15. COMPOUND_HEAD
16. COMPOUND_TAIL
16. HUGE
18. UNEVICTABLE
20. NOPAGE
Short descriptions to the page flags:
0. LOCKED
page is being locked for exclusive access, eg. by undergoing read/write IO
7. SLAB
page is managed by the SLAB/SLOB/SLUB/SLQB kernel memory allocator
When compound page is used, SLUB/SLQB will only set this flag on the head
page; SLOB will not flag it at all.
10. BUDDY
a free memory block managed by the buddy system allocator
The buddy system organizes free memory in blocks of various orders.
An order N block has 2^N physically contiguous pages, with the BUDDY flag
set for and _only_ for the first page.
15. COMPOUND_HEAD
16. COMPOUND_TAIL
A compound page with order N consists of 2^N physically contiguous pages.
A compound page with order 2 takes the form of "HTTT", where H donates its
head page and T donates its tail page(s). The major consumers of compound
pages are hugeTLB pages (Documentation/vm/hugetlbpage.txt), the SLUB etc.
memory allocators and various device drivers. However in this interface,
only huge/giga pages are made visible to end users.
17. HUGE
this is an integral part of a HugeTLB page
20. NOPAGE
no page frame exists at the requested address
[IO related page flags]
1. ERROR IO error occurred
3. UPTODATE page has up-to-date data
ie. for file backed page: (in-memory data revision >= on-disk one)
4. DIRTY page has been written to, hence contains new data
ie. for file backed page: (in-memory data revision > on-disk one)
8. WRITEBACK page is being synced to disk
[LRU related page flags]
5. LRU page is in one of the LRU lists
6. ACTIVE page is in the active LRU list
18. UNEVICTABLE page is in the unevictable (non-)LRU list
It is somehow pinned and not a candidate for LRU page reclaims,
eg. ramfs pages, shmctl(SHM_LOCK) and mlock() memory segments
2. REFERENCED page has been referenced since last LRU list enqueue/requeue
9. RECLAIM page will be reclaimed soon after its pageout IO completed
11. MMAP a memory mapped page
12. ANON a memory mapped page that is not part of a file
13. SWAPCACHE page is mapped to swap space, ie. has an associated swap entry
14. SWAPBACKED page is backed by swap/RAM
The page-types tool in this directory can be used to query the above flags.
Using pagemap to do something useful:

View File

@ -0,0 +1,95 @@
Last reviewed: 06/02/2009
HP iLO2 NMI Watchdog Driver
NMI sourcing for iLO2 based ProLiant Servers
Documentation and Driver by
Thomas Mingarelli <thomas.mingarelli@hp.com>
The HP iLO2 NMI Watchdog driver is a kernel module that provides basic
watchdog functionality and the added benefit of NMI sourcing. Both the
watchdog functionality and the NMI sourcing capability need to be enabled
by the user. Remember that the two modes are not dependant on one another.
A user can have the NMI sourcing without the watchdog timer and vice-versa.
Watchdog functionality is enabled like any other common watchdog driver. That
is, an application needs to be started that kicks off the watchdog timer. A
basic application exists in the Documentation/watchdog/src directory called
watchdog-test.c. Simply compile the C file and kick it off. If the system
gets into a bad state and hangs, the HP ProLiant iLO 2 timer register will
not be updated in a timely fashion and a hardware system reset (also known as
an Automatic Server Recovery (ASR)) event will occur.
The hpwdt driver also has four (4) module parameters. They are the following:
soft_margin - allows the user to set the watchdog timer value
allow_kdump - allows the user to save off a kernel dump image after an NMI
nowayout - basic watchdog parameter that does not allow the timer to
be restarted or an impending ASR to be escaped.
priority - determines whether or not the hpwdt driver is first on the
die_notify list to handle NMIs or last. The default value
for this module parameter is 0 or LAST. If the user wants to
enable NMI sourcing then reload the hpwdt driver with
priority=1 (and boot with nmi_watchdog=0).
NOTE: More information about watchdog drivers in general, including the ioctl
interface to /dev/watchdog can be found in
Documentation/watchdog/watchdog-api.txt and Documentation/IPMI.txt.
The priority parameter was introduced due to other kernel software that relied
on handling NMIs (like oprofile). Keeping hpwdt's priority at 0 (or LAST)
enables the users of NMIs for non critical events to be work as expected.
The NMI sourcing capability is disabled by default due to the inability to
distinguish between "NMI Watchdog Ticks" and "HW generated NMI events" in the
Linux kernel. What this means is that the hpwdt nmi handler code is called
each time the NMI signal fires off. This could amount to several thousands of
NMIs in a matter of seconds. If a user sees the Linux kernel's "dazed and
confused" message in the logs or if the system gets into a hung state, then
the hpwdt driver can be reloaded with the "priority" module parameter set
(priority=1).
1. If the kernel has not been booted with nmi_watchdog turned off then
edit /boot/grub/menu.lst and place the nmi_watchdog=0 at the end of the
currently booting kernel line.
2. reboot the sever
3. Once the system comes up perform a rmmod hpwdt
4. insmod /lib/modules/`uname -r`/kernel/drivers/char/watchdog/hpwdt.ko priority=1
Now, the hpwdt can successfully receive and source the NMI and provide a log
message that details the reason for the NMI (as determined by the HP BIOS).
Below is a list of NMIs the HP BIOS understands along with the associated
code (reason):
No source found 00h
Uncorrectable Memory Error 01h
ASR NMI 1Bh
PCI Parity Error 20h
NMI Button Press 27h
SB_BUS_NMI 28h
ILO Doorbell NMI 29h
ILO IOP NMI 2Ah
ILO Watchdog NMI 2Bh
Proc Throt NMI 2Ch
Front Side Bus NMI 2Dh
PCI Express Error 2Fh
DMA controller NMI 30h
Hypertransport/CSI Error 31h
-- Tom Mingarelli
(thomas.mingarelli@hp.com)

Some files were not shown because too many files have changed in this diff Show More