mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
synced 2025-01-11 16:29:05 +00:00
64db4cfff9
This patch fixes a long-standing performance bug in classic RCU that results in massive internal-to-RCU lock contention on systems with more than a few hundred CPUs. Although this patch creates a separate flavor of RCU for ease of review and patch maintenance, it is intended to replace classic RCU.

This patch still handles stress better than does mainline, so I am still calling it ready for inclusion. This patch is against the -tip tree. Nevertheless, experience on an actual 1000+ CPU machine would still be most welcome.

Most of the changes noted below were found while creating an rcutiny (which should permit ejecting the current rcuclassic) and while doing detailed line-by-line documentation.

Updates from v9 (http://lkml.org/lkml/2008/12/2/334):

o Fixes from remainder of line-by-line code walkthrough, including comment spelling, initialization, undesirable narrowing due to type conversion, removing redundant memory barriers, removing redundant local-variable initialization, and removing redundant local variables. I do not believe that any of these fixes address the CPU-hotplug issues that Andi Kleen was seeing, but please do give it a whirl in case the machine is smarter than I am. A writeup from the walkthrough may be found at the following URL, in case you are suffering from terminal insomnia or masochism:
    http://www.kernel.org/pub/linux/kernel/people/paulmck/tmp/rcutree-walkthrough.2008.12.16a.pdf
o Made rcutree tracing use seq_file, as suggested some time ago by Lai Jiangshan.
o Added a .csv variant of the rcudata debugfs trace file, to allow people having thousands of CPUs to drop the data into a spreadsheet. Tested with oocalc and gnumeric. Updated documentation to suit.

Updates from v8 (http://lkml.org/lkml/2008/11/15/139):

o Fix a theoretical race between grace-period initialization and force_quiescent_state() that could occur if more than three jiffies were required to carry out the grace-period initialization. Which it might, if you had enough CPUs.
o Apply Ingo's printk-standardization patch.
o Substitute local variables for repeated accesses to global variables.
o Fix comment misspellings and redundant (but harmless) increments of ->n_rcu_pending (this latter after having explicitly added it).
o Apply checkpatch fixes.

Updates from v7 (http://lkml.org/lkml/2008/10/10/291):

o Fixed a number of problems noted by Gautham Shenoy, including the cpu-stall-detection bug that he was having difficulty convincing me was real. ;-)
o Changed cpu-stall detection to wait for ten seconds rather than three in order to reduce false positives, as suggested by Ingo Molnar.
o Produced a design document (http://lwn.net/Articles/305782/). The act of writing this document uncovered a number of both theoretical and "here and now" bugs as noted below.
o Fix dynticks_nesting accounting confusion, simplify WARN_ON() condition, fix kerneldoc comments, and add memory barriers in dynticks interface functions.
o Add more data to tracing.
o Remove unused "rcu_barrier" field from rcu_data structure.
o Count calls to rcu_pending() from scheduling-clock interrupt to use as a surrogate timebase should jiffies stop counting.
o Fix a theoretical race between force_quiescent_state() and grace-period initialization. Yes, initialization does have to go on for some jiffies for this race to occur, but given enough CPUs...

Updates from v6 (http://lkml.org/lkml/2008/9/23/448):

o Fix a number of checkpatch.pl complaints.
o Apply review comments from Ingo Molnar and Lai Jiangshan on the stall-detection code.
o Fix several bugs in !CONFIG_SMP builds.
o Fix a misspelled config-parameter name so that RCU now announces at boot time if stall detection is configured.
o Run tests on numerous combinations of configuration parameters, which after the fixes above, now build and run correctly.

Updates from v5 (http://lkml.org/lkml/2008/9/15/92, bad subject line):

o Fix a compiler error in the !CONFIG_FANOUT_EXACT case (blew a changeset some time ago, and finally got around to retesting this option).
o Fix some tracing bugs in rcupreempt that caused incorrect totals to be printed.
o I now test with a more brutal random-selection online/offline script (attached). Probably more brutal than it needs to be on the people reading it as well, but so it goes.
o A number of optimizations and usability improvements:
    o Make rcu_pending() ignore the grace-period timeout when there is no grace period in progress.
    o Make force_quiescent_state() avoid going for a global lock in the case where there is no grace period in progress.
    o Rearrange struct fields to improve struct layout.
    o Make call_rcu() initiate a grace period if RCU was idle, rather than waiting for the next scheduling clock interrupt.
    o Invoke rcu_irq_enter() and rcu_irq_exit() only when idle, as suggested by Andi Kleen. I still don't completely trust this change, and might back it out.
    o Make CONFIG_RCU_TRACE be the single config variable manipulated for all forms of RCU, instead of the prior confusion.
    o Document tracing files and formats for both rcupreempt and rcutree.

Updates from v4 for those missing v5 given its bad subject line:

o Separated dynticks interface so that NMIs and irqs call separate functions, greatly simplifying it. In particular, this code no longer requires a proof of correctness. ;-)
o Separated dynticks state out into its own per-CPU structure, avoiding the duplicated accounting.
o The case where a dynticks-idle CPU runs an irq handler that invokes call_rcu() is now correctly handled, forcing that CPU out of dynticks-idle mode.
o Review comments have been applied (thank you all!!!). For but one example, fixed the dynticks-ordering issue that Manfred pointed out, saving me much debugging. ;-)
o Adjusted rcuclassic and rcupreempt to handle dynticks changes.

Attached is an updated patch to Classic RCU that applies a hierarchy, greatly reducing the contention on the top-level lock for large machines. This passes 10-hour concurrent rcutorture and online-offline testing on 128-CPU ppc64 without dynticks enabled, and exposes some timekeeping bugs in presence of dynticks (exciting working on a system where "sleep 1" hangs until interrupted...), which were fixed in the 2.6.27 kernel. It is getting more reliable than mainline by some measures, so the next version will be against -tip for inclusion.

See also Manfred Spraul's recent patches (or his earlier work from 2004 at http://marc.info/?l=linux-kernel&m=108546384711797&w=2). We will converge onto a common patch in the fullness of time, but are currently exploring different regions of the design space. That said, I have already gratefully stolen quite a few of Manfred's ideas.

This patch provides CONFIG_RCU_FANOUT, which controls the bushiness of the RCU hierarchy. Defaults to 32 on 32-bit machines and 64 on 64-bit machines. If CONFIG_NR_CPUS is less than CONFIG_RCU_FANOUT, there is no hierarchy. By default, the RCU initialization code will adjust CONFIG_RCU_FANOUT to balance the hierarchy, so strongly NUMA architectures may choose to set CONFIG_RCU_FANOUT_EXACT to disable this balancing, allowing the hierarchy to be exactly aligned to the underlying hardware. Up to two levels of hierarchy are permitted (in addition to the root node), allowing up to 16,384 CPUs on 32-bit systems and up to 262,144 CPUs on 64-bit systems. I just know that I am going to regret saying this, but this seems more than sufficient for the foreseeable future. (Some architectures might wish to set CONFIG_RCU_FANOUT=4, which would limit such architectures to 64 CPUs. If this becomes a real problem, additional levels can be added, but I doubt that it will make a significant difference on real hardware.)

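The relationship between fanout, tree depth, and CPU capacity described above can be sketched in a few lines of C. This is only an illustration, not code from the patch: the helper names rcu_tree_capacity() and rcu_tree_levels() are hypothetical, and the sketch assumes that every node in the tree fans out to exactly `fanout` children and that a machine whose CPU count fits within a single node needs no hierarchy. Under those assumptions, a root plus two further levels gives a capacity of fanout cubed, which matches the fanout-4 (64 CPUs) and fanout-64 (262,144 CPUs) figures quoted above.

```c
/*
 * Hypothetical helper (not part of the patch): the number of CPUs that a
 * tree of `levels` levels can cover when each node has `fanout` children.
 * The fanout is at most 64 and the depth at most 3 in this discussion, so
 * no overflow handling is attempted. Returns 0 for nonsensical input.
 */
unsigned long rcu_tree_capacity(unsigned long fanout, unsigned int levels)
{
	unsigned long capacity = 1;
	unsigned int i;

	if (fanout < 2 || levels == 0)
		return 0;
	for (i = 0; i < levels; i++)
		capacity *= fanout;
	return capacity;
}

/*
 * Hypothetical helper: the smallest number of levels (root included) that
 * covers nr_cpus, or 0 if more than max_levels would be needed.
 * Assumption: a CPU count equal to the fanout still fits in one node.
 */
unsigned int rcu_tree_levels(unsigned long nr_cpus, unsigned long fanout,
			     unsigned int max_levels)
{
	unsigned int levels;

	if (fanout < 2)
		return 0;
	for (levels = 1; levels <= max_levels; levels++) {
		if (nr_cpus <= rcu_tree_capacity(fanout, levels))
			return levels;
	}
	return 0;
}
```

With these helpers, rcu_tree_levels(128, 64, 3) yields 2, so under the stated assumptions a 128-CPU machine with a fanout of 64 would use a root node plus one further level.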
In the common case, a given CPU will manipulate its private rcu_data structure and the rcu_node structure that it shares with its immediate neighbors. This can reduce both lock and memory contention by multiple orders of magnitude, which should eliminate the need for the strange manipulations that are reported to be required when running Linux on very large systems.

Some shortcomings:

o More bugs will probably surface as a result of an ongoing line-by-line code inspection. Patches will be provided as required.
o There are probably hangs, rcutorture failures, &c. Seems quite stable on a 128-CPU machine, but that is kind of small compared to 4096 CPUs. However, seems to do better than mainline. Patches will be provided as required.
o The memory footprint of this version is several KB larger than rcuclassic. A separate UP-only rcutiny patch will be provided, which will reduce the memory footprint significantly, even compared to the old rcuclassic. One such patch passes light testing, and has a memory footprint smaller even than rcuclassic. Initial reaction from various embedded guys was "it is not worth it", so am putting it aside.

Credits:

o Manfred Spraul for ideas, review comments, and bugs spotted, as well as some good friendly competition. ;-)
o Josh Triplett, Ingo Molnar, Peter Zijlstra, Mathieu Desnoyers, Lai Jiangshan, Andi Kleen, Andy Whitcroft, and Andrew Morton for reviews and comments.
o Thomas Gleixner for much-needed help with some timer issues (see patches below).
o Jon M. Tollefson, Tim Pepper, Andrew Theurer, Jose R. Santos, Andy Whitcroft, Darrick Wong, Nishanth Aravamudan, Anton Blanchard, Dave Kleikamp, and Nathan Lynch for keeping machines alive despite my heavy abuse^Wtesting.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
886 lines
30 KiB
Plaintext
config PRINTK_TIME
	bool "Show timing information on printks"
	depends on PRINTK
	help
	  Selecting this option causes timing information to be
	  included in printk output. This allows you to measure
	  the interval between kernel operations, including bootup
	  operations. This is useful for identifying long delays
	  in kernel startup.

config ENABLE_WARN_DEPRECATED
	bool "Enable __deprecated logic"
	default y
	help
	  Enable the __deprecated logic in the kernel build.
	  Disable this to suppress the "warning: 'foo' is deprecated
	  (declared at kernel/power/somefile.c:1234)" messages.

config ENABLE_MUST_CHECK
	bool "Enable __must_check logic"
	default y
	help
	  Enable the __must_check logic in the kernel build. Disable this to
	  suppress the "warning: ignoring return value of 'foo', declared with
	  attribute warn_unused_result" messages.

config FRAME_WARN
	int "Warn for stack frames larger than (needs gcc 4.4)"
	range 0 8192
	default 1024 if !64BIT
	default 2048 if 64BIT
	help
	  Tell gcc to warn at build time for stack frames larger than this.
	  Setting this too low will cause a lot of warnings.
	  Setting it to 0 disables the warning.
	  Requires gcc 4.4

config MAGIC_SYSRQ
	bool "Magic SysRq key"
	depends on !UML
	help
	  If you say Y here, you will have some control over the system even
	  if the system crashes for example during kernel debugging (e.g., you
	  will be able to flush the buffer cache to disk, reboot the system
	  immediately or dump some status information). This is accomplished
	  by pressing various keys while holding SysRq (Alt+PrintScreen). It
	  also works on a serial console (on PC hardware at least), if you
	  send a BREAK and then within 5 seconds a command keypress. The
	  keys are documented in <file:Documentation/sysrq.txt>. Don't say Y
	  unless you really know what this hack does.

config UNUSED_SYMBOLS
	bool "Enable unused/obsolete exported symbols"
	default y if X86
	help
	  Unused but exported symbols make the kernel needlessly bigger. For
	  that reason most of these unused exports will soon be removed. This
	  option is provided temporarily to provide a transition period in case
	  some external kernel module needs one of these symbols anyway. If you
	  encounter such a case in your module, consider if you are actually
	  using the right API. (rationale: since nobody in the kernel is using
	  this in a module, there is a pretty good chance it's actually the
	  wrong interface to use). If you really need the symbol, please send a
	  mail to the linux kernel mailing list mentioning the symbol and why
	  you really need it, and what the merge plan to the mainline kernel for
	  your module is.

config DEBUG_FS
	bool "Debug Filesystem"
	depends on SYSFS
	help
	  debugfs is a virtual file system that kernel developers use to put
	  debugging files into. Enable this option to be able to read and
	  write to these files.

	  For detailed documentation on the debugfs API, see
	  Documentation/DocBook/filesystems.

	  If unsure, say N.

config HEADERS_CHECK
	bool "Run 'make headers_check' when building vmlinux"
	depends on !UML
	help
	  This option will extract the user-visible kernel headers whenever
	  building the kernel, and will run basic sanity checks on them to
	  ensure that exported files do not attempt to include files which
	  were not exported, etc.

	  If you're making modifications to header files which are
	  relevant for userspace, say 'Y', and check the headers
	  exported to $(INSTALL_HDR_PATH) (usually 'usr/include' in
	  your build tree), to make sure they're suitable.

config DEBUG_SECTION_MISMATCH
	bool "Enable full Section mismatch analysis"
	depends on UNDEFINED
	# This option is on purpose disabled for now.
	# It will be enabled when we are down to a reasonable number
	# of section mismatch warnings (< 10 for an allyesconfig build)
	help
	  The section mismatch analysis checks if there are illegal
	  references from one section to another section.
	  Linux will during link or during runtime drop some sections
	  and any use of code/data previously in these sections will
	  most likely result in an oops.
	  In the code functions and variables are annotated with
	  __init, __devinit etc. (see full list in include/linux/init.h)
	  which results in the code/data being placed in specific sections.
	  The section mismatch analysis is always done after a full
	  kernel build but enabling this option will in addition
	  do the following:
	  - Add the option -fno-inline-functions-called-once to gcc
	    When inlining a function annotated __init in a non-init
	    function we would lose the section information and thus
	    the analysis would not catch the illegal reference.
	    This option tells gcc to inline less but will also
	    result in a larger kernel.
	  - Run the section mismatch analysis for each module/built-in.o
	    When we run the section mismatch analysis on vmlinux.o we
	    lose valuable information about where the mismatch was
	    introduced.
	    Running the analysis for each module/built-in.o file
	    will tell where the mismatch happens much closer to the
	    source. The drawback is that we will report the same
	    mismatch at least twice.
	  - Enable verbose reporting from modpost to help solving
	    the section mismatches reported.

config DEBUG_KERNEL
	bool "Kernel debugging"
	help
	  Say Y here if you are developing drivers or trying to debug and
	  identify kernel problems.

config DEBUG_SHIRQ
	bool "Debug shared IRQ handlers"
	depends on DEBUG_KERNEL && GENERIC_HARDIRQS
	help
	  Enable this to generate a spurious interrupt as soon as a shared
	  interrupt handler is registered, and just before one is deregistered.
	  Drivers ought to be able to handle interrupts coming in at those
	  points; some don't and need to be caught.

config DETECT_SOFTLOCKUP
	bool "Detect Soft Lockups"
	depends on DEBUG_KERNEL && !S390
	default y
	help
	  Say Y here to enable the kernel to detect "soft lockups",
	  which are bugs that cause the kernel to loop in kernel
	  mode for more than 60 seconds, without giving other tasks a
	  chance to run.

	  When a soft-lockup is detected, the kernel will print the
	  current stack trace (which you should report), but the
	  system will stay locked up. This feature has negligible
	  overhead.

	  (Note that "hard lockups" are a separate type of bug that
	  can be detected via the NMI-watchdog, on platforms that
	  support it.)

config BOOTPARAM_SOFTLOCKUP_PANIC
	bool "Panic (Reboot) On Soft Lockups"
	depends on DETECT_SOFTLOCKUP
	help
	  Say Y here to enable the kernel to panic on "soft lockups",
	  which are bugs that cause the kernel to loop in kernel
	  mode for more than 60 seconds, without giving other tasks a
	  chance to run.

	  The panic can be used in combination with panic_timeout,
	  to cause the system to reboot automatically after a
	  lockup has been detected. This feature is useful for
	  high-availability systems that have uptime guarantees and
	  where a lockup must be resolved ASAP.

	  Say N if unsure.

config BOOTPARAM_SOFTLOCKUP_PANIC_VALUE
	int
	depends on DETECT_SOFTLOCKUP
	range 0 1
	default 0 if !BOOTPARAM_SOFTLOCKUP_PANIC
	default 1 if BOOTPARAM_SOFTLOCKUP_PANIC

config SCHED_DEBUG
	bool "Collect scheduler debugging info"
	depends on DEBUG_KERNEL && PROC_FS
	default y
	help
	  If you say Y here, the /proc/sched_debug file will be provided
	  that can help debug the scheduler. The runtime overhead of this
	  option is minimal.

config SCHEDSTATS
	bool "Collect scheduler statistics"
	depends on DEBUG_KERNEL && PROC_FS
	help
	  If you say Y here, additional code will be inserted into the
	  scheduler and related routines to collect statistics about
	  scheduler behavior and provide them in /proc/schedstat. These
	  stats may be useful for both tuning and debugging the scheduler.
	  If you aren't debugging the scheduler or trying to tune a specific
	  application, you can say N to avoid the very slight overhead
	  this adds.

config TIMER_STATS
	bool "Collect kernel timers statistics"
	depends on DEBUG_KERNEL && PROC_FS
	help
	  If you say Y here, additional code will be inserted into the
	  timer routines to collect statistics about kernel timers being
	  reprogrammed. The statistics can be read from /proc/timer_stats.
	  The statistics collection is started by writing 1 to /proc/timer_stats,
	  writing 0 stops it. This feature is useful to collect information
	  about timer usage patterns in kernel and userspace. This feature
	  is lightweight if enabled in the kernel config but not activated
	  (it defaults to deactivated on bootup and will only be activated
	  if some application like powertop activates it explicitly).

config DEBUG_OBJECTS
	bool "Debug object operations"
	depends on DEBUG_KERNEL
	help
	  If you say Y here, additional code will be inserted into the
	  kernel to track the life time of various objects and validate
	  the operations on those objects.

config DEBUG_OBJECTS_SELFTEST
	bool "Debug objects selftest"
	depends on DEBUG_OBJECTS
	help
	  This enables the selftest of the object debug code.

config DEBUG_OBJECTS_FREE
	bool "Debug objects in freed memory"
	depends on DEBUG_OBJECTS
	help
	  This enables checks whether a k/v free operation frees an area
	  which contains an object which has not been deactivated
	  properly. This can make kmalloc/kfree-intensive workloads
	  much slower.

config DEBUG_OBJECTS_TIMERS
	bool "Debug timer objects"
	depends on DEBUG_OBJECTS
	help
	  If you say Y here, additional code will be inserted into the
	  timer routines to track the life time of timer objects and
	  validate the timer operations.

config DEBUG_SLAB
	bool "Debug slab memory allocations"
	depends on DEBUG_KERNEL && SLAB
	help
	  Say Y here to have the kernel do limited verification on memory
	  allocation as well as poisoning memory on free to catch use of freed
	  memory. This can make kmalloc/kfree-intensive workloads much slower.

config DEBUG_SLAB_LEAK
	bool "Memory leak debugging"
	depends on DEBUG_SLAB

config SLUB_DEBUG_ON
	bool "SLUB debugging on by default"
	depends on SLUB && SLUB_DEBUG
	default n
	help
	  Boot with debugging on by default. SLUB boots by default with
	  the runtime debug capabilities switched off. Enabling this is
	  equivalent to specifying the "slub_debug" parameter on boot.
	  There is no support for finer-grained debug control of the kind
	  possible with slub_debug=xxx. SLUB debugging may be switched
	  off in a kernel built with CONFIG_SLUB_DEBUG_ON by specifying
	  "slub_debug=-".

config SLUB_STATS
	default n
	bool "Enable SLUB performance statistics"
	depends on SLUB && SLUB_DEBUG && SYSFS
	help
	  SLUB statistics are useful to debug SLUB's allocation behavior in
	  order to find ways to optimize the allocator. This should never be
	  enabled for production use since keeping statistics slows down
	  the allocator by a few percentage points. The slabinfo command
	  supports the determination of the most active slabs to figure
	  out which slabs are relevant to a particular load.
	  Try running: slabinfo -DA

config DEBUG_PREEMPT
	bool "Debug preemptible kernel"
	depends on DEBUG_KERNEL && PREEMPT && (TRACE_IRQFLAGS_SUPPORT || PPC64)
	default y
	help
	  If you say Y here then the kernel will use a debug variant of the
	  commonly used smp_processor_id() function and will print warnings
	  if kernel code uses it in a preemption-unsafe way. Also, the kernel
	  will detect preemption count underflows.

config DEBUG_RT_MUTEXES
	bool "RT Mutex debugging, deadlock detection"
	depends on DEBUG_KERNEL && RT_MUTEXES
	help
	  This allows rt mutex semantics violations and rt mutex related
	  deadlocks (lockups) to be detected and reported automatically.

config DEBUG_PI_LIST
	bool
	default y
	depends on DEBUG_RT_MUTEXES

config RT_MUTEX_TESTER
	bool "Built-in scriptable tester for rt-mutexes"
	depends on DEBUG_KERNEL && RT_MUTEXES
	help
	  This option enables an rt-mutex tester.

config DEBUG_SPINLOCK
	bool "Spinlock and rw-lock debugging: basic checks"
	depends on DEBUG_KERNEL
	help
	  Say Y here and build SMP to catch missing spinlock initialization
	  and certain other kinds of spinlock errors commonly made. This is
	  best used in conjunction with the NMI watchdog so that spinlock
	  deadlocks are also debuggable.

config DEBUG_MUTEXES
	bool "Mutex debugging: basic checks"
	depends on DEBUG_KERNEL
	help
	  This feature allows mutex semantics violations to be detected and
	  reported.

config DEBUG_LOCK_ALLOC
	bool "Lock debugging: detect incorrect freeing of live locks"
	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
	select DEBUG_SPINLOCK
	select DEBUG_MUTEXES
	select LOCKDEP
	help
	  This feature will check whether any held lock (spinlock, rwlock,
	  mutex or rwsem) is incorrectly freed by the kernel, via any of the
	  memory-freeing routines (kfree(), kmem_cache_free(), free_pages(),
	  vfree(), etc.), whether a live lock is incorrectly reinitialized via
	  spin_lock_init()/mutex_init()/etc., or whether there is any lock
	  held during task exit.

config PROVE_LOCKING
	bool "Lock debugging: prove locking correctness"
	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
	select LOCKDEP
	select DEBUG_SPINLOCK
	select DEBUG_MUTEXES
	select DEBUG_LOCK_ALLOC
	default n
	help
	  This feature enables the kernel to prove that all locking
	  that occurs in the kernel runtime is mathematically
	  correct: that under no circumstance could an arbitrary (and
	  not yet triggered) combination of observed locking
	  sequences (on an arbitrary number of CPUs, running an
	  arbitrary number of tasks and interrupt contexts) cause a
	  deadlock.

	  In short, this feature enables the kernel to report locking
	  related deadlocks before they actually occur.

	  The proof does not depend on how hard and complex a
	  deadlock scenario would be to trigger: how many
	  participant CPUs, tasks and irq-contexts would be needed
	  for it to trigger. The proof also does not depend on
	  timing: if a race and a resulting deadlock is possible
	  theoretically (no matter how unlikely the race scenario
	  is), it will be proven so and will immediately be
	  reported by the kernel (once the event is observed that
	  makes the deadlock theoretically possible).

	  If a deadlock is impossible (i.e. the locking rules, as
	  observed by the kernel, are mathematically correct), the
	  kernel reports nothing.

	  NOTE: this feature can also be enabled for rwlocks, mutexes
	  and rwsems - in which case all dependencies between these
	  different locking variants are observed and mapped too, and
	  the proof of observed correctness is also maintained for an
	  arbitrary combination of these separate locking variants.

	  For more details, see Documentation/lockdep-design.txt.

config LOCKDEP
	bool
	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
	select STACKTRACE
	select FRAME_POINTER if !X86 && !MIPS && !PPC
	select KALLSYMS
	select KALLSYMS_ALL

config LOCK_STAT
	bool "Lock usage statistics"
	depends on DEBUG_KERNEL && TRACE_IRQFLAGS_SUPPORT && STACKTRACE_SUPPORT && LOCKDEP_SUPPORT
	select LOCKDEP
	select DEBUG_SPINLOCK
	select DEBUG_MUTEXES
	select DEBUG_LOCK_ALLOC
	default n
	help
	  This feature enables tracking lock contention points.

	  For more details, see Documentation/lockstat.txt.

config DEBUG_LOCKDEP
	bool "Lock dependency engine debugging"
	depends on DEBUG_KERNEL && LOCKDEP
	help
	  If you say Y here, the lock dependency engine will do
	  additional runtime checks to debug itself, at the price
	  of more runtime overhead.

config TRACE_IRQFLAGS
	depends on DEBUG_KERNEL
	bool
	default y
	depends on TRACE_IRQFLAGS_SUPPORT
	depends on PROVE_LOCKING

config DEBUG_SPINLOCK_SLEEP
	bool "Spinlock debugging: sleep-inside-spinlock checking"
	depends on DEBUG_KERNEL
	help
	  If you say Y here, various routines which may sleep will become very
	  noisy if they are called with a spinlock held.

config DEBUG_LOCKING_API_SELFTESTS
	bool "Locking API boot-time self-tests"
	depends on DEBUG_KERNEL
	help
	  Say Y here if you want the kernel to run a short self-test during
	  bootup. The self-test checks whether common types of locking bugs
	  are detected by debugging mechanisms or not. (If you disable
	  lock debugging, those bugs won't be detected, of course.)
	  The following locking APIs are covered: spinlocks, rwlocks,
	  mutexes and rwsems.

config STACKTRACE
	bool
	depends on STACKTRACE_SUPPORT

config DEBUG_KOBJECT
	bool "kobject debugging"
	depends on DEBUG_KERNEL
	help
	  If you say Y here, some extra kobject debugging messages will be sent
	  to the syslog.

config DEBUG_HIGHMEM
	bool "Highmem debugging"
	depends on DEBUG_KERNEL && HIGHMEM
	help
	  This option enables additional error checking for high memory systems.
	  Disable for production systems.

config DEBUG_BUGVERBOSE
	bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EMBEDDED
	depends on BUG
	depends on ARM || AVR32 || M32R || M68K || SPARC32 || SPARC64 || \
		   FRV || SUPERH || GENERIC_BUG || BLACKFIN || MN10300
	default !EMBEDDED
	help
	  Say Y here to make BUG() panics output the file name and line number
	  of the BUG call as well as the EIP and oops trace. This aids
	  debugging but costs about 70-100K of memory.

config DEBUG_INFO
	bool "Compile the kernel with debug info"
	depends on DEBUG_KERNEL
	help
	  If you say Y here the resulting kernel image will include
	  debugging info resulting in a larger kernel image.
	  This adds debug symbols to the kernel and modules (gcc -g), and
	  is needed if you intend to use kernel crashdump or binary object
	  tools like crash, kgdb, LKCD, gdb, etc on the kernel.
	  Say Y here only if you plan to debug the kernel.

	  If unsure, say N.

config DEBUG_VM
	bool "Debug VM"
	depends on DEBUG_KERNEL
	help
	  Enable this to turn on extended checks in the virtual-memory system
	  that may impact performance.

	  If unsure, say N.

config DEBUG_VIRTUAL
	bool "Debug VM translations"
	depends on DEBUG_KERNEL && X86
	help
	  Enable some costly sanity checks in virtual to page code. This can
	  catch mistakes with virt_to_page() and friends.

	  If unsure, say N.

config DEBUG_WRITECOUNT
	bool "Debug filesystem writers count"
	depends on DEBUG_KERNEL
	help
	  Enable this to catch wrong use of the writers count in struct
	  vfsmount. This will increase the size of each file struct by
	  32 bits.

	  If unsure, say N.

config DEBUG_MEMORY_INIT
	bool "Debug memory initialisation" if EMBEDDED
	default !EMBEDDED
	help
	  Enable this for additional checks during memory initialisation.
	  The sanity checks verify aspects of the VM such as the memory model
	  and other information provided by the architecture. Verbose
	  information will be printed at KERN_DEBUG loglevel depending
	  on the mminit_loglevel= command-line option.

	  If unsure, say Y

config DEBUG_LIST
	bool "Debug linked list manipulation"
	depends on DEBUG_KERNEL
	help
	  Enable this to turn on extended checks in the linked-list
	  walking routines.

	  If unsure, say N.

config DEBUG_SG
	bool "Debug SG table operations"
	depends on DEBUG_KERNEL
	help
	  Enable this to turn on checks on scatter-gather tables. This can
	  help find problems with drivers that do not properly initialize
	  their sg tables.

	  If unsure, say N.

config FRAME_POINTER
	bool "Compile the kernel with frame pointers"
	depends on DEBUG_KERNEL && \
		(X86 || CRIS || M68K || M68KNOMMU || FRV || UML || S390 || \
		 AVR32 || SUPERH || BLACKFIN || MN10300)
	default y if DEBUG_INFO && UML
	help
	  If you say Y here the resulting kernel image will be slightly larger
	  and slower, but it might give very useful debugging information on
	  some architectures or if you use external debuggers.
	  If you don't debug the kernel, you can say N.

config BOOT_PRINTK_DELAY
	bool "Delay each boot printk message by N milliseconds"
	depends on DEBUG_KERNEL && PRINTK && GENERIC_CALIBRATE_DELAY
	help
	  This build option allows you to read kernel boot messages
	  by inserting a short delay after each one. The delay is
	  specified in milliseconds on the kernel command line,
	  using "boot_delay=N".

	  It is likely that you would also need to use "lpj=M" to preset
	  the "loops per jiffy" value.
	  See a previous boot log for the "lpj" value to use for your
	  system, and then set "lpj=M" before setting "boot_delay=N".
	  NOTE: Using this option may adversely affect SMP systems.
	  I.e., processors other than the first one may not boot up.
	  BOOT_PRINTK_DELAY also may cause DETECT_SOFTLOCKUP to detect
	  what it believes to be lockup conditions.

|
|
config RCU_TORTURE_TEST
	tristate "torture tests for RCU"
	depends on DEBUG_KERNEL
	default n
	help
	  This option provides a kernel module that runs torture tests
	  on the RCU infrastructure. The kernel module may be built
	  after the fact on the running kernel to be tested, if desired.

	  Say Y here if you want RCU torture tests to be built into
	  the kernel.
	  Say M if you want the RCU torture tests to build as a module.
	  Say N if you are unsure.

config RCU_TORTURE_TEST_RUNNABLE
	bool "torture tests for RCU runnable by default"
	depends on RCU_TORTURE_TEST = y
	default n
	help
	  This option provides a way to build the RCU torture tests
	  directly into the kernel without them starting up at boot
	  time. You can use /proc/sys/kernel/rcutorture_runnable
	  to manually override this setting. This /proc file is
	  available only when the RCU torture tests have been built
	  into the kernel.

	  Say Y here if you want the RCU torture tests to start during
	  boot (you probably don't).
	  Say N here if you want the RCU torture tests to start only
	  after being manually enabled via /proc.

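# Example, assuming RCU_TORTURE_TEST=y and RCU_TORTURE_TEST_RUNNABLE=n:
# start the built-in torture tests by hand from a root shell, and stop
# them again, through the /proc file named in the help text. The values
# 1 and 0 are assumed here to mean "run" and "stop".
#
#	echo 1 > /proc/sys/kernel/rcutorture_runnable
#	echo 0 > /proc/sys/kernel/rcutorture_runnable
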
config RCU_CPU_STALL_DETECTOR
	bool "Check for stalled CPUs delaying RCU grace periods"
	depends on CLASSIC_RCU || TREE_RCU
	default n
	help
	  This option causes RCU to printk information on which
	  CPUs are delaying the current grace period, but only when
	  the grace period extends for excessive time periods.

	  Say Y if you want RCU to perform such checks.

	  Say N if you are unsure.

config KPROBES_SANITY_TEST
	bool "Kprobes sanity tests"
	depends on DEBUG_KERNEL
	depends on KPROBES
	default n
	help
	  This option provides for testing basic kprobes functionality on
	  boot. A sample kprobe, jprobe and kretprobe are inserted and
	  verified for functionality.

	  Say N if you are unsure.

config BACKTRACE_SELF_TEST
	tristate "Self test for the backtrace code"
	depends on DEBUG_KERNEL
	default n
	help
	  This option provides a kernel module that can be used to test
	  the kernel stack backtrace code. This option is not useful
	  for distributions or general kernels, but only for kernel
	  developers working on architecture code.

	  Note that if you want to also test saved backtraces, you will
	  have to enable STACKTRACE as well.

	  Say N if you are unsure.

config DEBUG_BLOCK_EXT_DEVT
	bool "Force extended block device numbers and spread them"
	depends on DEBUG_KERNEL
	depends on BLOCK
	default n
	help
	  BIG FAT WARNING: ENABLING THIS OPTION MIGHT BREAK BOOTING ON
	  SOME DISTRIBUTIONS. DO NOT ENABLE THIS UNLESS YOU KNOW WHAT
	  YOU ARE DOING. Distros, please enable this and fix whatever
	  is broken.

	  Conventionally, block device numbers are allocated from a
	  predetermined contiguous area. However, the extended block area
	  may introduce non-contiguous block device numbers. This
	  option forces most block device numbers to be allocated from
	  the extended space and spreads them to discover kernel or
	  userland code paths which assume predetermined contiguous
	  device number allocation.

	  Note that turning on this debug option shuffles all the
	  device numbers for all IDE and SCSI devices including libata
	  ones, so a root partition specified using a device number
	  directly (via rdev or root=MAJ:MIN) won't work anymore.
	  Textual device names (root=/dev/sdXn) will continue to work.

	  Say N if you are unsure.

config LKDTM
	tristate "Linux Kernel Dump Test Tool Module"
	depends on DEBUG_KERNEL
	depends on KPROBES
	depends on BLOCK
	default n
	help
	  This module enables testing of the different dumping mechanisms by
	  inducing system failures at predefined crash points.
	  If you don't need it, say N.
	  Choose M here to compile this code as a module. The module will be
	  called lkdtm.

	  Documentation on how to use the module can be found in
	  drivers/misc/lkdtm.c

config FAULT_INJECTION
	bool "Fault-injection framework"
	depends on DEBUG_KERNEL
	help
	  Provide fault-injection framework.
	  For more details, see Documentation/fault-injection/.

config FAILSLAB
	bool "Fault-injection capability for kmalloc"
	depends on FAULT_INJECTION
	help
	  Provide fault-injection capability for kmalloc.

config FAIL_PAGE_ALLOC
	bool "Fault-injection capability for alloc_pages()"
	depends on FAULT_INJECTION
	help
	  Provide fault-injection capability for alloc_pages().

config FAIL_MAKE_REQUEST
	bool "Fault-injection capability for disk IO"
	depends on FAULT_INJECTION && BLOCK
	help
	  Provide fault-injection capability for disk IO.

config FAIL_IO_TIMEOUT
	bool "Fault-injection capability for faking disk interrupts"
	depends on FAULT_INJECTION && BLOCK
	help
	  Provide fault-injection capability on end IO handling. This
	  will make the block layer "forget" an interrupt as configured,
	  thus exercising the error handling.

	  Only works with drivers that use the generic timeout handling;
	  for others it won't do anything.

config FAULT_INJECTION_DEBUG_FS
	bool "Debugfs entries for fault-injection capabilities"
	depends on FAULT_INJECTION && SYSFS && DEBUG_FS
	help
	  Enable configuration of fault-injection capabilities via debugfs.

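# Example .config fragment, a sketch that simply follows the "depends on"
# lines of the fault-injection entries: kmalloc fault injection with its
# debugfs controls enabled. Other FAIL_* options would be added the same way.
#
#	CONFIG_DEBUG_KERNEL=y
#	CONFIG_SYSFS=y
#	CONFIG_DEBUG_FS=y
#	CONFIG_FAULT_INJECTION=y
#	CONFIG_FAILSLAB=y
#	CONFIG_FAULT_INJECTION_DEBUG_FS=y
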
config FAULT_INJECTION_STACKTRACE_FILTER
	bool "stacktrace filter for fault-injection capabilities"
	depends on FAULT_INJECTION_DEBUG_FS && STACKTRACE_SUPPORT
	depends on !X86_64
	select STACKTRACE
	select FRAME_POINTER if !PPC
	help
	  Provide stacktrace filter for fault-injection capabilities.

config LATENCYTOP
	bool "Latency measuring infrastructure"
	select FRAME_POINTER if !MIPS && !PPC
	select KALLSYMS
	select KALLSYMS_ALL
	select STACKTRACE
	select SCHEDSTATS
	select SCHED_DEBUG
	depends on HAVE_LATENCYTOP_SUPPORT
	help
	  Enable this option if you want to use the LatencyTOP tool
	  to find out which userspace tasks are blocking on which kernel
	  operations.

config SYSCTL_SYSCALL_CHECK
	bool "Sysctl checks"
	depends on SYSCTL_SYSCALL
	---help---
	  sys_sysctl uses binary paths that have been found challenging
	  to properly maintain and use. This enables checks that help
	  you to keep things correct.

source kernel/trace/Kconfig

config PROVIDE_OHCI1394_DMA_INIT
	bool "Remote debugging over FireWire early on boot"
	depends on PCI && X86
	help
	  If you want to debug problems which hang or crash the kernel early
	  on boot and the crashing machine has a FireWire port, you can use
	  this feature to remotely access the memory of the crashed machine
	  over FireWire. This employs remote DMA as part of the OHCI1394
	  specification which is now the standard for FireWire controllers.

	  With remote DMA, you can monitor the printk buffer remotely using
	  firescope and access all memory below 4GB using fireproxy from gdb.
	  Even controlling a kernel debugger is possible using remote DMA.

	  Usage:

	  If ohci1394_dma=early is used as a boot parameter, it will initialize
	  all OHCI1394 controllers which are found in the PCI config space.

	  As all changes to the FireWire bus such as enabling and disabling
	  devices cause a bus reset and thereby disable remote DMA for all
	  devices, be sure to have the cable plugged and FireWire enabled on
	  the debugging host before booting the debug target.

	  This code (~1k) is freed after boot. By then, the firewire stack
	  in charge of the OHCI-1394 controllers should be used instead.

	  See Documentation/debugging-via-ohci1394.txt for more information.

config FIREWIRE_OHCI_REMOTE_DMA
	bool "Remote debugging over FireWire with firewire-ohci"
	depends on FIREWIRE_OHCI
	help
	  This option lets you use the FireWire bus for remote debugging
	  with the help of the firewire-ohci driver. It enables unfiltered
	  remote DMA in firewire-ohci.
	  See Documentation/debugging-via-ohci1394.txt for more information.

	  If unsure, say N.

menuconfig BUILD_DOCSRC
	bool "Build targets in Documentation/ tree"
	depends on HEADERS_CHECK
	help
	  This option attempts to build objects from the source files in the
	  kernel Documentation/ tree.

	  Say N if you are unsure.

config DYNAMIC_PRINTK_DEBUG
	bool "Enable dynamic printk() call support"
	default n
	depends on PRINTK
	select PRINTK_DEBUG
	help
	  Compiles debug level messages into the kernel, which would not
	  otherwise be available at runtime. These messages can then be
	  enabled/disabled on a per-module basis. This mechanism implicitly
	  enables all pr_debug() and dev_dbg() calls. The impact of this
	  compile option is a larger kernel text size of about 2%.

	  Usage:

	  Dynamic debugging is controlled by the debugfs file,
	  dynamic_printk/modules. This file contains a list of the modules that
	  can be enabled. The format of the file is the module name, followed
	  by a set of flags that can be enabled. The first flag is always the
	  'enabled' flag. For example:

	      <module_name> <enabled=0/1>
	              .
	              .
	              .

	  <module_name> : Name of the module in which the debug call resides
	  <enabled=0/1> : whether the messages are enabled or not

	  From a live system:

	      snd_hda_intel enabled=0
	      fixup enabled=0
	      driver enabled=0

	  Enable a module:

	      $ echo "set enabled=1 <module_name>" > dynamic_printk/modules

	  Disable a module:

	      $ echo "set enabled=0 <module_name>" > dynamic_printk/modules

	  Enable all modules:

	      $ echo "set enabled=1 all" > dynamic_printk/modules

	  Disable all modules:

	      $ echo "set enabled=0 all" > dynamic_printk/modules

	  Finally, passing "dynamic_printk" at the command line enables
	  debugging for all modules. This mode can be turned off via the above
	  disable command.

source "samples/Kconfig"

source "lib/Kconfig.kgdb"