908 Commits

Author SHA1 Message Date
Namhyung Kim
1fa4f3b57c init/main.c: remove BKL notations
According to commit 5e3d20a68f63fc5a310687d81956c3b96e488b84
(init: Remove the BKL from startup code) these sparse notations
should be removed also.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-19 11:29:58 +02:00
Peter Zijlstra
e360adbe29 irq_work: Add generic hardirq context callbacks
Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- like wakeup a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this get to do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
[ various fixes ]
Signed-off-by: Huang Ying <ying.huang@intel.com>
LKML-Reference: <1287036094.7768.291.camel@yhuang-dev>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-18 19:58:50 +02:00
Thomas Gleixner
154cd387cd genirq: Remove early_init_irq_lock_class()
early_init_irq_lock_class() is called way before anything touches the
irq descriptors. In case of SPARSE_IRQ=y this is a NOP operation
because the radix tree is empty at this point. For the SPARSE_IRQ=n
case it's sufficient to set the lock class in early_init_irq(). 

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-12 16:39:06 +02:00
Thomas Gleixner
d9817ebeee genirq: Provide Kconfig
The generic irq Kconfig options are copied around all archs. Provide a
generic Kconfig file which can be included.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20100927121843.217333624@linutronix.de>
Reviewed-by: H. Peter Anvin <hpa@zytor.com>
Reviewed-by: Ingo Molnar <mingo@elte.hu>
2010-10-04 11:01:05 +02:00
Hendrik Brueckner
ffe8018c34 initramfs: fix initramfs size calculation
The size of a built-in initramfs is calculated in init/initramfs.c by
"__initramfs_end - __initramfs_start".  Those symbols are defined in the
linker script include/asm-generic/vmlinux.lds.h:

#define INIT_RAM_FS                                                     \
        . = ALIGN(PAGE_SIZE);                                           \
        VMLINUX_SYMBOL(__initramfs_start) = .;                          \
        *(.init.ramfs)                                                  \
        VMLINUX_SYMBOL(__initramfs_end) = .;

If the initramfs file has an odd number of bytes, the "__initramfs_end"
symbol points to an odd address, for example, the symbols in the
System.map might look like:

    0000000000572000 T __initramfs_start
    00000000005bcd05 T __initramfs_end	  <-- odd address

At least on s390 this causes a problem:

Certain s390 instructions, especially instructions for loading addresses
(larl) or branch addresses must be on even addresses.  The compiler loads
the symbol addresses with the "larl" instruction.  This instruction sets
the last bit to 0 and, therefore, for odd size files, the calculated size
is one byte less than it should be:

    0000000000540a9c <populate_rootfs>:
      540a9c:     eb cf f0 78 00 24       stmg    %r12,%r15,120(%r15),
      540aa2:     c0 10 00 01 8a af       larl    %r1,572000 <__initramfs_start>
      540aa8:     c0 c0 00 03 e1 2e       larl    %r12,5bcd04 <initramfs_end>
                                                  (Instead of  5bcd05)
      ...
      540abe:     1b c1                   sr      %r12,%r1

To fix the problem, this patch introduces the global variable
__initramfs_size, which is calculated in the "usr/initramfs_data.S" file.
The populate_rootfs() function can then use the start marker of the
.init.ramfs section and the value of __initramfs_size for loading the
initramfs.  Because the start marker and size is sufficient, the
__initramfs_end symbol is no longer needed and is removed.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2010-09-29 16:28:59 +02:00
Chuck Lever
56463e50d1 NFS: Use super.c for NFSROOT mount option parsing
Replace duplicate code in NFSROOT for mounting an NFS server on '/'
with logic that uses the existing mainline text-based logic in the NFS
client.

Add documenting comments where appropriate.

Note that this means NFSROOT mounts now use the same default settings
as v2/v3 mounts done via mount(2) from user space.

  vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default>

As before, however, no version/protocol negotiation with the server is
done.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2010-09-17 10:54:37 -04:00
Jens Axboe
6d0aed7a38 do_mounts: only enable PARTUUID for CONFIG_BLOCK
When CONFIG_BLOCK is not enabled:

init/do_mounts.c:71: error: implicit declaration of function 'dev_to_part'
init/do_mounts.c:71: warning: initialization makes pointer from integer without a cast
init/do_mounts.c:73: error: dereferencing pointer to incomplete type
init/do_mounts.c:76: error: dereferencing pointer to incomplete type
init/do_mounts.c:76: error: dereferencing pointer to incomplete type
init/do_mounts.c:102: error: implicit declaration of function 'part_pack_uuid'
init/do_mounts.c:104: error: 'block_class' undeclared (first use in this function)

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-17 10:00:46 +02:00
Vivek Goyal
e43473b7f2 blkio: Core implementation of throttle policy
o Actual implementation of throttling policy in block layer. Currently it
  implements READ and WRITE bytes per second throttling logic. IOPS throttling
  comes in later patches.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-16 08:42:52 +02:00
Jens Axboe
38b6f45a97 core: match_dev_by_uuid() should not be marked __init
It is also called outside the scope of init functions. Stephen
reports:

WARNING: init/mounts.o(.text+0x21a): Section mismatch in reference from the function name_to_dev_t() to the function .init.text:match_dev_by_uuid()
The function name_to_dev_t() references
the function __init match_dev_by_uuid().
This is often because name_to_dev_t lacks a __init
annotation or the annotation of match_dev_by_uuid is wrong.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-16 08:33:54 +02:00
Will Drewry
b5af921ec0 init: add support for root devices specified by partition UUID
This is the third patch in a series which adds support for
storing partition metadata, optionally, off of the hd_struct.

One major use for that data is being able to resolve partition
by other identities than just the index on a block device.  Device
enumeration varies by platform and there's a benefit to being able
to use something like EFI GPT's GUIDs to determine the correct
block device and partition to mount as the root.

This change adds that support to root= by adding support for
the following syntax:

  root=PARTUUID=hex-uuid

Signed-off-by: Will Drewry <wad@chromium.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-09-15 16:14:03 +02:00
Stephan Sperber
681b3049dd Kconfig: delete duplicate word
Deleted a word which apeared twice.

Signed-off-by: Stephan Sperber <sperberstephan@googlemail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2010-08-23 15:35:15 +02:00
Paul E. McKenney
a57eb940d1 rcu: Add a TINY_PREEMPT_RCU
Implement a small-memory-footprint uniprocessor-only implementation of
preemptible RCU.  This implementation uses but a single blocked-tasks
list rather than the combinatorial number used per leaf rcu_node by
TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
processing.  This version also takes advantage of uniprocessor execution
to accelerate grace periods in the case where there are no readers.

The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.

This implementation is a step towards having RCU implementation driven
off of the SMP and PREEMPT kernel configuration variables, which can
happen once this implementation has accumulated sufficient experience.

Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
suggested by Steve Rostedt in order to avoid the compiler-reordering
issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).

As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte
savings compared to CONFIG_TREE_PREEMPT_RCU.  Of course, for non-real-time
workloads, CONFIG_TINY_RCU is even better.

	CONFIG_TREE_PREEMPT_RCU

	   text	   data	    bss	    dec	   filename
	     13	      0	      0	     13	   kernel/rcupdate.o
	   6170	    825	     28	   7023	   kernel/rcutree.o
				   ----
				   7026    Total

	CONFIG_TINY_PREEMPT_RCU

	   text	   data	    bss	    dec	   filename
	     13	      0	      0	     13	   kernel/rcupdate.o
	   2081	     81	      8	   2170	   kernel/rcutiny.o
				   ----
				   2183    Total

	CONFIG_TINY_RCU (non-preemptible)

	   text	   data	    bss	    dec	   filename
	     13	      0	      0	     13	   kernel/rcupdate.o
	    719	     25	      0	    744	   kernel/rcutiny.o
				    ---
				    757    Total

Requested-by: Loïc Minier <loic.minier@canonical.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-08-20 08:55:00 -07:00
Paul E. McKenney
4d87ffadbb rcu: Fix RCU_FANOUT help message
Commit cf244dc01bf68 added a fourth level to the TREE_RCU hierarchy,
but the RCU_FANOUT help message still said "cube root".  This commit
fixes this to "fourth root" and also emphasizes that production
systems are well-served by the default.  (Stress-testing RCU itself
uses small RCU_FANOUT values in order to test large-system code paths
on small(er) systems.)

Located-by: John Kacur <jkacur@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-08-19 17:18:04 -07:00
Paul E. McKenney
687d7a960a rcu: restrict TREE_RCU to SMP builds with !PREEMPT
Because both TINY_RCU and TREE_PREEMPT_RCU have been in mainline for
several releases, it is time to restrict the use of TREE_RCU to SMP
non-preemptible systems.  This reduces testing/validation effort.  This
commit is a first step towards driving the selection of RCU implementation
directly off of the SMP and PREEMPT configuration parameters.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2010-08-19 17:18:04 -07:00
David Howells
d7627467b7 Make do_execve() take a const filename pointer
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to.  This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel().  A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.

Further kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-17 18:07:43 -07:00
Linus Torvalds
26df0766a7 Merge branch 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus
* 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (22 commits)
  param: don't deref arg in __same_type() checks
  param: update drivers/acpi/debug.c to new scheme
  param: use module_param in drivers/message/fusion/mptbase.c
  ide: use module_param_named rather than module_param_call
  param: update drivers/char/ipmi/ipmi_watchdog.c to new scheme
  param: lock if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes.
  param: lock myri10ge_fw_name against sysfs changes.
  param: simple locking for sysfs-writable charp parameters
  param: remove unnecessary writable charp
  param: add kerneldoc to moduleparam.h
  param: locking for kernel parameters
  param: make param sections const.
  param: use free hook for charp (fix leak of charp parameters)
  param: add a free hook to kernel_param_ops.
  param: silence .init.text references from param ops
  Add param ops struct for hvc_iucv driver.
  nfs: update for module_param_named API change
  AppArmor: update for module_param_named API change
  param: use ops in struct kernel_param, rather than get and set fns directly
  param: move the EXPORT_SYMBOL to after the definitions.
  ...
2010-08-12 10:01:59 -07:00
KAMEZAWA Hiroyuki
65e0e81166 memcg: remove experimental from swap account config
It's 11 months since we changed swap_map[] to indicates SWAP_HAS_CACHE.
Since that, memcg's swap accounting has been very stable and it seems
it can be maintained.

So, I'd like to remove EXPERIMENTAL from the config.

Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-11 08:59:18 -07:00
Rusty Russell
914dcaa84c param: make param sections const.
Since this section can be read-only (they're in .rodata), they should
always have been const.  Minor flow-through various functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Phil Carmody <ext-phil.2.carmody@nokia.com>
2010-08-11 23:04:19 +09:30
Linus Torvalds
8c8946f509 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify
* 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
  fanotify: use both marks when possible
  fsnotify: pass both the vfsmount mark and inode mark
  fsnotify: walk the inode and vfsmount lists simultaneously
  fsnotify: rework ignored mark flushing
  fsnotify: remove global fsnotify groups lists
  fsnotify: remove group->mask
  fsnotify: remove the global masks
  fsnotify: cleanup should_send_event
  fanotify: use the mark in handler functions
  audit: use the mark in handler functions
  dnotify: use the mark in handler functions
  inotify: use the mark in handler functions
  fsnotify: send fsnotify_mark to groups in event handling functions
  fsnotify: Exchange list heads instead of moving elements
  fsnotify: srcu to protect read side of inode and vfsmount locks
  fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
  fsnotify: use _rcu functions for mark list traversal
  fsnotify: place marks on object in order of group memory address
  vfs/fsnotify: fsnotify_close can delay the final work in fput
  fsnotify: store struct file not struct path
  ...

Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.
2010-08-10 11:39:13 -07:00
Kevin Winchester
e446127134 init/main.c: mark do_one_initcall* as __init_or_module
Andrew Morton suggested that the do_one_initcall and do_one_initcall_debug
functions can be marked __init_or_module such that they can be discarded
for the CONFIG_MODULES=N case.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:06 -07:00
Kevin Winchester
22c5c03b42 init/main.c: fix warning: 'calltime.tv64' may be used uninitialized
Using:

	gcc (GCC) 4.5.0 20100610 (prerelease)

The following warning appears:

	init/main.c: In function `do_one_initcall':
	init/main.c:730:10: warning: `calltime.tv64' may be used uninitialized in this function

This warning is actually correct, as the global initcall_debug could
arguably be changed by the initcall.

Correct this warning by extracting a new function, do_one_initcall_debug,
that performs the initcall for the debug case.

Signed-off-by: Kevin Winchester <kjwinchester@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:06 -07:00
Linus Torvalds
78417334b5 Merge branch 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
* 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing:
  do_coredump: Do not take BKL
  init: Remove the BKL from startup code
2010-08-07 17:06:54 -07:00
Linus Torvalds
3b7433b8a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits)
  workqueue: mark init_workqueues() as early_initcall()
  workqueue: explain for_each_*cwq_cpu() iterators
  fscache: fix build on !CONFIG_SYSCTL
  slow-work: kill it
  gfs2: use workqueue instead of slow-work
  drm: use workqueue instead of slow-work
  cifs: use workqueue instead of slow-work
  fscache: drop references to slow-work
  fscache: convert operation to use workqueue instead of slow-work
  fscache: convert object to use workqueue instead of slow-work
  workqueue: fix how cpu number is stored in work->data
  workqueue: fix mayday_mask handling on UP
  workqueue: fix build problem on !CONFIG_SMP
  workqueue: fix locking in retry path of maybe_create_worker()
  async: use workqueue for worker pool
  workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead
  workqueue: implement unbound workqueue
  workqueue: prepare for WQ_UNBOUND implementation
  libata: take advantage of cmwq and remove concurrency limitations
  workqueue: fix worker management invocation without pending works
  ...

Fixed up conflicts in fs/cifs/* as per Tejun. Other trivial conflicts in
include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c
2010-08-07 12:42:58 -07:00
Linus Torvalds
4aed2fd8e3 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits)
  tracing/kprobes: unregister_trace_probe needs to be called under mutex
  perf: expose event__process function
  perf events: Fix mmap offset determination
  perf, powerpc: fsl_emb: Restore setting perf_sample_data.period
  perf, powerpc: Convert the FSL driver to use local64_t
  perf tools: Don't keep unreferenced maps when unmaps are detected
  perf session: Invalidate last_match when removing threads from rb_tree
  perf session: Free the ref_reloc_sym memory at the right place
  x86,mmiotrace: Add support for tracing STOS instruction
  perf, sched migration: Librarize task states and event headers helpers
  perf, sched migration: Librarize the GUI class
  perf, sched migration: Make the GUI class client agnostic
  perf, sched migration: Make it vertically scrollable
  perf, sched migration: Parameterize cpu height and spacing
  perf, sched migration: Fix key bindings
  perf, sched migration: Ignore unhandled task states
  perf, sched migration: Handle ignored migrate out events
  perf: New migration tool overview
  tracing: Drop cpparg() macro
  perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call
  ...

Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c
2010-08-06 09:30:52 -07:00
Linus Torvalds
ffd386a9a8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: allow limited allocation before slab is online
  percpu: make @dyn_size always mean min dyn_size in first chunk init functions
2010-08-04 15:17:52 -07:00
Suresh Siddha
6ee0578b4d workqueue: mark init_workqueues() as early_initcall()
Mark init_workqueues() as early_initcall() and thus it will be initialized
before smp bringup. init_workqueues() registers for the hotcpu notifier
and thus it should cope with the processors that are brought online after
the workqueues are initialized.

x86 smp bringup code uses workqueues and uses a workaround for the
cold boot process (as the workqueues are initialized post smp_init()).
Marking init_workqueues() as early_initcall() will pave the way for
cleaning up this code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
2010-08-01 13:05:29 +02:00
Eric Paris
939a67fc4c Audit: split audit watch Kconfig
Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select
FSNOTIFY.  This splits the spagetti like mixing of audit_watch and
audit_filter code so they can be configured seperately.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:19 -04:00
Eric Paris
67640b602f Audit: audit watches depend on fsnotify
CONFIG_AUDIT builds audit_watches which depend on fsnotify.  Make
CONFIG_AUDIT select fsnotify.

Reported-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:18 -04:00
Eric Paris
28a3a7eb3b audit: reimplement audit_trees using fsnotify rather than inotify
Simply switch audit_trees from using inotify to using fsnotify for it's
inode pinning and disappearing act information.

Signed-off-by: Eric Paris <eparis@redhat.com>
2010-07-28 09:58:17 -04:00
Tejun Heo
181a51f6e0 slow-work: kill it
slow-work doesn't have any user left.  Kill it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: David Howells <dhowells@redhat.com>
2010-07-23 13:14:49 +02:00
Arnd Bergmann
5e3d20a68f init: Remove the BKL from startup code
I have shown by code review that no driver takes
the BKL at init time any more, so whatever the
init code was locking against is no longer there
and it is now safe to remove the BKL there.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Steven Rostedt <rostedt@goodmis>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-07-09 15:40:32 +02:00
Ingo Molnar
08f8ba0799 Merge commit 'v2.6.35-rc4' into perf/core
Merge reason: Pick up the latest perf fixes

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-07-05 08:30:58 +02:00
Linus Torvalds
123f94f22e Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Cure nr_iowait_cpu() users
  init: Fix comment
  init, sched: Fix race between init and kthreadd
2010-07-02 09:52:58 -07:00
Peter Zijlstra
9715856922 init: Fix comment
Apparently "pid-1" confuses people...

Requested-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: torvalds@linux-foundation.org
Cc: randy.dunlap@oracle.com
Cc: Ilya Loginov <isloginov@gmail.com>
LKML-Reference: <1277887031.1868.82.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-30 10:42:44 +02:00
Thomas Gleixner
f384c954c9 Merge branch 'linus' into perf/core
Reason: Further changes conflict with upstream fixes

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2010-06-28 22:33:24 +02:00
Peter Zijlstra
b433c3d454 init, sched: Fix race between init and kthreadd
Ilya reported that on a very slow machine he could reliably
reproduce a race between forking init and kthreadd. We first
fork init so that it  obtains pid-1, however since the scheduler
is already fully running at this point it can preempt and run
the init thread before we spawn and set kthreadd_task.

The init thread can then attempt spawning kthreads without
kthreadd being present which results in an OOPS.

Reported-by: Ilya Loginov <isloginov@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <1277736661.3561.110.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-28 18:21:30 +02:00
Tejun Heo
099a19d91c percpu: allow limited allocation before slab is online
This patch updates percpu allocator such that it can serve limited
amount of allocation before slab comes online.  This is primarily to
allow slab to depend on working percpu allocator.

Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how
much memory space and allocation map slots are reserved.  If this
reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation
will fail till slab comes online.

The following changes are made to implement early alloc.

* pcpu_mem_alloc() now checks slab_is_available()

* Chunks are allocated using pcpu_mem_alloc()

* Init paths make sure ai->dyn_size is at least as large as
  PERCPU_DYNAMIC_EARLY_SIZE.

* Initial alloc maps are allocated in __initdata and copied to
  kmalloc'd areas once slab is online.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Lameter <cl@linux-foundation.org>
2010-06-27 18:50:00 +02:00
Ingo Molnar
646b1db495 Merge commit 'v2.6.35-rc3' into perf/core
Merge reason: Go from -rc1 base to -rc3 base, merge in fixes.
2010-06-18 10:53:19 +02:00
Thomas Renninger
75cbfb97a1 ACPI: Do not try to set up acpi processor stuff on cores exceeding maxcpus=
Patch is against latest Linus master branch and is expected to be
safe bug fix.

You get:
ACPI: HARDWARE addr space,NOT supported yet
for each ACPI defined CPU which status is active, but exceeds
maxcpus= count.

As these "not booted" CPUs do not run an idle routine
and echo X >/proc/acpi/processor/*/throttling did not work
I couldn't find a way to really access not onlined/booted
machines. Still this should get fixed and
/proc/acpi/processor/X dirs of cores exceeding maxcpus
should not show up.

I wonder whether this could get cleaned up by truncating possible cpu mask
and nr_cpu_ids to setup_max_cpus early some day
(and not exporting setup_max_cpus anymore then).
But this needs touching of a lot other places...

Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: travis@sgi.com
CC: linux-acpi@vger.kernel.org
CC: lenb@kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
2010-06-09 18:04:12 -04:00
Li Zefan
039ca4e74a tracing: Remove kmemtrace ftrace plugin
We have been resisting new ftrace plugins and removing existing
ones, and kmemtrace has been superseded by kmem trace events
and perf-kmem, so we remove it.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
[ remove kmemtrace from the makefile, handle slob too ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-06-09 17:31:22 +02:00
Américo Wang
30dbb20e68 tracing: Remove boot tracer
The boot tracer is useless. It simply logs the initcalls
but in fact these initcalls are also logged through printk
while using the initcall_debug kernel parameter.

Nobody seem to be using it so far. Then just remove it.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Chase Douglas <chase.douglas@canonical.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <20100526105753.GA5677@cr0.nay.redhat.com>
[ remove the hooks in main.c, and the headers ]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-06-08 23:31:28 +02:00
Linus Torvalds
1f73897861 Merge branch 'for-35' of git://repo.or.cz/linux-kbuild
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits)
  kbuild: Revert part of e8d400a to resolve a conflict
  kbuild: Fix checking of scm-identifier variable
  gconfig: add support to show hidden options that have prompts
  menuconfig: add support to show hidden options which have prompts
  gconfig: remove show_debug option
  gconfig: remove dbg_print_ptype() and dbg_print_stype()
  kconfig: fix zconfdump()
  kconfig: some small fixes
  add random binaries to .gitignore
  kbuild: Include gen_initramfs_list.sh and the file list in the .d file
  kconfig: recalc symbol value before showing search results
  .gitignore: ignore *.lzo files
  headerdep: perlcritic warning
  scripts/Makefile.lib: Align the output of LZO
  kbuild: Generate modules.builtin in make modules_install
  Revert "kbuild: specify absolute paths for cscope"
  kbuild: Do not unnecessarily regenerate modules.builtin
  headers_install: use local file handles
  headers_check: fix perl warnings
  export_report: fix perl warnings
  ...
2010-06-01 08:55:52 -07:00
Haicheng Li
1f522509c7 mem-hotplug: avoid multiple zones sharing same boot strapping boot_pageset
For each new populated zone of hotadded node, need to update its pagesets
with dynamically allocated per_cpu_pageset struct for all possible CPUs:

    1) Detach zone->pageset from the shared boot_pageset
       at end of __build_all_zonelists().

    2) Use mutex to protect zone->pageset when it's still
       shared in onlined_pages()

Otherwises, multiple zones of different nodes would share same boot strapping
boot_pageset for same CPU, which will finally cause below kernel panic:

  ------------[ cut here ]------------
  kernel BUG at mm/page_alloc.c:1239!
  invalid opcode: 0000 [#1] SMP
  ...
  Call Trace:
   [<ffffffff811300c1>] __alloc_pages_nodemask+0x131/0x7b0
   [<ffffffff81162e67>] alloc_pages_current+0x87/0xd0
   [<ffffffff81128407>] __page_cache_alloc+0x67/0x70
   [<ffffffff811325f0>] __do_page_cache_readahead+0x120/0x260
   [<ffffffff81132751>] ra_submit+0x21/0x30
   [<ffffffff811329c6>] ondemand_readahead+0x166/0x2c0
   [<ffffffff81132ba0>] page_cache_async_readahead+0x80/0xa0
   [<ffffffff8112a0e4>] generic_file_aio_read+0x364/0x670
   [<ffffffff81266cfa>] nfs_file_read+0xca/0x130
   [<ffffffff8117b20a>] do_sync_read+0xfa/0x140
   [<ffffffff8117bf75>] vfs_read+0xb5/0x1a0
   [<ffffffff8117c151>] sys_read+0x51/0x80
   [<ffffffff8103c032>] system_call_fastpath+0x16/0x1b
  RIP  [<ffffffff8112ff13>] get_page_from_freelist+0x883/0x900
   RSP <ffff88000d1e78a8>
  ---[ end trace 4bda28328b9990db ]

[akpm@linux-foundation.org: merge fix]
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Andi Kleen <andi.kleen@intel.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-25 08:07:01 -07:00
Jens Axboe
ee9a3607fb Merge branch 'master' into for-2.6.35
Conflicts:
	fs/ext3/fsync.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-05-21 21:27:26 +02:00
Jason Wessel
0b4b3827db x86, kgdb, init: Add early and late debug states
The kernel debugger can operate well before mm_init(), but the x86
hardware breakpoint code which uses the perf api requires that the
kernel allocators are initialized.

This means the kernel debug core needs to provide an optional arch
specific call back to allow the initialization functions to run after
the kernel has been further initialized.

The kdb shell already had a similar restriction with an early
initialization and late initialization.  The kdb_init() was moved into
the debug core's version of the late init which is called
dbg_late_init();

CC: kgdb-bugreport@lists.sourceforge.net
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
2010-05-20 21:04:29 -05:00
Jason Wessel
67fc4e0cb9 kdb: core for kgdb back end (2 of 2)
This patch contains the hooks and instrumentation into kernel which
live outside the kernel/debug directory, which the kdb core
will call to run commands like lsmod, dmesg, bt etc...

CC: linux-arch@vger.kernel.org
Signed-off-by: Jason Wessel <jason.wessel@windriver.com>
Signed-off-by: Martin Hicks <mort@sgi.com>
2010-05-20 21:04:21 -05:00
Ingo Molnar
48652ced15 Merge commit 'v2.6.34-rc6' into sched/core 2010-05-07 11:27:54 +02:00
Jens Axboe
7407cf355f Merge branch 'master' into for-2.6.35
Conflicts:
	fs/block_dev.c

Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-29 09:36:24 +02:00
Vivek Goyal
afc24d49c1 blk-cgroup: config options re-arrangement
This patch fixes few usability and configurability issues.

o All the cgroup based controller options are configurable from
  "Genral Setup/Control Group Support/" menu. blkio is the only exception.
  Hence make this option visible in above menu and make it configurable from
  there to bring it inline with rest of the cgroup based controllers.

o Get rid of CONFIG_DEBUG_CFQ_IOSCHED.

  This option currently does two things.

  - Enable printing of cgroup paths in blktrace
  - Enables CONFIG_DEBUG_BLK_CGROUP, which in turn displays additional stat
    files in cgroup.

  If we are using group scheduling, blktrace data is of not really much use
  if cgroup information is not present. To get this data, currently one has to
  also enable CONFIG_DEBUG_CFQ_IOSCHED, which in turn brings the overhead of
  all the additional debug stat files which is not desired.

  Hence, this patch moves printing of cgroup paths under
  CONFIG_CFQ_GROUP_IOSCHED.

  This allows us to get rid of CONFIG_DEBUG_CFQ_IOSCHED completely. Now all
  the debug stat files are controlled only by CONFIG_DEBUG_BLK_CGROUP which
  can be enabled through config menu.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Divyesh Shah <dpshah@google.com>
Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2010-04-26 19:27:56 +02:00
Phillip Lougher
df37bd156d initramfs: handle unrecognised decompressor when unpacking
The unpack routine fails to handle the decompress_method() returning
unrecognised decompressor (compress_name == NULL).  This results in the
routine looping eventually oopsing on an out of bounds memory access.

Note this bug is usually hidden, only triggering on trailing junk after
one or more correct compressed blocks.  The case of the compressed archive
being complete junk is (by accident?) caught by the if (state != Reset)
check because state is initialised to Start, but not updated due to the
decompressor not having been called.  Obviously if the junk is trailing a
correctly decompressed buffer, state == Reset from the previous call to
the decompressor.

Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-24 11:31:26 -07:00