linux-stable

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-01-17 02:36:21 +00:00

Author	SHA1	Message	Date
Peter Zijlstra	c2eb505b9b	MAINTAINERS, sched: Update file pattern Took a while to sort out these bits and we'd like to be Cc:-ed on future modifications to the waitqueue APIs and all that. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-0ix315c7qcz88slmnrpshvmf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:27 +02:00
Peter Zijlstra	35a2af94c7	sched/wait: Make the __wait_event() interface more friendly Change all __wait_event() implementations to match the corresponding wait_event() signature for convenience. In particular this does away with the weird 'ret' logic. Since there are __wait_event() users this requires we update them too. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092529.042563462@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:25 +02:00
Peter Zijlstra	ebdc195f2e	sched/wait: Collapse __wait_event_hrtimeout() While not a whole-sale replacement like the others we can still reduce the size of __wait_event_hrtimeout() considerably by noting that the actual core of __wait_event_hrtimeout() is identical to what ___wait_event() generates. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.972793648@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:22 +02:00
Peter Zijlstra	cf7361fd96	sched/wait: Collapse __wait_event_killable() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.898691966@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:21 +02:00
Peter Zijlstra	0d1e1c8a43	sched/wait: Collapse __wait_event_interruptible_tty() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.831085521@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:20 +02:00
Peter Zijlstra	a1dc6852ac	sched/wait: Collapse __wait_event_interruptible_lock_irq_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.759956109@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:20 +02:00
Peter Zijlstra	8fbd88fa17	sched/wait: Collapse __wait_event_interruptible_lock_irq() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.686006009@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:16:19 +02:00
Peter Zijlstra	13cb5042a4	sched/wait: Collapse __wait_event_lock_irq() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.612813379@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:50 +02:00
Peter Zijlstra	48c2521717	sched/wait: Collapse __wait_event_interruptible_exclusive() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.541716442@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:49 +02:00
Peter Zijlstra	c2ebb1fb4e	sched/wait: Collapse __wait_event_interruptible_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.469616907@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:49 +02:00
Peter Zijlstra	f13f4c41c9	sched/wait: Collapse __wait_event_interruptible() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.396949919@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:48 +02:00
Peter Zijlstra	ddc1994b82	sched/wait: Collapse __wait_event_timeout() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.325264677@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:47 +02:00
Peter Zijlstra	854267f438	sched/wait: Collapse __wait_event() Reduce macro complexity by using the new ___wait_event() helper. No change in behaviour, identical generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.254863348@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:46 +02:00
Peter Zijlstra	41a1431b17	sched/wait: Introduce ___wait_event() There's far too much duplication in the __wait_event macros; in order to fix this introduce ___wait_event() a macro with the capability to replace most other macros. With the previous patches changing the various __wait_event*() implementations to be more uniform; we can now collapse the lot without also changing generated code. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.181897111@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:46 +02:00
Peter Zijlstra	bb632bc449	sched/wait: Change the wait_exclusive control flow Purely a preparatory patch; it changes the control flow to match what will soon be generated by generic code so that that patch can be a unity transform. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.107994763@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:45 +02:00
Peter Zijlstra	2953ef246b	sched/wait: Change timeout logic Commit 4c663cf ("wait: fix false timeouts when using wait_event_timeout()") introduced an additional condition check after a timeout but there's a few issues; - it forgot one site - it put the check after the main loop; not at the actual timeout check. Cure both; by wrapping the condition (as suggested by Oleg), this avoids double evaluation of 'condition' which could be quite big. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092528.028892896@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:44 +02:00
Peter Zijlstra	2f2a2b60ad	sched/wait: Make the signal_pending() checks consistent There's two patterns to check signals in the __wait_event*() macros: if (!signal_pending(current)) { schedule(); continue; } ret = -ERESTARTSYS; break; And the more natural: if (signal_pending(current)) { ret = -ERESTARTSYS; break; } schedule(); Change them all into the latter form. Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131002092527.956416254@infradead.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-04 10:14:44 +02:00
Peter Zijlstra	75f93fed50	sched: Revert need_resched() to look at TIF_NEED_RESCHED Yuanhan reported a serious throughput regression in his pigz benchmark. Using the ftrace patch I found that several idle paths need more TLC before we can switch the generic need_resched() over to preempt_need_resched. The preemption paths benefit most from preempt_need_resched and do indeed use it; all other need_resched() users don't really care that much so reverting need_resched() back to tif_need_resched() is the simple and safe solution. Reported-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: lkp@linux.intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20130927153003.GF15690@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-28 10:04:47 +02:00
Peter Zijlstra	1a338ac32c	sched, x86: Optimize the preempt_schedule() call Remove the bloat of the C calling convention out of the preempt_enable() sites by creating an ASM wrapper which allows us to do an asm("call ___preempt_schedule") instead. calling.h bits by Andi Kleen Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-tk7xdi1cvvxewixzke8t8le1@git.kernel.org [ Fixed build error. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:23:07 +02:00
Peter Zijlstra	c2daa3bed5	sched, x86: Provide a per-cpu preempt_count implementation Convert x86 to use a per-cpu preemption count. The reason for doing so is that accessing per-cpu variables is a lot cheaper than accessing thread_info variables. We still need to save/restore the actual preemption count due to PREEMPT_ACTIVE so we place the per-cpu __preempt_count variable in the same cache-line as the other hot __switch_to() variables such as current_task. NOTE: this save/restore is required even for !PREEMPT kernels as cond_resched() also relies on preempt_count's PREEMPT_ACTIVE to ignore task_struct::state. Also rename thread_info::preempt_count to ensure nobody is 'accidentally' still poking at it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-gzn5rfsf8trgjoqx8hyayy3q@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:57 +02:00
Peter Zijlstra	a233f1120c	sched: Prepare for per-cpu preempt_count When using per-cpu preempt_count variables we need to save/restore the preempt_count on context switch (into per task storage; for instance the old thread_info::preempt_count variable) because of PREEMPT_ACTIVE. However, this means that on fork() the preempt_count value of the last context switch gets copied and if we had a PREEMPT_ACTIVE switch right before cloning a child task the child task will now too have PREEMPT_ACTIVE set and start its life with an extra PREEMPT_ACTIVE count. Therefore we need to make init_task_preempt_count() unconditional; this resets whatever preempt_count we inherited from our parent process. Doing so for !per-cpu implementations is harmless. For !PREEMPT_COUNT kernels we need to be careful not to start life with an increased preempt_count. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-4k0b7oy1rcdyzochwiixuwi9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:55 +02:00
Peter Zijlstra	bdb4380658	sched: Extract the basic add/sub preempt_count modifiers Rewrite the preempt_count macros in order to extract the 3 basic preempt_count value modifiers: __preempt_count_add() __preempt_count_sub() and the new: __preempt_count_dec_and_test() And since we're at it anyway, replace the unconventional $op_preempt_count names with the more conventional preempt_count_$op. Since these basic operators are equivalent to the previous _notrace() variants, do away with the _notrace() versions. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:54 +02:00
Peter Zijlstra	0102874755	sched: Create more preempt_count accessors We need a few special preempt_count accessors: - task_preempt_count() for when we're interested in the preemption count of another (non-running) task. - init_task_preempt_count() for properly initializing the preemption count. - init_idle_preempt_count() a special case of the above for the idle threads. With these no generic code ever touches thread_info::preempt_count anymore and architectures could choose to remove it. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-jf5swrio8l78j37d06fzmo4r@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:52 +02:00
Peter Zijlstra	a787870924	sched, arch: Create asm/preempt.h In order to prepare to per-arch implementations of preempt_count move the required bits into an asm-generic header and use this for all archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:50 +02:00
Peter Zijlstra	f27dde8dee	sched: Add NEED_RESCHED to the preempt_count In order to combine the preemption and need_resched test we need to fold the need_resched information into the preempt_count value. Since the NEED_RESCHED flag is set across CPUs this needs to be an atomic operation, however we very much want to avoid making preempt_count atomic, therefore we keep the existing TIF_NEED_RESCHED infrastructure in place but at 3 sites test it and fold its value into preempt_count; namely: - resched_task() when setting TIF_NEED_RESCHED on the current task - scheduler_ipi() when resched_task() sets TIF_NEED_RESCHED on a remote task it follows it up with a reschedule IPI and we can modify the cpu local preempt_count from there. - cpu_idle_loop() for when resched_task() found tsk_is_polling(). We use an inverted bitmask to indicate need_resched so that a 0 means both need_resched and !atomic. Also remove the barrier() in preempt_enable() between preempt_enable_no_resched() and preempt_check_resched() to avoid having to reload the preemption value and allow the compiler to use the flags of the previuos decrement. I couldn't come up with any sane reason for this barrier() to be there as preempt_enable_no_resched() already has a barrier() before doing the decrement. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-7a7m5qqbn5pmwnd4wko9u6da@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:49 +02:00
Peter Zijlstra	4a2b4b2227	sched: Introduce preempt_count accessor functions Replace the single preempt_count() 'function' that's an lvalue with two proper functions: preempt_count() - returns the preempt_count value as rvalue preempt_count_set() - Allows setting the preempt-count value Also provide preempt_count_ptr() as a convenience wrapper to implement all modifying operations. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-orxrbycjozopqfhb4dxdkdvb@git.kernel.org [ Fixed build failure. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 14:07:32 +02:00
Peter Zijlstra	ea81174789	sched, idle: Fix the idle polling state logic Mike reported that commit 7d1a9417 ("x86: Use generic idle loop") regressed several workloads and caused excessive reschedule interrupts. The patch in question failed to notice that the x86 code had an inverted sense of the polling state versus the new generic code (x86: default polling, generic: default !polling). Fix the two prominent x86 mwait based idle drivers and introduce a few new generic polling helpers (fixing the wrong smp_mb__after_clear_bit usage). Also switch the idle routines to using tif_need_resched() which is an immediate TIF_NEED_RESCHED test as opposed to need_resched which will end up being slightly different. Reported-by: Mike Galbraith <bitbucket@online.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: lenb@kernel.org Cc: tglx@linutronix.de Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:10 +02:00
Peter Zijlstra	3150398626	sched: Remove {set,clear}_need_resched Preemption semantics are going to change which mandate a change. All DRM usage sites are already broken and will not be affected (much) by this change. DRM people are aware and will remove the last few stragglers. For now, leave an empty stub that generates a warning, once all users are gone we can remove this. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: airlied@linux.ie Cc: daniel.vetter@ffwll.ch Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/n/tip-qfc1el2zvhxiyut4ai99ij4n@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:09 +02:00
Peter Zijlstra	b021fe3e25	sched, rcu: Make RCU use resched_cpu() We're going to deprecate and remove set_need_resched() for it will do the wrong thing. Make an exception for RCU and allow it to use resched_cpu() which will do the right thing. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-2eywnacjl1nllctl1nszqa5w@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:08 +02:00
Peter Zijlstra	0c44c2d0f4	x86: Use asm goto to implement better modify_and_test() functions Linus suggested using asm goto to get rid of the typical SETcc + TEST instruction pair -- which also clobbers an extra register -- for our typical modify_and_test() functions. Because asm goto doesn't allow output fields it has to include an unconditinal memory clobber when it changes a memory variable to force a reload. Luckily all atomic ops already imply a compiler barrier to go along with their memory barrier semantics. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-0mtn9siwbeo1d33bap1422se@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:53:08 +02:00
Michael S. Tsirkin	4314895165	sched: Micro-optimize by dropping unnecessary task_rq() calls We always know the rq used, let's just pass it around. This seems to cut the size of scheduler core down a tiny bit: Before: [linux]$ size kernel/sched/core.o.orig text data bss dec hex filename 62760 16130 3876 82766 1434e kernel/sched/core.o.orig After: [linux]$ size kernel/sched/core.o.patched text data bss dec hex filename 62566 16130 3876 82572 1428c kernel/sched/core.o.patched Probably speeds it up as well. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130922142054.GA11499@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 13:51:06 +02:00
Jason Low	f48627e686	sched/balancing: Periodically decay max cost of idle balance This patch builds on patch 2 and periodically decays that max value to do idle balancing per sched domain by approximately 1% per second. Also decay the rq's max_idle_balance_cost value. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:03:46 +02:00
Jason Low	9bd721c55c	sched/balancing: Consider max cost of idle balance per sched domain In this patch, we keep track of the max cost we spend doing idle load balancing for each sched domain. If the avg time the CPU remains idle is less then the time we have already spent on idle balancing + the max cost of idle balancing in the sched domain, then we don't continue to attempt the balance. We also keep a per rq variable, max_idle_balance_cost, which keeps track of the max time spent on newidle load balances throughout all its domains so that we can determine the avg_idle's max value. By using the max, we avoid overrunning the average. This further reduces the chance we attempt balancing when the CPU is not idle for longer than the cost to balance. Signed-off-by: Jason Low <jason.low2@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:03:44 +02:00
Jason Low	abfafa54db	sched: Reduce overestimating rq->avg_idle When updating avg_idle, if the delta exceeds some max value, then avg_idle gets set to the max, regardless of what the previous avg was. This can cause avg_idle to often be overestimated. This patch modifies the way we update avg_idle by always updating it with the function call to update_avg() first. Then, if avg_idle exceeds the max, we set it to the max. Signed-off-by: Jason Low <jason.low2@hp.com> Reviewed-by: Rik van Riel <riel@redhat.com> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379096813-3032-2-git-send-email-jason.low2@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:03:41 +02:00
Vladimir Davydov	7aff2e3a56	sched/balancing: Prevent the reselection of a previous env.dst_cpu if some tasks are pinned Currently new_dst_cpu is prevented from being reselected actually, not dst_cpu. This can result in attempting to pull tasks to this_cpu twice. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/281f59b6e596c718dd565ad267fc38f5b8e5c995.1379265590.git.vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:02:20 +02:00
Ingo Molnar	40a0c68ca9	Merge branch 'sched/urgent' into sched/core Merge in the latest fixes before applying a dependent patch. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 12:01:01 +02:00
Vladimir Davydov	7e3115ef51	sched/balancing: Fix cfs_rq->task_h_load calculation Patch a003a2 (sched: Consider runnable load average in move_tasks()) sets all top-level cfs_rqs' h_load to rq->avg.load_avg_contrib, which is always 0. This mistype leads to all tasks having weight 0 when load balancing in a cpu-cgroup enabled setup. There obviously should be sum of weights of all runnable tasks there instead. Fix it. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Reviewed-by: Paul Turner <pjt@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1379173186-11944-1-git-send-email-vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 11:59:39 +02:00
Vladimir Davydov	3029ede393	sched/balancing: Fix 'local->avg_load > busiest->avg_load' case in fix_small_imbalance() In busiest->group_imb case we can come to fix_small_imbalance() with local->avg_load > busiest->avg_load. This can result in wrong imbalance fix-up, because there is the following check there where all the members are unsigned: if (busiest->avg_load - local->avg_load + scaled_busy_load_per_task >= (scaled_busy_load_per_task * imbn)) { env->imbalance = busiest->load_per_task; return; } As a result we can end up constantly bouncing tasks from one cpu to another if there are pinned tasks. Fix it by substituting the subtraction with an equivalent addition in the check. [ The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. ] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/ef167822e5c5b2d96cf5b0e3e4f4bdff3f0414a2.1379252740.git.vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 11:59:38 +02:00
Vladimir Davydov	b18855500f	sched/balancing: Fix 'local->avg_load > sds->avg_load' case in calculate_imbalance() In busiest->group_imb case we can come to calculate_imbalance() with local->avg_load >= busiest->avg_load >= sds->avg_load. This can result in imbalance overflow, because it is calculated as follows env->imbalance = min( max_pull * busiest->group_power, (sds->avg_load - local->avg_load) * local->group_power) / SCHED_POWER_SCALE; As a result we can end up constantly bouncing tasks from one cpu to another if there are pinned tasks. Fix this by skipping the assignment and assuming imbalance=0 in case local->avg_load > sds->avg_load. [ The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. ] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/8f596cc6bc0e5e655119dc892c9bfcad26e971f4.1379252740.git.vdavydov@parallels.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-20 11:59:36 +02:00
Linus Torvalds	7e28b2712e	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixes" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix comment for sched_info_depart sched/Documentation: Update sched-design-CFS.txt documentation sched/debug: Take PID namespace into account sched/fair: Fix small race where child->se.parent,cfs_rq might point to invalid ones	2013-09-18 11:23:32 -05:00
Linus Torvalds	186844b292	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two small fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix UAPI export of PERF_EVENT_IOC_ID perf/x86/intel: Fix Silvermont offcore masks	2013-09-18 11:22:53 -05:00
Vince Weaver	a8e0108cac	perf: Fix UAPI export of PERF_EVENT_IOC_ID Without the following patch I have problems compiling code using the new PERF_EVENT_IOC_ID ioctl(). It looks like u64 was used instead of __u64 Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1309171450380.11444@vincent-weaver-1.um.maine.edu Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-18 11:29:07 +02:00
Linus Torvalds	62d228b8c6	Merge branch 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Gleb Natapov. * 'fixes' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: set "blocked by NMI" flag if EPT violation happens during IRET from NMI kvm: free resources after canceling async_pf KVM: nEPT: reset PDPTR register cache on nested vmentry emulation KVM: mmu: allow page tables to be in read-only slots KVM: x86 emulator: emulate RETF imm	2013-09-17 22:20:30 -04:00
Linus Torvalds	84fca9f38c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: "Fixes for CVE-2013-2897, CVE-2013-2895, CVE-2013-2897, CVE-2013-2894, CVE-2013-2893, CVE-2013-2891, CVE-2013-2890, CVE-2013-2889. All the bugs are triggerable only by specially crafted evil-on-purpose HW devices. Fixes by Kees Cook and Benjamin Tissoires" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: HID: lenovo-tpkbd: fix leak if tpkbd_probe_tp fails HID: multitouch: validate indexes details HID: logitech-dj: validate output report details HID: validate feature and input report details HID: lenovo-tpkbd: validate output report details HID: LG: validate HID output report details HID: steelseries: validate output report details HID: sony: validate HID output report details HID: zeroplus: validate output report details HID: provide a helper for validating hid reports	2013-09-17 21:54:05 -04:00
Oleg Nesterov	03e1261778	tty: disassociate_ctty() sends the extra SIGCONT Starting from v3.10 (probably commit f91e2590410b: "tty: Signal foreground group processes in hangup") disassociate_ctty() sends SIGCONT if tty && on_exit. This breaks LSB test-suite, in particular test8 in _exit.c and test40 in sigcon5.c. Put the "!on_exit" check back to restore the old behaviour. Review by Peter Hurley: "Yes, this regression was introduced by me in that commit. The effect of the regression is that ptys will receive a SIGCONT when, in similar circumstances, ttys would not. The fact that two test vectors accidentally tripped over this regression suggests that some other apps may as well. Thanks for catching this" Cc: stable@vger.kernel.org # v3.10+ Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Karel Srot <ksrot@redhat.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-17 21:53:32 -04:00
Gleb Natapov	0be9c7a89f	KVM: VMX: set "blocked by NMI" flag if EPT violation happens during IRET from NMI Set "blocked by NMI" flag if EPT violation happens during IRET from NMI otherwise NMI can be called recursively causing stack corruption. Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-09-17 19:09:47 +03:00
Linus Torvalds	de0bc3dfc3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile Pull more tile architecture updates from Chris Metcalf: "This second batch of changes is just cleanup of various kinds from doing some tidying work in the sources. Some dead code is removed, comment typos fixed, whitespace and style issues cleaned up, and some header updates from our internal "upstream" architecture team" * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: tile: remove stray blank space tile: <arch/> header updates from upstream tile: improve gxio iorpc autogenerated code style tile: double default VMALLOC space tile: remove stale arch/tile/kernel/futex_64.S tile: remove HUGE_VMAP dead code tile: use pmd_pfn() instead of casting via pte_t tile: fix typos in comment in arch/tile/kernel/unaligned.c	2013-09-17 11:40:49 -04:00
Radim Krčmář	28b441e240	kvm: free resources after canceling async_pf When we cancel 'async_pf_execute()', we should behave as if the work was never scheduled in 'kvm_setup_async_pf()'. Fixes a bug when we can't unload module because the vm wasn't destroyed. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-09-17 12:53:15 +03:00
Gleb Natapov	72f857950f	KVM: nEPT: reset PDPTR register cache on nested vmentry emulation After nested vmentry stale cache can be used to reload L2 PDPTR pointers which will cause L2 guest to fail. Fix it by invalidating cache on nested vmentry emulation. https://bugzilla.kernel.org/show_bug.cgi?id=60830 Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-09-17 12:52:42 +03:00
Paolo Bonzini	ba6a354154	KVM: mmu: allow page tables to be in read-only slots Page tables in a read-only memory slot will currently cause a triple fault because the page walker uses gfn_to_hva and it fails on such a slot. OVMF uses such a page table; however, real hardware seems to be fine with that as long as the accessed/dirty bits are set. Save whether the slot is readonly, and later check it when updating the accessed and dirty bits. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-09-17 12:52:31 +03:00

1 2 3 4 5 ...

399568 Commits