Age | Commit message (Collapse) | Author |
|
|
|
commit 207d543f472c1ac9552df79838dc807cbcaa9740 upstream.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Disable stuff which is known to have issues on RT
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
We can't deal with the cpumask allocations which happen in atomic
context (see arch/x86/kernel/apic/io_apic.c) on RT right now.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Restrict the preempt disabled regions to the actual floating point
operations and enable preemption for the administrative actions.
This is necessary on RT to avoid that kfree and other operations are
called with preemption disabled.
Reported-and-tested-by: Carsten Emde <cbe@osadl.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: stable-rt@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In fact, with migrate_disable() existing one could play games with
kmap_atomic. You could save/restore the kmap_atomic slots on context
switch (if there are any in use of course), this should be esp easy now
that we have a kmap_atomic stack.
Something like the below.. it wants replacing all the preempt_disable()
stuff with pagefault_disable() && migrate_disable() of course, but then
you can flip kmaps around like below.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
[dvhart@linux.intel.com: build fix]
Link: http://lkml.kernel.org/r/1311842631.5890.208.camel@twins
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Normally the x86-64 trap handlers for debug/int 3/stack fault run
on a special interrupt stack to make them more robust
when dealing with kernel code.
The PREEMPT_RT kernel can sleep in locks even while allocating
GFP_ATOMIC memory. When one of these trap handlers needs to send
real time signals for ptrace it allocates memory and could then
try to to schedule. But it is not allowed to schedule on a
IST stack. This can cause warnings and hangs.
This patch disables the IST stacks for these handlers for PREEMPT_RT
kernel. Instead let them run on the normal process stack.
The kernel only really needs the ISTs here to make kernel debuggers more
robust in case someone sets a break point somewhere where the stack is
invalid. But there are no kernel debuggers in the standard kernel
that do this.
It also means kprobes cannot be set in situations with invalid stack;
but that sounds like a reasonable restriction.
The stack fault change could minimally impact oops quality, but not very
much because stack faults are fairly rare.
A better solution would be to use similar logic as the NMI "paranoid"
path: check if signal is for user space, if yes go back to entry.S, switch stack,
call sync_regs, then do the signal sending etc.
But this patch is much simpler and should work too with minimal impact.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Simplifies the separation of anon_rw_semaphores and rw_semaphores for
-rt.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
CPU bringup calls into the random pool to initialize the stack
canary. During boot that works nicely even on RT as the might sleep
checks are disabled. During CPU hotplug the might sleep checks
trigger. Making the locks in random raw is a major PITA, so avoid the
call on RT is the only sensible solution. This is basically the same
randomness which we get during boot where the random pool has no
entropy and we rely on the TSC randomnness.
Reported-by: Carsten Emde <carsten.emde@osadl.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
mce_timer is started in atomic contexts of cpu bringup. This results
in might_sleep() warnings on RT. Convert mce_timer to a hrtimer to
avoid this.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Without this patch, ARM can not use SPLIT_PTLOCK_CPUS if
PREEMPT_RT_FULL=y because vectors_user_mapping() creates a
VM_ALWAYSDUMP mapping of the vector page (address 0xffff0000), but no
ptl->lock has been allocated for the page. An attempt to coredump
that page will result in a kernel NULL pointer dereference when
follow_page() attempts to lock the page.
The call tree to the NULL pointer dereference is:
do_notify_resume()
get_signal_to_deliver()
do_coredump()
elf_core_dump()
get_dump_page()
__get_user_pages()
follow_page()
pte_offset_map_lock() <----- a #define
...
rt_spin_lock()
The underlying problem is exposed by mm-shrink-the-page-frame-to-rt-size.patch.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Cc: Frank <Frank_Rowand@sonyusa.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/4E87C535.2030907@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Use the local_irq_*_nort() variants.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Convert xtime_lock to raw_seqlock and fix up all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Preemption must be disabled before enabling interrupts in do_trap
on x86_64 because the stack in use for int3 and debug is a per CPU
stack set by th IST. But 32bit does not have an IST and the stack
still belongs to the current task and there is no problem in scheduling
out the task.
Keep preemption enabled on X86_32 when enabling interrupts for
do_trap().
The name of the function is changed from preempt_conditional_sti/cli()
to conditional_sti/cli_ist(), to annotate that this function is used
when the stack is on the IST.
Cc: stable-rt@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
With threaded interrupts we might see an interrupt in progress on
migration. Do not unmask it when this is the case.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The machine might survive that problem and be at least in a state
which allows us to get more information about the problem.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Wrap the test for pagefault_disabled() into a helper, this allows us
to remove the need for current->pagefault_disabled on !-rt kernels.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-3yy517m8zsi9fpsf14xfaqkw@git.kernel.org
|
|
Necessary for decoupling pagefault disable from preempt count.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Setup and remove the interrupt handler in clock event mode selection.
This avoids calling the (shared) interrupt handler when the device is
not used.
Signed-off-by: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
On x86_64 we must disable preemption before we enable interrupts
for stack faults, int3 and debugging, because the current task is using
a per CPU debug stack defined by the IST. If we schedule out, another task
can come in and use the same stack and cause the stack to be corrupted
and crash the kernel on return.
When CONFIG_PREEMPT_RT_FULL is enabled, spin_locks become mutexes, and
one of these is the spin lock used in signal handling.
Some of the debug code (int3) causes do_trap() to send a signal.
This function calls a spin lock that has been converted to a mutex
and has the possibility to sleep. If this happens, the above issues with
the corrupted stack is possible.
Instead of calling the signal right away, for PREEMPT_RT and x86_64,
the signal information is stored on the stacks task_struct and
TIF_NOTIFY_RESUME is set. Then on exit of the trap, the signal resume
code will send the signal when preemption is enabled.
[ rostedt: Switched from #ifdef CONFIG_PREEMPT_RT_FULL to
ARCH_RT_DELAYS_SIGNAL_SEND and added comments to the code. ]
Cc: stable-rt@vger.kernel.org
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Coccinelle based conversion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The arm boot_lock is used by the secondary processor startup code. The locking
task is the idle thread, which has idle->sched_class == &idle_sched_class.
idle_sched_class->enqueue_task == NULL, so if the idle task blocks on the
lock, the attempt to wake it when the lock becomes available will fail:
try_to_wake_up()
...
activate_task()
enqueue_task()
p->sched_class->enqueue_task(rq, p, flags)
Fix by converting boot_lock to a raw spin lock.
Signed-off-by: Frank Rowand <frank.rowand@am.sony.com>
Link: http://lkml.kernel.org/r/4E77B952.3010606@am.sony.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
All timer interrupts and the perf interrupt are marked NO_THREAD, so
its safe to allow forced interrupt threading.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
PMU interrupt must not be threaded. Remove IRQF_DISABLED while at it
as we run all handlers with interrupts disabled anyway.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
All interrupts which must be non threaded are marked
IRQF_NO_THREAD. So it's safe to allow force threaded handlers.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
IPI handlers cannot be threaded. Remove the obsolete IRQF_DISABLED
flag (see commit e58aa3d2) while at it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Cascade handlers must run in hard interrupt context.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Cascade interrupt must run in hard interrupt context.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
MSI based per cpu timers lose interrupts when intel_idle() is enabled
- independent of the c-state. With idle=poll the problem cannot be
observed. We have no idea yet, whether this is a W510 specific issue
or a general chipset oddity. Blacklist the known problem machine.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The CONFIG_PREEMPT=n section of setup_singlestep() contains:
preempt_enable_no_resched();
That's bogus as it is asymetric - no preempt_disable() - and it just
never blew up because preempt_enable_no_resched() is a NOP when
CONFIG_PREEMPT=n. Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Interrupts notify the idle exit state before calling irq_enter(). But
the notifier code calls rcu_read_lock() and this is not allowed while
rcu is in an extended quiescent state. We need to wait for
rcu_irq_enter() to be called before doing so otherwise this results in
a grumpy RCU:
[ 0.099991] WARNING: at include/linux/rcupdate.h:194 __atomic_notifier_call_chain+0xd2/0x110()
[ 0.099991] Hardware name: AMD690VM-FMH
[ 0.099991] Modules linked in:
[ 0.099991] Pid: 0, comm: swapper Not tainted 3.0.0-rc6+ #255
[ 0.099991] Call Trace:
[ 0.099991] <IRQ> [<ffffffff81051c8a>] warn_slowpath_common+0x7a/0xb0
[ 0.099991] [<ffffffff81051cd5>] warn_slowpath_null+0x15/0x20
[ 0.099991] [<ffffffff817d6fa2>] __atomic_notifier_call_chain+0xd2/0x110
[ 0.099991] [<ffffffff817d6ff1>] atomic_notifier_call_chain+0x11/0x20
[ 0.099991] [<ffffffff81001873>] exit_idle+0x43/0x50
[ 0.099991] [<ffffffff81020439>] smp_apic_timer_interrupt+0x39/0xa0
[ 0.099991] [<ffffffff817da253>] apic_timer_interrupt+0x13/0x20
[ 0.099991] <EOI> [<ffffffff8100ae67>] ? default_idle+0xa7/0x350
[ 0.099991] [<ffffffff8100ae65>] ? default_idle+0xa5/0x350
[ 0.099991] [<ffffffff8100b19b>] amd_e400_idle+0x8b/0x110
[ 0.099991] [<ffffffff810cb01f>] ? rcu_enter_nohz+0x8f/0x160
[ 0.099991] [<ffffffff810019a0>] cpu_idle+0xb0/0x110
[ 0.099991] [<ffffffff817a7505>] rest_init+0xe5/0x140
[ 0.099991] [<ffffffff817a7468>] ? rest_init+0x48/0x140
[ 0.099991] [<ffffffff81cc5ca3>] start_kernel+0x3d1/0x3dc
[ 0.099991] [<ffffffff81cc5321>] x86_64_start_reservations+0x131/0x135
[ 0.099991] [<ffffffff81cc5412>] x86_64_start_kernel+0xed/0xf4
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20110929194047.GA10247@linux.vnet.ibm.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Henroid <andrew.d.henroid@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
commit 8ef5d844cc3a644ea6f7665932a4307e9fad01fa upstream.
following statement can only change device size from 8-bit(0) to 16-bit(1),
but not vice versa:
regval |= GPMC_CONFIG1_DEVICESIZE(wval);
so as this field has 1 reserved bit, that could be used in future,
just clear both bits and then OR with the desired value
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 8130b9d7b9d858aa04ce67805e8951e3cb6e9b2f upstream.
If we are context switched whilst copying into a thread's
vfp_hard_struct then the partial copy may be corrupted by the VFP
context switching code (see "ARM: vfp: flush thread hwstate before
restoring context from sigframe").
This patch updates the ptrace VFP set code so that the thread state is
flushed before the copy, therefore disabling VFP and preventing
corruption from occurring.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 247f4993a5974e6759606c4d380748eecfd273ff upstream.
In a preemptible kernel, vfp_set() can be preempted, causing the
hardware VFP context to be switched while the thread vfp state is
being read and modified. This leads to a race condition which can
cause the thread vfp state to become corrupted if lazy VFP context
save occurs due to preemption in between the time thread->vfpstate
is read and the time the modified state is written back.
This may occur if preemption occurs during the execution of a
ptrace() call which modifies the VFP register state of a thread.
Such instances should be very rare in most realistic scenarios --
none has been reported, so far as I am aware. Only uniprocessor
systems should be affected, since VFP context save is not currently
lazy in SMP kernels.
The problem was introduced by my earlier patch migrating to use
regsets to implement ptrace.
This patch does a vfp_sync_hwstate() before reading
thread->vfpstate, to make sure that the thread's VFP state is not
live in the hardware registers while the registers are modified.
Thanks to Will Deacon for spotting this.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2af276dfb1722e97b190bd2e646b079a2aa674db upstream.
Following execution of a signal handler, we currently restore the VFP
context from the ucontext in the signal frame. This involves copying
from the user stack into the current thread's vfp_hard_struct and then
flushing the new data out to the hardware registers.
This is problematic when using a preemptible kernel because we could be
context switched whilst updating the vfp_hard_struct. If the current
thread has made use of VFP since the last context switch, the VFP
notifier will copy from the hardware registers into the vfp_hard_struct,
overwriting any data that had been partially copied by the signal code.
Disabling preemption across copy_from_user calls is a terrible idea, so
instead we move the VFP thread flush *before* we update the
vfp_hard_struct. Since the flushing is performed lazily, this has the
effect of disabling VFP and clearing the CPU's VFP state pointer,
therefore preventing the thread from being updated with stale data on
the next context switch.
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2ab1159e80e8f416071e9f51e4f77b9173948296 upstream.
MMC_CAP_SD_HIGHSPEED is not supported on Snowball board resulting on
initialization errors.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Fredrik Soderstedt <fredrik.soderstedt@stericsson.com>
Signed-off-by: Philippe Langlais <philippe.langlais@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
|
|
[ Upstream commit d00a9dd21bdf7908b70866794c8313ee8a5abd5c ]
Several problems fixed in this patch :
1) Target of the conditional jump in case a divide by 0 is performed
by a bpf is wrong.
2) Must 'generate' the full function prologue/epilogue at pass=0,
or else we can stop too early in pass=1 if the proglen doesnt change.
(if the increase of prologue/epilogue equals decrease of all
instructions length because some jumps are converted to near jumps)
3) Change the wrong length detection at the end of code generation to
issue a more explicit message, no need for a full stack trace.
Reported-by: Phil Oester <kernel@linuxace.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|