From 1d9d8639c063caf6efc2447f5f26aa637f844ff6 Mon Sep 17 00:00:00 2001 From: Stephane Eranian Date: Fri, 15 Mar 2013 14:26:07 +0100 Subject: perf,x86: fix kernel crash with PEBS/BTS after suspend/resume This patch fixes a kernel crash when using precise sampling (PEBS) after a suspend/resume. Turns out the CPU notifier code is not invoked on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly by the kernel and keeps it power-on/resume value of 0 causing any PEBS measurement to crash when running on CPU0. The workaround is to add a hook in the actual resume code to restore the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0, the DS_AREA will be restored twice but this is harmless. Reported-by: Linus Torvalds Signed-off-by: Stephane Eranian Signed-off-by: Linus Torvalds --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 'arch/x86/kernel/cpu/perf_event_intel_ds.c') diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 826054a4f2e..0e9bdd3cb01 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -729,3 +729,11 @@ void intel_ds_init(void) } } } + +void perf_restore_debug_store(void) +{ + if (!x86_pmu.bts && !x86_pmu.pebs) + return; + + init_debug_store_on_cpu(smp_processor_id()); +} -- cgit v1.2.3-70-g09d2 From 2a6e06b2aed6995af401dcd4feb5e79a0c7ea554 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sun, 17 Mar 2013 15:44:43 -0700 Subject: perf,x86: fix wrmsr_on_cpu() warning on suspend/resume Commit 1d9d8639c063 ("perf,x86: fix kernel crash with PEBS/BTS after suspend/resume") fixed a crash when doing PEBS performance profiling after resuming, but in using init_debug_store_on_cpu() to restore the DS_AREA mtrr it also resulted in a new WARN_ON() triggering. init_debug_store_on_cpu() uses "wrmsr_on_cpu()", which in turn uses CPU cross-calls to do the MSR update. Which is not really valid at the early resume stage, and the warning is quite reasonable. Now, it all happens to _work_, for the simple reason that smp_call_function_single() ends up just doing the call directly on the CPU when the CPU number matches, but we really should just do the wrmsr() directly instead. This duplicates the wrmsr() logic, but hopefully we can just remove the wrmsr_on_cpu() version eventually. Reported-and-tested-by: Parag Warudkar Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds --- arch/x86/kernel/cpu/perf_event_intel_ds.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) (limited to 'arch/x86/kernel/cpu/perf_event_intel_ds.c') diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 0e9bdd3cb01..b05a575d56f 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c @@ -732,8 +732,10 @@ void intel_ds_init(void) void perf_restore_debug_store(void) { + struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds); + if (!x86_pmu.bts && !x86_pmu.pebs) return; - init_debug_store_on_cpu(smp_processor_id()); + wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds); } -- cgit v1.2.3-70-g09d2