author | Igor Nabirushkin <inabirushkin@nvidia.com> | 2014-06-18 16:59:33 +0400 |
---|---|---|
committer | Mandar Padmawar <mpadmawar@nvidia.com> | 2014-06-19 07:33:41 -0700 |
commit | 7a42a1bed7ee65f6cd75bd5c3141de5ba6b9cf09 (patch) | |
tree | 6c5bd58e52cb39efff69dbc2944baddccb8094ca | |
parent | a92fb9d984b4f45c6cec187d23086d0af7abbfd9 (diff) |
misc: tegra-profiler: squashed update to ver. 1.75
commit f8c056c12c7b72290c47afadaf8b2f16336b3238
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Thu Jun 5 11:57:52 2014 +0400
misc: tegra-profiler: mixed backtraces
Unwinding: allow a single backtrace to switch between frames built with
frame pointers and frames covered by unwind tables.
Bug 1487488
Change-Id: I254a8fd762b5312f854db1fe79635a2b419091f0
Reviewed-on: http://git-master/r/419384
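The alternation behind mixed backtraces can be illustrated with a small user-space sketch. It follows the shape of the `get_user_callchain_mixed` loop in the backtrace.c diff on this page, but everything else is an assumption made for illustration: `struct chain`, `unwind_kind` and the 'T'/'F' frame markers are hypothetical stand-ins for the real unwinders, which walk actual stack memory.

```c
#include <stddef.h>

/*
 * Hypothetical model of a user stack: frames[i] is 'T' if frame i can only
 * be unwound with unwind tables, 'F' if only with frame pointers.
 */
struct chain {
	const char *frames;	/* synthetic stack description */
	size_t len;		/* number of frames on the stack */
	size_t nr;		/* frames unwound so far */
};

/* Consume consecutive frames of one kind, as a single unwinder would. */
static void unwind_kind(struct chain *cc, char kind)
{
	while (cc->nr < cc->len && cc->frames[cc->nr] == kind)
		cc->nr++;
}

/* Alternate between the two unwinders until neither makes progress. */
static size_t unwind_mixed(struct chain *cc)
{
	size_t nr_prev;

	do {
		nr_prev = cc->nr;

		unwind_kind(cc, 'T');		/* unwind tables first */
		if (nr_prev > 0 && cc->nr == nr_prev)
			break;			/* tables could not continue */

		nr_prev = cc->nr;

		unwind_kind(cc, 'F');		/* then frame pointers */
	} while (nr_prev != cc->nr);

	return cc->nr;
}
```

With a stack described as "TTFFT" the loop unwinds all five frames; a frame that neither method understands ends the walk at that point.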
commit a1d7f98fb15d4578cd140fe03a4c748e1db86c57
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Thu Jun 5 11:08:55 2014 +0400
misc: tegra-profiler: add sched samples
Tegra Profiler: capture the moment a task is scheduled on a core.
Add sched in/out samples.
Bug 1520808
Change-Id: I2c62e5c1918bdba0fc997d79d8aeb3b7b63530f0
Reviewed-on: http://git-master/r/419352
commit 6f847fd1257af28fc11b942a2f3b3dfc7eb4579f
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Thu Jun 5 09:52:29 2014 +0400
misc: tegra-profiler: use cntvct as time source
Tegra Profiler: use Virtual Count register (CNTVCT) as
time source.
Bug 1508327
Change-Id: If37e2dbe0a256ec28575d7c1b7d601d6bc1090f5
Reviewed-on: http://git-master/r/419305
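As a rough illustration of using the virtual counter as a clock, the sketch below reads CNTVCT on AArch64 and converts ticks to nanoseconds. It is not the driver's code: `read_cntvct` and `ticks_to_ns` are hypothetical helper names, user-space access to `cntvct_el0` is assumed to be enabled, and the counter frequency is passed in by the caller.

```c
#include <stdint.h>

#if defined(__aarch64__)
/* Read the virtual count register (assumes EL0 access is permitted). */
static inline uint64_t read_cntvct(void)
{
	uint64_t val;

	__asm__ __volatile__("mrs %0, cntvct_el0" : "=r" (val));
	return val;
}
#endif

/*
 * Convert counter ticks to nanoseconds. Splitting into whole seconds and a
 * remainder keeps the intermediate product within 64 bits for realistic
 * counter frequencies.
 */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	uint64_t sec = ticks / freq_hz;
	uint64_t rem = ticks % freq_hz;

	return sec * 1000000000ULL + rem * 1000000000ULL / freq_hz;
}
```

For example, with an assumed 19.2 MHz counter, 28,800,000 ticks correspond to 1.5 seconds.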
commit d79e4f5292dae4cccb510be2b47f4ee00baa53d7
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Thu Jun 5 09:10:47 2014 +0400
misc: tegra-profiler: get perfmon extension
Add version of the ARMv8 NVIDIA perfmon extension to
device capabilities.
Bug 1520757
Change-Id: I18d10133272a10e3faf5022b4579c7dfea78791e
Reviewed-on: http://git-master/r/419274
commit afedef10f26475b98b7d42ab3bab6f0c2fbb6eae
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Mon May 19 16:49:19 2014 +0400
misc: tegra-profiler: fix hang up bug for Norrin
Do not use probe_kernel_address: it is not safe on Norrin and
can lead to a system crash.
Bug 200005974
Bug 1522252
Change-Id: If8bae9afd7c7e1bbb5beaf430c0c61f552aeb036
Reviewed-on: http://git-master/r/411507
commit 1b4c5247c0ab284dbed25683cbfa5a301da787ff
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Fri May 16 12:49:15 2014 +0400
misc: tegra-profiler: add unwind information
Tegra Profiler: add additional unwind information
for each call entry.
Bug 1514626
Change-Id: I2873941a4c903e0e7e909897ead55eb34d80b966
Reviewed-on: http://git-master/r/410770
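In the backtrace.c diff on this page, the per-entry information is a 4-bit unwind type packed eight to a 32-bit word (`put_unw_type`). A user-space sketch of that packing follows; `put_type` mirrors the diff, while `get_type` and the macro names are hypothetical additions for illustration.

```c
#include <stdint.h>

#define MAX_STACK_DEPTH	64	/* mirrors QUADD_MAX_STACK_DEPTH */
#define TYPES_WORDS	(MAX_STACK_DEPTH * 4 / 32)	/* 8 entries per word */

/* Store a 4-bit type for call-chain entry idx. */
static void put_type(uint32_t *p, unsigned int idx, unsigned int type)
{
	unsigned int word = idx / 8;
	unsigned int shift = (idx % 8) * 4;

	p[word] &= ~(UINT32_C(0x0f) << shift);		/* clear old nibble */
	p[word] |= (uint32_t)(type & 0x0f) << shift;	/* set new nibble */
}

/* Hypothetical reader: fetch the 4-bit type of entry idx. */
static unsigned int get_type(const uint32_t *p, unsigned int idx)
{
	return (p[idx / 8] >> ((idx % 8) * 4)) & 0x0f;
}
```

A 64-entry call chain therefore needs eight extra 32-bit words, which matches the `DIV_ROUND_UP(bt_size, 8) * sizeof(u32)` term added to the sample length in comm.c.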
commit b2f593d9bb00a380d4402f2a8cd9ed8d9646dcbd
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Fri May 16 12:05:36 2014 +0400
misc: tegra-profiler: fixed recursive call chains
In some cases, recursive call chains can be broken.
This patch fixes this problem.
Bug 200005395
Change-Id: I7d31ec64b004109c3684cf0d143d9b1d6cd59f9f
Reviewed-on: http://git-master/r/410745
commit 6c9f626340a81daf124d4bbeff2254f63cc084b7
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Fri May 16 11:24:50 2014 +0400
misc: tegra-profiler: support too deep stack level
Handle call stacks that are too deep: a dedicated unwind reason code
has been added for this case.
Frame-pointer-based unwinding: add unwind reason codes.
Bug 200005380
Change-Id: I2199df90c746ada6a7f224a8b675638b69dc6da8
Reviewed-on: http://git-master/r/410717
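A minimal sketch of how a store routine can report why unwinding stopped, loosely modelled on `quadd_callchain_store` in the diff on this page. The enum values and the `struct callchain` layout are hypothetical (the real QUADD_URC_* definitions live in a header not shown here), and the address check is reduced to a simple non-zero test.

```c
#include <stdint.h>

#define MAX_DEPTH 64	/* mirrors QUADD_MAX_STACK_DEPTH in the diff */

/* Hypothetical reason codes; the real QUADD_URC_* values are not shown. */
enum urc {
	URC_SUCCESS = 0,
	URC_PC_INCORRECT,
	URC_LEVEL_TOO_DEEP,
};

struct callchain {
	int nr;			/* entries stored so far */
	enum urc rc;		/* why the last store failed, if it did */
	uintptr_t ip[MAX_DEPTH];
};

/* Returns 1 if the entry was stored, 0 otherwise (rc explains why). */
static int chain_store(struct callchain *cc, uintptr_t ip)
{
	if (ip == 0) {
		cc->rc = URC_PC_INCORRECT;
		return 0;
	}

	if (cc->nr >= MAX_DEPTH) {
		cc->rc = URC_LEVEL_TOO_DEEP;
		return 0;
	}

	cc->ip[cc->nr++] = ip;
	return 1;
}
```

A caller can then stop unwinding as soon as the store fails and pass the reason code along with the sample, instead of silently truncating the chain.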
commit ddea2fc86588bdf3ae313a270364052a0beab160
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Fri May 16 10:44:06 2014 +0400
misc: tegra-profiler: fix setup bug
* Fix a bug that occurs when using non-standard profiling frequencies
* Allow the root user to use any frequency in the range [100 Hz; 100 kHz]
Bug 200005366
Change-Id: I9a07e2c9c1fec6d61f34009d1975ea7f5d0e2592
Reviewed-on: http://git-master/r/410705
commit 5c64bcefc4b3df0ba9612cd67703593d488ab38c
Author: Deepak Nibade <dnibade@nvidia.com>
Date: Mon May 19 15:48:02 2014 +0530
misc: tegra-profiler: fix resource leaks
Fix resource leaks reported by Coverity
Coverity id : 26481
Coverity id : 26483
Bug 1416640
Change-Id: Ib71950f196b5421ccbc21b3ac8d620e790e83366
Reviewed-on: http://git-master/r/411421
commit 2f5d99b96ba18129f6c708e3db9a1e32da24816f
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Tue May 6 09:47:02 2014 +0400
tegra-profiler: add access to the exception tables
Tegra Profiler: add access to the exception tables via mmap areas.
Do not read them directly from user space.
Bug 200002243
Change-Id: I442daaecb11fd4416b3e485722efdf34234e0241
Reviewed-on: http://git-master/r/405671
commit 218d8cc8a573da49145c7104258fb290c83205b9
Author: Igor Nabirushkin <inabirushkin@nvidia.com>
Date: Thu Apr 17 13:02:07 2014 +0400
misc: tegra-profiler: unwinding: use RCU locking
Unwinding: use RCU locking instead of spinlocks to protect
the map of regions.
Bug 1502205
Change-Id: If1089b74b1f317eeaae5059de40d7a3365ae4061
Reviewed-on: http://git-master/r/397599
Change-Id: I1ac2a5a290f723cab40463932c0a814a670cf9e7
Signed-off-by: Igor Nabirushkin <inabirushkin@nvidia.com>
Reviewed-on: http://git-master/r/424787
GVS: Gerrit_Virtual_Submit
Tested-by: Daniel Horowitz <dhorowitz@nvidia.com>
Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
-rw-r--r-- | drivers/misc/tegra-profiler/arm_pmu.h | 12 |
-rw-r--r-- | drivers/misc/tegra-profiler/armv7_pmu.c | 34 |
-rw-r--r-- | drivers/misc/tegra-profiler/armv8_events.h | 7 |
-rw-r--r-- | drivers/misc/tegra-profiler/armv8_pmu.c | 61 |
-rw-r--r-- | drivers/misc/tegra-profiler/backtrace.c | 298 |
-rw-r--r-- | drivers/misc/tegra-profiler/backtrace.h | 14 |
-rw-r--r-- | drivers/misc/tegra-profiler/comm.c | 171 |
-rw-r--r-- | drivers/misc/tegra-profiler/comm.h | 16 |
-rw-r--r-- | drivers/misc/tegra-profiler/eh_unwind.c | 815 |
-rw-r--r-- | drivers/misc/tegra-profiler/eh_unwind.h | 10 |
-rw-r--r-- | drivers/misc/tegra-profiler/hrt.c | 100 |
-rw-r--r-- | drivers/misc/tegra-profiler/hrt.h | 7 |
-rw-r--r-- | drivers/misc/tegra-profiler/main.c | 36 |
-rw-r--r-- | drivers/misc/tegra-profiler/quadd.h | 2 |
-rw-r--r-- | drivers/misc/tegra-profiler/quadd_proc.c | 20 |
-rw-r--r-- | drivers/misc/tegra-profiler/version.h | 2 |
-rw-r--r-- | include/linux/tegra_profiler.h | 46 |
17 files changed, 1193 insertions, 458 deletions
diff --git a/drivers/misc/tegra-profiler/arm_pmu.h b/drivers/misc/tegra-profiler/arm_pmu.h
index b0d139a9488a..6071c469fe83 100644
--- a/drivers/misc/tegra-profiler/arm_pmu.h
+++ b/drivers/misc/tegra-profiler/arm_pmu.h
@@ -28,9 +28,17 @@ struct quadd_pmu_event_info {
 	struct list_head list;
 };
 
+#define QUADD_ARCH_NAME_MAX 64
+
+struct quadd_arch_info {
+	int type;
+	int ver;
+
+	char name[QUADD_ARCH_NAME_MAX];
+};
+
 struct quadd_pmu_ctx {
-	int arch;
-	char arch_name[64];
+	struct quadd_arch_info arch;
 
 	u32 counters_mask;
 
diff --git a/drivers/misc/tegra-profiler/armv7_pmu.c b/drivers/misc/tegra-profiler/armv7_pmu.c
index 97ccb65255b5..1962a3ea0ba2 100644
--- a/drivers/misc/tegra-profiler/armv7_pmu.c
+++ b/drivers/misc/tegra-profiler/armv7_pmu.c
@@ -301,8 +301,8 @@ static u32 armv7_pmu_adjust_value(u32 value, int event_id)
 	 * so currently we are devided by two
 	 */
 	if (pmu_ctx.l1_cache_rw &&
-	    (pmu_ctx.arch == QUADD_ARM_CPU_TYPE_CORTEX_A8 ||
-	    pmu_ctx.arch == QUADD_ARM_CPU_TYPE_CORTEX_A9) &&
+	    (pmu_ctx.arch.type == QUADD_ARM_CPU_TYPE_CORTEX_A8 ||
+	    pmu_ctx.arch.type == QUADD_ARM_CPU_TYPE_CORTEX_A9) &&
 	    (event_id == QUADD_EVENT_TYPE_L1_DCACHE_READ_MISSES ||
 	    event_id == QUADD_EVENT_TYPE_L1_DCACHE_WRITE_MISSES)) {
 		return value / 2;
@@ -722,6 +722,11 @@ static int get_current_events(int *events, int max_events)
 	return i;
 }
 
+static struct quadd_arch_info *get_arch(void)
+{
+	return &pmu_ctx.arch;
+}
+
 static struct quadd_event_source_interface pmu_armv7_int = {
 	.enable = pmu_enable,
 	.disable = pmu_disable,
@@ -737,6 +742,7 @@ static struct quadd_event_source_interface pmu_armv7_int = {
 	.set_events = set_events,
 	.get_supported_events = get_supported_events,
 	.get_current_events = get_current_events,
+	.get_arch = get_arch,
 };
 
 struct quadd_event_source_interface *quadd_armv7_pmu_init(void)
@@ -748,11 +754,18 @@ struct quadd_event_source_interface *quadd_armv7_pmu_init(void)
 	cpu_implementer = cpu_id >> 24;
 	part_number = cpu_id & 0xFFF0;
 
+	pmu_ctx.arch.type = QUADD_ARM_CPU_TYPE_UNKNOWN;
+	pmu_ctx.arch.ver = 0;
+	strncpy(pmu_ctx.arch.name, "Unknown",
+		sizeof(pmu_ctx.arch.name));
+
 	if (cpu_implementer == ARM_CPU_IMP_ARM) {
 		switch (part_number) {
 		case ARM_CPU_PART_CORTEX_A9:
-			pmu_ctx.arch = QUADD_ARM_CPU_TYPE_CORTEX_A9;
-			strcpy(pmu_ctx.arch_name, "Cortex A9");
+			pmu_ctx.arch.type = QUADD_ARM_CPU_TYPE_CORTEX_A9;
+			strncpy(pmu_ctx.arch.name, "Cortex A9",
+				sizeof(pmu_ctx.arch.name));
+
 			pmu_ctx.counters_mask =
 				QUADD_ARMV7_COUNTERS_MASK_CORTEX_A9;
 			pmu_ctx.current_map = quadd_armv7_a9_events_map;
@@ -760,8 +773,10 @@ struct quadd_event_source_interface *quadd_armv7_pmu_init(void)
 			break;
 
 		case ARM_CPU_PART_CORTEX_A15:
-			pmu_ctx.arch = QUADD_ARM_CPU_TYPE_CORTEX_A15;
-			strcpy(pmu_ctx.arch_name, "Cortex A15");
+			pmu_ctx.arch.type = QUADD_ARM_CPU_TYPE_CORTEX_A15;
+			strncpy(pmu_ctx.arch.name, "Cortex A15",
+				sizeof(pmu_ctx.arch.name));
+
 			pmu_ctx.counters_mask =
 				QUADD_ARMV7_COUNTERS_MASK_CORTEX_A15;
 			pmu_ctx.current_map = quadd_armv7_a15_events_map;
@@ -769,8 +784,7 @@ struct quadd_event_source_interface *quadd_armv7_pmu_init(void)
 			break;
 
 		default:
-			pmu_ctx.arch = QUADD_ARM_CPU_TYPE_UNKNOWN;
-			strcpy(pmu_ctx.arch_name, "Unknown");
+			pmu_ctx.arch.type = QUADD_ARM_CPU_TYPE_UNKNOWN;
 			pmu_ctx.current_map = NULL;
 			break;
 		}
@@ -778,7 +792,9 @@ struct quadd_event_source_interface *quadd_armv7_pmu_init(void)
 
 	INIT_LIST_HEAD(&pmu_ctx.used_events);
 
-	pr_info("arch: %s\n", pmu_ctx.arch_name);
+	pmu_ctx.arch.name[sizeof(pmu_ctx.arch.name) - 1] = '\0';
+	pr_info("arch: %s, type: %d, ver: %d\n",
+		pmu_ctx.arch.name, pmu_ctx.arch.type, pmu_ctx.arch.ver);
 
 	return pmu;
 }
diff --git a/drivers/misc/tegra-profiler/armv8_events.h b/drivers/misc/tegra-profiler/armv8_events.h
index e5fcf080c910..1d675ddddabf 100644
--- a/drivers/misc/tegra-profiler/armv8_events.h
+++ b/drivers/misc/tegra-profiler/armv8_events.h
@@ -52,11 +52,11 @@ enum {
 #define QUADD_ARMV8_PMCR_LC (1 << 6)
 
 /* Number of event counters */
-#define QUADD_ARMV8_PMCR_N_SHIFT 16
+#define QUADD_ARMV8_PMCR_N_SHIFT 11
 #define QUADD_ARMV8_PMCR_N_MASK 0x1f
 
 /* Identification code */
-#define QUADD_ARMV8_PMCR_IDCODE_SHIFT 11
+#define QUADD_ARMV8_PMCR_IDCODE_SHIFT 16
 #define QUADD_ARMV8_PMCR_IDCODE_MASK 0xff
 
 /* Implementer code */
@@ -81,6 +81,9 @@ enum {
 
 #define QUADD_ARMV8_COUNTERS_MASK_PMUV3 0x3f
 
+#define QUADD_ARMV8_PMU_NVEXT_SHIFT 4
+#define QUADD_ARMV8_PMU_NVEXT_MASK 0x0f
+
 /*
  * ARMv8 PMUv3 Performance Events handling code.
  * Common event types.
diff --git a/drivers/misc/tegra-profiler/armv8_pmu.c b/drivers/misc/tegra-profiler/armv8_pmu.c
index 9bc8eb232b62..7a4ffc17079a 100644
--- a/drivers/misc/tegra-profiler/armv8_pmu.c
+++ b/drivers/misc/tegra-profiler/armv8_pmu.c
@@ -219,9 +219,15 @@ armv8_pmu_pmovsclr_write(int idx)
 	asm volatile("msr pmovsclr_el0, %0" : : "r" (BIT(idx)));
 }
 
-/*********************************************************************/
-
+static inline u32
+armv8_id_afr0_el1_read(void)
+{
+	u32 val;
 
+	/* Read Auxiliary Feature Register 0 */
+	asm volatile("mrs %0, id_afr0_el1" : "=r" (val));
+	return val;
+}
 
 static void enable_counter(int idx)
 {
@@ -710,7 +716,10 @@ static int get_current_events(int *events, int max_events)
 	return i;
 }
 
-/*********************************************************************/
+static struct quadd_arch_info *get_arch(void)
+{
+	return &pmu_ctx.arch;
+}
 
 static struct quadd_event_source_interface pmu_armv8_int = {
 	.enable = pmu_enable,
@@ -727,6 +736,7 @@ static struct quadd_event_source_interface pmu_armv8_int = {
 	.set_events = set_events,
 	.get_supported_events = get_supported_events,
 	.get_current_events = get_current_events,
+	.get_arch = get_arch,
 };
 
 struct quadd_event_source_interface *quadd_armv8_pmu_init(void)
@@ -737,11 +747,16 @@ struct quadd_event_source_interface *quadd_armv8_pmu_init(void)
 	u64 aa64_dfr = read_cpuid(ID_AA64DFR0_EL1);
 	aa64_dfr = (aa64_dfr >> 8) & 0x0f;
 
-	pmu_ctx.arch = QUADD_AA64_CPU_TYPE_UNKNOWN;
+	strncpy(pmu_ctx.arch.name, "Unknown", sizeof(pmu_ctx.arch.name));
+	pmu_ctx.arch.type = QUADD_AA64_CPU_TYPE_UNKNOWN;
+	pmu_ctx.arch.ver = 0;
 
 	switch (aa64_dfr) {
 	case QUADD_AA64_PMUVER_PMUV3:
-		strcpy(pmu_ctx.arch_name, "AA64 PmuV3");
+		strncpy(pmu_ctx.arch.name, "AA64 PmuV3",
+			sizeof(pmu_ctx.arch.name));
+		pmu_ctx.arch.name[sizeof(pmu_ctx.arch.name) - 1] = '\0';
+
 		pmu_ctx.counters_mask =
 			QUADD_ARMV8_COUNTERS_MASK_PMUV3;
 		pmu_ctx.current_map = quadd_armv8_pmuv3_events_map;
@@ -755,19 +770,35 @@ struct quadd_event_source_interface *quadd_armv8_pmu_init(void)
 		pr_info("imp: %#x, idcode: %#x\n", imp, idcode);
 
 		if (imp == ARM_CPU_IMP_ARM) {
-			strcat(pmu_ctx.arch_name, " ARM");
+			strncat(pmu_ctx.arch.name, " ARM",
+				sizeof(pmu_ctx.arch.name) -
+				strlen(pmu_ctx.arch.name));
+			pmu_ctx.arch.name[sizeof(pmu_ctx.arch.name) - 1] = '\0';
+
 			if (idcode == QUADD_AA64_CPU_IDCODE_CORTEX_A57) {
-				pmu_ctx.arch = QUADD_AA64_CPU_TYPE_CORTEX_A57;
-				strcat(pmu_ctx.arch_name, " CORTEX_A57");
+				pmu_ctx.arch.type =
+					QUADD_AA64_CPU_TYPE_CORTEX_A57;
+				strncat(pmu_ctx.arch.name, " CORTEX_A57",
+					sizeof(pmu_ctx.arch.name) -
+					strlen(pmu_ctx.arch.name));
 			} else {
-				pmu_ctx.arch = QUADD_AA64_CPU_TYPE_ARM;
+				pmu_ctx.arch.type = QUADD_AA64_CPU_TYPE_ARM;
 			}
 		} else if (imp == QUADD_AA64_CPU_IMP_NVIDIA) {
-			strcat(pmu_ctx.arch_name, " Nvidia");
-			pmu_ctx.arch = QUADD_AA64_CPU_TYPE_DENVER;
+			u32 ext_ver = armv8_id_afr0_el1_read();
+			ext_ver = (ext_ver >> QUADD_ARMV8_PMU_NVEXT_SHIFT) &
+				QUADD_ARMV8_PMU_NVEXT_MASK;
+
+			strncat(pmu_ctx.arch.name, " NVIDIA (Denver)",
+				sizeof(pmu_ctx.arch.name) -
+				strlen(pmu_ctx.arch.name));
+			pmu_ctx.arch.type = QUADD_AA64_CPU_TYPE_DENVER;
+			pmu_ctx.arch.ver = ext_ver;
 		} else {
-			strcat(pmu_ctx.arch_name, " Unknown");
-			pmu_ctx.arch = QUADD_AA64_CPU_TYPE_UNKNOWN_IMP;
+			strncat(pmu_ctx.arch.name, " Unknown implementor code",
+				sizeof(pmu_ctx.arch.name) -
+				strlen(pmu_ctx.arch.name));
+			pmu_ctx.arch.type = QUADD_AA64_CPU_TYPE_UNKNOWN_IMP;
 		}
 
 		pmu = &pmu_armv8_int;
@@ -780,7 +811,9 @@ struct quadd_event_source_interface *quadd_armv8_pmu_init(void)
 
 	INIT_LIST_HEAD(&pmu_ctx.used_events);
 
-	pr_info("arch: %s\n", pmu_ctx.arch_name);
+	pmu_ctx.arch.name[sizeof(pmu_ctx.arch.name) - 1] = '\0';
+	pr_info("arch: %s, type: %d, ver: %d\n",
+		pmu_ctx.arch.name, pmu_ctx.arch.type, pmu_ctx.arch.ver);
 
 	return pmu;
 }
diff --git a/drivers/misc/tegra-profiler/backtrace.c b/drivers/misc/tegra-profiler/backtrace.c
index d2039a5827c7..f16cdabc8f77 100644
--- a/drivers/misc/tegra-profiler/backtrace.c
+++ b/drivers/misc/tegra-profiler/backtrace.c
@@ -26,8 +26,6 @@
 #include "backtrace.h"
 #include "eh_unwind.h"
 
-#define QUADD_USER_SPACE_MIN_ADDR 0x8000
-
 static inline int
 is_thumb_mode(struct pt_regs *regs)
 {
@@ -77,34 +75,60 @@ quadd_user_link_register(struct pt_regs *regs)
 #endif
 }
 
+static inline void
+put_unw_type(u32 *p, int bt_idx, unsigned int type)
+{
+	int word_idx, shift;
+
+	word_idx = bt_idx / 8;
+	shift = (bt_idx % 8) * 4;
+
+	*(p + word_idx) &= ~(0x0f << shift);
+	*(p + word_idx) |= (type & 0x0f) << shift;
+}
+
 int
 quadd_callchain_store(struct quadd_callchain *cc,
-		      unsigned long ip)
+		      unsigned long ip, unsigned int type)
 {
-	if (ip && cc->nr < QUADD_MAX_STACK_DEPTH) {
-		if (cc->cs_64)
-			cc->ip_64[cc->nr++] = ip;
-		else
-			cc->ip_32[cc->nr++] = ip;
+	if (!validate_pc_addr(ip, sizeof(unsigned long))) {
+		cc->unw_rc = QUADD_URC_PC_INCORRECT;
+		return 0;
+	}
 
-		return 1;
+	if (cc->nr >= QUADD_MAX_STACK_DEPTH) {
+		cc->unw_rc = QUADD_URC_LEVEL_TOO_DEEP;
+		return 0;
 	}
-	return 0;
+
+	put_unw_type(cc->types, cc->nr, type);
+
+	if (cc->cs_64)
+		cc->ip_64[cc->nr++] = ip;
+	else
+		cc->ip_32[cc->nr++] = ip;
+
+	return 1;
 }
 
 static unsigned long __user *
-user_backtrace(unsigned long __user *tail,
+user_backtrace(struct pt_regs *regs,
+	       unsigned long __user *tail,
 	       struct quadd_callchain *cc,
-	       struct vm_area_struct *stack_vma)
+	       struct vm_area_struct *stack_vma,
+	       struct task_struct *task)
 {
+	int nr_added;
 	unsigned long value, value_lr = 0, value_fp = 0;
 	unsigned long __user *fp_prev = NULL;
 
 	if (!is_vma_addr((unsigned long)tail, stack_vma, sizeof(*tail)))
 		return NULL;
 
-	if (__copy_from_user_inatomic(&value, tail, sizeof(unsigned long)))
+	if (__copy_from_user_inatomic(&value, tail, sizeof(unsigned long))) {
+		cc->unw_rc = QUADD_URC_EACCESS;
 		return NULL;
+	}
 
 	if (is_vma_addr(value, stack_vma, sizeof(value))) {
 		/* gcc thumb/clang frame */
@@ -115,26 +139,39 @@ user_backtrace(unsigned long __user *tail,
 			return NULL;
 
 		if (__copy_from_user_inatomic(&value_lr, tail + 1,
-					      sizeof(value_lr)))
+					      sizeof(value_lr))) {
+			cc->unw_rc = QUADD_URC_EACCESS;
 			return NULL;
+		}
+
+		cc->curr_fp = value_fp;
+		cc->curr_sp = (unsigned long)tail + sizeof(value_fp) * 2;
+		cc->curr_pc = value_lr;
 	} else {
 		/* gcc arm frame */
 		if (__copy_from_user_inatomic(&value_fp, tail - 1,
-					      sizeof(value_fp)))
+					      sizeof(value_fp))) {
+			cc->unw_rc = QUADD_URC_EACCESS;
 			return NULL;
+		}
+
+		cc->curr_fp = value_fp;
+		cc->curr_sp = (unsigned long)tail + sizeof(value_fp);
+		cc->curr_pc = value_lr = value;
 
 		if (!is_vma_addr(value_fp, stack_vma, sizeof(value_fp)))
 			return NULL;
-
-		value_lr = value;
 	}
 
 	fp_prev = (unsigned long __user *)value_fp;
 
-	if (value_lr < QUADD_USER_SPACE_MIN_ADDR)
+	nr_added = quadd_callchain_store(cc, value_lr, QUADD_UNW_TYPE_FP);
+	if (nr_added == 0)
 		return NULL;
 
-	quadd_callchain_store(cc, value_lr);
+	if (cc->unw_method == QUADD_UNW_METHOD_MIXED &&
+	    quadd_is_ex_entry_exist(regs, value_lr, task))
+		return NULL;
 
 	if (fp_prev <= tail)
 		return NULL;
@@ -148,15 +185,17 @@ get_user_callchain_fp(struct pt_regs *regs,
 		      struct task_struct *task)
 {
 	unsigned long fp, sp, pc, reg;
-	struct vm_area_struct *vma, *vma_pc;
+	struct vm_area_struct *vma, *vma_pc = NULL;
 	unsigned long __user *tail = NULL;
 	struct mm_struct *mm = task->mm;
 
 	cc->nr = 0;
-	cc->unw_method = QUADD_UNW_METHOD_FP;
+	cc->unw_rc = QUADD_URC_FP_INCORRECT;
 
-	if (!regs || !mm)
+	if (!regs || !mm) {
+		cc->unw_rc = QUADD_URC_FAILURE;
 		return 0;
+	}
 
 	sp = quadd_user_stack_pointer(regs);
 	pc = instruction_pointer(regs);
@@ -166,19 +205,24 @@ get_user_callchain_fp(struct pt_regs *regs,
 		return 0;
 
 	vma = find_vma(mm, sp);
-	if (!vma)
+	if (!vma) {
+		cc->unw_rc = QUADD_URC_SP_INCORRECT;
 		return 0;
+	}
 
 	if (!is_vma_addr(fp, vma, sizeof(fp)))
 		return 0;
 
 	if (probe_kernel_address(fp, reg)) {
-		pr_warn_once("frame error: sp/fp: %#lx/%#lx, pc/lr: %#lx/%#lx, vma: %#lx-%#lx\n",
-			     sp, fp, pc, quadd_user_link_register(regs),
-			     vma->vm_start, vma->vm_end);
+		pr_warn_once("%s: failed for address: %#lx\n", __func__, fp);
+		cc->unw_rc = QUADD_URC_EACCESS;
 		return 0;
 	}
 
+	pr_debug("sp/fp: %#lx/%#lx, pc/lr: %#lx/%#lx, *fp: %#lx, stack: %#lx-%#lx\n",
+		 sp, fp, pc, quadd_user_link_register(regs), reg,
+		 vma->vm_start, vma->vm_end);
+
 	if (is_thumb_mode(regs)) {
 		if (reg <= fp || !is_vma_addr(reg, vma, sizeof(reg)))
 			return 0;
@@ -191,8 +235,10 @@ get_user_callchain_fp(struct pt_regs *regs,
 			if (__copy_from_user_inatomic(
 					&value,
 					(unsigned long __user *)fp + 1,
-					sizeof(unsigned long)))
+					sizeof(unsigned long))) {
+				cc->unw_rc = QUADD_URC_EACCESS;
 				return 0;
+			}
 
 			vma_pc = find_vma(mm, pc);
 			read_lr = 1;
@@ -200,12 +246,14 @@ get_user_callchain_fp(struct pt_regs *regs,
 
 		if (!read_lr || !is_vma_addr(value, vma_pc, sizeof(value))) {
 			/* gcc: fp --> short frame tail (fp) */
+			int nr_added;
 			unsigned long lr = quadd_user_link_register(regs);
 
-			if (lr < QUADD_USER_SPACE_MIN_ADDR)
-				return 0;
+			nr_added = quadd_callchain_store(cc, lr,
+							 QUADD_UNW_TYPE_LR_FP);
+			if (nr_added == 0)
+				return cc->nr;
 
-			quadd_callchain_store(cc, lr);
 			tail = (unsigned long __user *)reg;
 		}
 	}
@@ -214,48 +262,60 @@ get_user_callchain_fp(struct pt_regs *regs,
 		tail = (unsigned long __user *)fp;
 
 	while (tail && !((unsigned long)tail & 0x3))
-		tail = user_backtrace(tail, cc, vma);
+		tail = user_backtrace(regs, tail, cc, vma, task);
 
 	return cc->nr;
 }
 
 static unsigned int
-__user_backtrace(struct quadd_callchain *cc, struct task_struct *task)
+__user_backtrace(struct pt_regs *regs,
+		 struct quadd_callchain *cc,
+		 struct task_struct *task)
 {
 	struct mm_struct *mm = task->mm;
 	struct vm_area_struct *vma;
 	unsigned long __user *tail;
 
-	if (!mm)
-		goto out;
+	cc->unw_rc = QUADD_URC_FP_INCORRECT;
+
+	if (!mm) {
+		cc->unw_rc = QUADD_URC_FAILURE;
+		return cc->nr;
+	}
 
 	vma = find_vma(mm, cc->curr_sp);
-	if (!vma)
-		goto out;
+	if (!vma) {
+		cc->unw_rc = QUADD_URC_SP_INCORRECT;
+		return cc->nr;
+	}
 
 	tail = (unsigned long __user *)cc->curr_fp;
 
 	while (tail && !((unsigned long)tail & 0x3))
-		tail = user_backtrace(tail, cc, vma);
+		tail = user_backtrace(regs, tail, cc, vma, task);
 
-out:
 	return cc->nr;
 }
 
 #ifdef CONFIG_ARM64
 static u32 __user *
-user_backtrace_compat(u32 __user *tail,
-		      struct quadd_callchain *cc,
-		      struct vm_area_struct *stack_vma)
+user_backtrace_compat(struct pt_regs *regs,
+		      u32 __user *tail,
+		      struct quadd_callchain *cc,
+		      struct vm_area_struct *stack_vma,
+		      struct task_struct *task)
 {
+	int nr_added;
 	u32 value, value_lr = 0, value_fp = 0;
 	u32 __user *fp_prev = NULL;
 
 	if (!is_vma_addr((unsigned long)tail, stack_vma, sizeof(*tail)))
 		return NULL;
 
-	if (__copy_from_user_inatomic(&value, tail, sizeof(value)))
+	if (__copy_from_user_inatomic(&value, tail, sizeof(value))) {
+		cc->unw_rc = QUADD_URC_EACCESS;
 		return NULL;
+	}
 
 	if (is_vma_addr(value, stack_vma, sizeof(value))) {
 		/* gcc thumb/clang frame */
@@ -266,26 +326,39 @@ user_backtrace_compat(u32 __user *tail,
 			return NULL;
 
 		if (__copy_from_user_inatomic(&value_lr, tail + 1,
-					      sizeof(value_lr)))
+					      sizeof(value_lr))) {
+			cc->unw_rc = QUADD_URC_EACCESS;
 			return NULL;
+		}
+
+		cc->curr_fp = value_fp;
+		cc->curr_sp = (unsigned long)tail + sizeof(value_fp) * 2;
+		cc->curr_pc = value_lr;
 	} else {
 		/* gcc arm frame */
 		if (__copy_from_user_inatomic(&value_fp, tail - 1,
-					      sizeof(value_fp)))
+					      sizeof(value_fp))) {
+			cc->unw_rc = QUADD_URC_EACCESS;
 			return NULL;
+		}
+
+		cc->curr_fp = value_fp;
+		cc->curr_sp = (unsigned long)tail + sizeof(value_fp);
+		cc->curr_pc = value_lr = value;
 
 		if (!is_vma_addr(value_fp, stack_vma, sizeof(value_fp)))
 			return NULL;
-
-		value_lr = value;
 	}
 
 	fp_prev = (u32 __user *)(unsigned long)value_fp;
 
-	if (value_lr < QUADD_USER_SPACE_MIN_ADDR)
+	nr_added = quadd_callchain_store(cc, value_lr, QUADD_UNW_TYPE_FP);
+	if (nr_added == 0)
 		return NULL;
 
-	quadd_callchain_store(cc, value_lr);
+	if (cc->unw_method == QUADD_UNW_METHOD_MIXED &&
+	    quadd_is_ex_entry_exist(regs, value_lr, task))
+		return NULL;
 
 	if (fp_prev <= tail)
 		return NULL;
@@ -299,14 +372,17 @@ get_user_callchain_fp_compat(struct pt_regs *regs,
 			     struct task_struct *task)
 {
 	u32 fp, sp, pc, reg;
-	struct vm_area_struct *vma, *vma_pc;
+	struct vm_area_struct *vma, *vma_pc = NULL;
 	u32 __user *tail = NULL;
 	struct mm_struct *mm = task->mm;
 
 	cc->nr = 0;
+	cc->unw_rc = QUADD_URC_FP_INCORRECT;
 
-	if (!regs || !mm)
+	if (!regs || !mm) {
+		cc->unw_rc = QUADD_URC_FAILURE;
 		return 0;
+	}
 
 	sp = quadd_user_stack_pointer(regs);
 	pc = instruction_pointer(regs);
@@ -316,19 +392,24 @@ get_user_callchain_fp_compat(struct pt_regs *regs,
 		return 0;
 
 	vma = find_vma(mm, sp);
-	if (!vma)
+	if (!vma) {
+		cc->unw_rc = QUADD_URC_SP_INCORRECT;
 		return 0;
+	}
 
 	if (!is_vma_addr(fp, vma, sizeof(fp)))
 		return 0;
 
 	if (probe_kernel_address((unsigned long)fp, reg)) {
-		pr_warn_once("frame error: sp/fp: %#x/%#x, pc/lr: %#x/%#x, vma: %#lx-%#lx\n",
-			     sp, fp, pc, (u32)quadd_user_link_register(regs),
-			     vma->vm_start, vma->vm_end);
+		pr_warn_once("%s: failed for address: %#x\n", __func__, fp);
+		cc->unw_rc = QUADD_URC_EACCESS;
 		return 0;
 	}
 
+	pr_debug("sp/fp: %#x/%#x, pc/lr: %#x/%#x, *fp: %#x, stack: %#lx-%#lx\n",
+		 sp, fp, pc, (u32)quadd_user_link_register(regs), reg,
+		 vma->vm_start, vma->vm_end);
+
 	if (is_thumb_mode(regs)) {
 		if (reg <= fp || !is_vma_addr(reg, vma, sizeof(reg)))
 			return 0;
@@ -341,8 +422,10 @@ get_user_callchain_fp_compat(struct pt_regs *regs,
 			if (__copy_from_user_inatomic(
 					&value,
 					(u32 __user *)(fp + sizeof(u32)),
-					sizeof(value)))
+					sizeof(value))) {
+				cc->unw_rc = QUADD_URC_EACCESS;
 				return 0;
+			}
 
 			vma_pc = find_vma(mm, pc);
 			read_lr = 1;
@@ -350,12 +433,14 @@ get_user_callchain_fp_compat(struct pt_regs *regs,
 
 		if (!read_lr || !is_vma_addr(value, vma_pc, sizeof(value))) {
 			/* gcc: fp --> short frame tail (fp) */
+			int nr_added;
 			u32 lr = quadd_user_link_register(regs);
 
-			if (lr < QUADD_USER_SPACE_MIN_ADDR)
-				return 0;
+			nr_added = quadd_callchain_store(cc, lr,
+							 QUADD_UNW_TYPE_LR_FP);
+			if (nr_added == 0)
+				return cc->nr;
 
-			quadd_callchain_store(cc, lr);
 			tail = (u32 __user *)(unsigned long)reg;
 		}
 	}
@@ -364,31 +449,38 @@ get_user_callchain_fp_compat(struct pt_regs *regs,
 		tail = (u32 __user *)(unsigned long)fp;
 
 	while (tail && !((unsigned long)tail & 0x3))
-		tail = user_backtrace_compat(tail, cc, vma);
+		tail = user_backtrace_compat(regs, tail, cc, vma, task);
 
 	return cc->nr;
 }
 
 static unsigned int
-__user_backtrace_compat(struct quadd_callchain *cc, struct task_struct *task)
+__user_backtrace_compat(struct pt_regs *regs,
+			struct quadd_callchain *cc,
+			struct task_struct *task)
 {
 	struct mm_struct *mm = task->mm;
 	struct vm_area_struct *vma;
 	u32 __user *tail;
 
-	if (!mm)
-		goto out;
+	cc->unw_rc = QUADD_URC_FP_INCORRECT;
+
+	if (!mm) {
+		cc->unw_rc = QUADD_URC_FAILURE;
+		return cc->nr;
+	}
 
 	vma = find_vma(mm, cc->curr_sp);
-	if (!vma)
-		goto out;
+	if (!vma) {
+		cc->unw_rc = QUADD_URC_SP_INCORRECT;
+		return cc->nr;
+	}
 
 	tail = (u32 __user *)cc->curr_fp;
 
 	while (tail && !((unsigned long)tail & 0x3))
-		tail = user_backtrace_compat(tail, cc, vma);
+		tail = user_backtrace_compat(regs, tail, cc, vma, task);
 
-out:
 	return cc->nr;
 }
 
@@ -400,47 +492,69 @@ __get_user_callchain_fp(struct pt_regs *regs,
 			struct task_struct *task)
 {
 	if (cc->nr > 0) {
-		int nr, nr_prev = cc->nr;
+		if (cc->unw_rc == QUADD_URC_LEVEL_TOO_DEEP)
+			return cc->nr;
+
 #ifdef CONFIG_ARM64
 		if (compat_user_mode(regs))
-			nr = __user_backtrace_compat(cc, task);
+			__user_backtrace_compat(regs, cc, task);
 		else
-			nr = __user_backtrace(cc, task);
+			__user_backtrace(regs, cc, task);
 #else
-		nr = __user_backtrace(cc, task);
+		__user_backtrace(regs, cc, task);
 #endif
-		if (nr != nr_prev)
-			cc->unw_method = QUADD_UNW_METHOD_MIXED;
 
-		return nr;
+		return cc->nr;
 	}
 
-	cc->unw_method = QUADD_UNW_METHOD_FP;
-
 #ifdef CONFIG_ARM64
 	if (compat_user_mode(regs))
 		return get_user_callchain_fp_compat(regs, cc, task);
 #endif
+
 	return get_user_callchain_fp(regs, cc, task);
 }
 
+static unsigned int
+get_user_callchain_mixed(struct pt_regs *regs,
+			 struct quadd_callchain *cc,
+			 struct task_struct *task)
+{
+	int nr_prev;
+
+	do {
+		nr_prev = cc->nr;
+
+		quadd_get_user_callchain_ut(regs, cc, task);
+		if (nr_prev > 0 && cc->nr == nr_prev)
+			break;
+
+		nr_prev = cc->nr;
+
+		__get_user_callchain_fp(regs, cc, task);
+	} while (nr_prev != cc->nr);
+
+	return cc->nr;
+}
+
 unsigned int
 quadd_get_user_callchain(struct pt_regs *regs,
 			 struct quadd_callchain *cc,
 			 struct quadd_ctx *ctx,
 			 struct task_struct *task)
 {
-	int unw_fp, unw_eht, unw_mix, nr = 0;
-	unsigned int extra;
-	struct quadd_parameters *param = &ctx->param;
+	unsigned int method = cc->unw_method;
 
 	cc->nr = 0;
 
-	if (!regs)
+	if (!regs) {
+		cc->unw_rc = QUADD_URC_FAILURE;
 		return 0;
+	}
 
 	cc->curr_sp = 0;
 	cc->curr_fp = 0;
+	cc->curr_pc = 0;
 
 #ifdef CONFIG_ARM64
 	cc->cs_64 = compat_user_mode(regs) ? 0 : 1;
@@ -448,21 +562,25 @@ quadd_get_user_callchain(struct pt_regs *regs,
 	cc->cs_64 = 0;
 #endif
 
-	extra = param->reserved[QUADD_PARAM_IDX_EXTRA];
+	cc->unw_rc = 0;
 
-	unw_fp = extra & QUADD_PARAM_EXTRA_BT_FP;
-	unw_eht = extra & QUADD_PARAM_EXTRA_BT_UNWIND_TABLES;
-	unw_mix = extra & QUADD_PARAM_EXTRA_BT_MIXED;
+	switch (method) {
+	case QUADD_UNW_METHOD_FP:
+		__get_user_callchain_fp(regs, cc, task);
+		break;
 
-	cc->unw_rc = 0;
+	case QUADD_UNW_METHOD_EHT:
+		quadd_get_user_callchain_ut(regs, cc, task);
+		break;
 
-	if (unw_eht)
-		nr = quadd_get_user_callchain_ut(regs, cc, task);
+	case QUADD_UNW_METHOD_MIXED:
+		get_user_callchain_mixed(regs, cc, task);
+		break;
 
-	if (unw_fp) {
-		if (!nr || unw_mix)
-			nr = __get_user_callchain_fp(regs, cc, task);
+	case QUADD_UNW_METHOD_NONE:
+	default:
+		break;
 	}
 
-	return nr;
+	return cc->nr;
 }
diff --git a/drivers/misc/tegra-profiler/backtrace.h b/drivers/misc/tegra-profiler/backtrace.h
index 47fad098427d..abf28ebdacf6 100644
--- a/drivers/misc/tegra-profiler/backtrace.h
+++ b/drivers/misc/tegra-profiler/backtrace.h
@@ -18,9 +18,13 @@
 #define __QUADD_BACKTRACE_H
 
 #include <linux/mm.h>
+#include <linux/bitops.h>
 
 #define QUADD_MAX_STACK_DEPTH 64
 
+#define QUADD_UNW_TYPES_SIZE \
+	DIV_ROUND_UP(QUADD_MAX_STACK_DEPTH * 4, sizeof(u32) * BITS_PER_BYTE)
+
 struct quadd_callchain {
 	int nr;
 
@@ -29,6 +33,8 @@ struct quadd_callchain {
 		u64 ip_64[QUADD_MAX_STACK_DEPTH];
 	};
 
+	u32 types[QUADD_UNW_TYPES_SIZE];
+
 	int cs_64;
 
 	unsigned int unw_method;
@@ -36,6 +42,7 @@ struct quadd_callchain {
 
 	unsigned long curr_sp;
 	unsigned long curr_fp;
+	unsigned long curr_pc;
 };
 
 struct quadd_ctx;
@@ -49,7 +56,7 @@ quadd_get_user_callchain(struct pt_regs *regs,
 
 int
 quadd_callchain_store(struct quadd_callchain *cc,
-		      unsigned long ip);
+		      unsigned long ip, unsigned int type);
 
 unsigned long
 quadd_user_stack_pointer(struct pt_regs *regs);
@@ -66,5 +73,10 @@ is_vma_addr(unsigned long addr, struct vm_area_struct *vma,
 		addr < vma->vm_end - nbytes;
 }
 
+static inline int
+validate_pc_addr(unsigned long addr, unsigned long nbytes)
+{
+	return addr && addr < TASK_SIZE - nbytes;
+}
 
 #endif /* __QUADD_BACKTRACE_H */
diff --git a/drivers/misc/tegra-profiler/comm.c b/drivers/misc/tegra-profiler/comm.c
index a50ddeed30b0..17f5c3c5a681 100644
--- a/drivers/misc/tegra-profiler/comm.c
+++ b/drivers/misc/tegra-profiler/comm.c
@@ -24,6 +24,7 @@
 #include <linux/poll.h>
 #include <linux/bitops.h>
 #include <linux/err.h>
+#include <linux/mm.h>
 
 #include <asm/uaccess.h>
 
@@ -253,7 +254,7 @@ static ssize_t read_sample(char __user *buffer, size_t max_length)
 {
 	u32 sed;
 	unsigned int type;
-	int retval = -EIO, ip_size;
+	int retval = -EIO, ip_size, bt_size;
 	int was_read = 0, write_offset = 0;
 	unsigned long flags;
 	struct quadd_ring_buffer *rb = &comm_ctx.rb;
@@ -293,7 +294,12 @@ static ssize_t read_sample(char __user *buffer, size_t max_length)
 		ip_size = (sed & QUADD_SED_IP64) ?
 			sizeof(u64) : sizeof(u32);
 
-		length_extra = sample->callchain_nr * ip_size;
+		bt_size = sample->callchain_nr;
+
+		length_extra = bt_size * ip_size;
+
+		if (bt_size > 0)
+			length_extra += DIV_ROUND_UP(bt_size, 8) * sizeof(u32);
 
 		nr_events = __sw_hweight32(sample->events_flags);
 		length_extra += nr_events * sizeof(u32);
@@ -332,6 +338,10 @@ static ssize_t read_sample(char __user *buffer, size_t max_length)
 		length_extra = record.additional_sample.extra_length;
 		break;
 
+	case QUADD_RECORD_TYPE_SCHED:
+		length_extra = 0;
+		break;
+
 	default:
 		goto out;
 	}
@@ -435,6 +445,20 @@ static int check_access_permission(void)
 	return 0;
 }
 
+static struct quadd_extabs_mmap *
+find_mmap(unsigned long vm_start)
+{
+	struct quadd_extabs_mmap *entry;
+
+	list_for_each_entry(entry, &comm_ctx.ext_mmaps, list) {
+		struct vm_area_struct *mmap_vma = entry->mmap_vma;
+		if (vm_start == mmap_vma->vm_start)
+			return entry;
+	}
+
+	return NULL;
+}
+
 static int device_open(struct inode *inode, struct file *file)
 {
 	mutex_lock(&comm_ctx.io_mutex);
@@ -528,12 +552,14 @@ device_ioctl(struct file *file,
 	     unsigned long ioctl_param)
 {
 	int err = 0;
+	unsigned long flags;
+	u64 *mmap_vm_start;
+	struct quadd_extabs_mmap *mmap;
 	struct quadd_parameters *user_params;
 	struct quadd_comm_cap cap;
 	struct quadd_module_state state;
 	struct quadd_module_version versions;
 	struct quadd_extables extabs;
-	unsigned long flags;
 	struct quadd_ring_buffer *rb = &comm_ctx.rb;
 
 	if (ioctl_num != IOCTL_SETUP &&
@@ -684,7 +710,20 @@ device_ioctl(struct file *file,
 			goto error_out;
 		}
 
-		err = comm_ctx.control->set_extab(&extabs);
+		mmap_vm_start = (u64 *)
+			&extabs.reserved[QUADD_EXT_IDX_MMAP_VM_START];
+
+		spin_lock(&comm_ctx.mmaps_lock);
+		mmap = find_mmap((unsigned long)*mmap_vm_start);
+		if (!mmap) {
+			pr_err("%s: error: mmap is not found\n", __func__);
+			err = -ENXIO;
+			spin_unlock(&comm_ctx.mmaps_lock);
+			goto error_out;
+		}
+
+		err = comm_ctx.control->set_extab(&extabs, mmap);
+		spin_unlock(&comm_ctx.mmaps_lock);
 		if (err) {
 			pr_err("error: set_extab\n");
 			goto error_out;
@@ -695,6 +734,7 @@ device_ioctl(struct file *file,
 		pr_err("error: ioctl %u is unsupported in this version of module\n",
 		       ioctl_num);
 		err = -EFAULT;
+		goto error_out;
 	}
 
 error_out:
@@ -702,6 +742,124 @@ error_out:
 	return err;
 }
 
+static void
+delete_mmap(struct quadd_extabs_mmap *mmap)
+{
+	struct quadd_extabs_mmap *entry, *next;
+
+	list_for_each_entry_safe(entry, next, &comm_ctx.ext_mmaps, list) {
+		if (entry == mmap) {
+			list_del(&entry->list);
+			vfree(entry->data);
+			kfree(entry);
+			break;
+		}
+	}
+}
+
+static void mmap_open(struct vm_area_struct *vma)
+{
+}
+
+static void mmap_close(struct vm_area_struct *vma)
+{
+	struct quadd_extabs_mmap *mmap;
+
+	pr_debug("mmap_close: vma: %#lx - %#lx\n",
+		 vma->vm_start, vma->vm_end);
+
+	spin_lock(&comm_ctx.mmaps_lock);
+
+	mmap = find_mmap(vma->vm_start);
+	if (!mmap) {
+		pr_err("%s: error: mmap is not found\n", __func__);
+		goto out;
+	}
+
+	comm_ctx.control->delete_mmap(mmap);
+	delete_mmap(mmap);
+
+out:
+	spin_unlock(&comm_ctx.mmaps_lock);
+}
+
+static int mmap_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	void *data;
+	struct quadd_extabs_mmap *mmap;
+	unsigned long offset = vmf->pgoff << PAGE_SHIFT;
+
+	pr_debug("mmap_fault: vma: %#lx - %#lx, pgoff: %#lx, vaddr: %p\n",
+		 vma->vm_start, vma->vm_end, vmf->pgoff, vmf->virtual_address);
+
+	spin_lock(&comm_ctx.mmaps_lock);
+
+	mmap = find_mmap(vma->vm_start);
+	if (!mmap) {
+		spin_unlock(&comm_ctx.mmaps_lock);
+		return VM_FAULT_SIGBUS;
+	}
+
+	data = mmap->data;
+
+	vmf->page = vmalloc_to_page(data + offset);
+	get_page(vmf->page);
+
+	spin_unlock(&comm_ctx.mmaps_lock);
+	return 0;
+}
+
+static struct vm_operations_struct mmap_vm_ops = {
+	.open = mmap_open,
+	.close = mmap_close,
+	.fault = mmap_fault,
+};
+
+static int
+device_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	unsigned long vma_size, nr_pages;
+	struct quadd_extabs_mmap *entry;
+
+	pr_debug("mmap: vma: %#lx - %#lx, pgoff: %#lx\n",
+		 vma->vm_start, vma->vm_end, vma->vm_pgoff);
+
+	if (vma->vm_pgoff != 0)
+		return -EINVAL;
+
+	vma->vm_private_data = filp->private_data;
+
+	vma_size = vma->vm_end - vma->vm_start;
+	nr_pages = vma_size / PAGE_SIZE;
+
+	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return -ENOMEM;
+
+	entry->mmap_vma = vma;
+
+	INIT_LIST_HEAD(&entry->list);
+	INIT_LIST_HEAD(&entry->ex_entries);
+
+	entry->data = vmalloc_user(nr_pages * PAGE_SIZE);
+	if (!entry->data) {
+		pr_err("%s: error: vmalloc_user", __func__);
+		kfree(entry);
+		return -ENOMEM;
+	}
+
+	spin_lock(&comm_ctx.mmaps_lock);
+	list_add_tail(&entry->list, &comm_ctx.ext_mmaps);
+	spin_unlock(&comm_ctx.mmaps_lock);
+
+	vma->vm_ops = &mmap_vm_ops;
+	vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND | VM_DONTDUMP;
+
+	vma->vm_ops->open(vma);
+
+	return 0;
+}
+
 static void unregister(void)
 {
 	misc_deregister(comm_ctx.misc_dev);
@@ -720,6 +878,7 @@ static const struct file_operations qm_fops = {
 	.release = device_release,
 	.unlocked_ioctl = device_ioctl,
 	.compat_ioctl = device_ioctl,
+	.mmap = device_mmap,
 };
 
 static int comm_init(void)
@@ -740,6 +899,7 @@ static int comm_init(void)
 	res = misc_register(misc_dev);
 	if (res < 0) {
 		pr_err("Error: misc_register: %d\n", res);
+		kfree(misc_dev);
 		return res;
 	}
 	comm_ctx.misc_dev = misc_dev;
@@ -753,6 +913,9 @@ static int comm_init(void)

 	init_waitqueue_head(&comm_ctx.read_wait);

+	INIT_LIST_HEAD(&comm_ctx.ext_mmaps);
+	spin_lock_init(&comm_ctx.mmaps_lock);
+
 	return 0;
 }

diff --git a/drivers/misc/tegra-profiler/comm.h b/drivers/misc/tegra-profiler/comm.h
index a72b1d1d37dc..da49d4a34864 100644
--- a/drivers/misc/tegra-profiler/comm.h
+++ b/drivers/misc/tegra-profiler/comm.h
@@ -25,6 +25,7 @@ struct quadd_module_state;
 struct miscdevice;
 struct quadd_parameters;
 struct quadd_extables;
+struct quadd_unwind_ctx;

 struct quadd_ring_buffer {
 	char *buf;
@@ -42,6 +43,14 @@ struct quadd_iovec {
 	size_t len;
 };

+struct quadd_extabs_mmap {
+	struct vm_area_struct *mmap_vma;
+	void *data;
+
+	struct list_head list;
+	struct list_head ex_entries;
+};
+
 struct quadd_comm_control_interface {
 	int (*start)(void);
 	void (*stop)(void);
@@ -49,7 +58,9 @@ struct quadd_comm_control_interface {
 			      uid_t *debug_app_uid);
 	void (*get_capabilities)(struct quadd_comm_cap *cap);
 	void (*get_state)(struct quadd_module_state *state);
-	int (*set_extab)(struct quadd_extables *extabs);
+	int (*set_extab)(struct quadd_extables *extabs,
+			 struct quadd_extabs_mmap *mmap);
+	void (*delete_mmap)(struct quadd_extabs_mmap *mmap);
 };

 struct quadd_comm_data_interface {
@@ -77,6 +88,9 @@ struct quadd_comm_ctx {
 	wait_queue_head_t read_wait;

 	struct miscdevice *misc_dev;
+
+	struct list_head ext_mmaps;
+	spinlock_t mmaps_lock;
 };

 struct quadd_comm_data_interface *
diff --git a/drivers/misc/tegra-profiler/eh_unwind.c b/drivers/misc/tegra-profiler/eh_unwind.c
index dc0c47b8a9c6..ae3b0d0dd195 100644
--- a/drivers/misc/tegra-profiler/eh_unwind.c
+++ b/drivers/misc/tegra-profiler/eh_unwind.c
@@ -21,11 +21,13 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/err.h>
+#include <linux/rcupdate.h>
#include <linux/tegra_profiler.h> #include "eh_unwind.h" #include "backtrace.h" +#include "comm.h" #define QUADD_EXTABS_SIZE 0x100 @@ -44,11 +46,13 @@ enum regs { struct extab_info { unsigned long addr; unsigned long length; + + unsigned long mmap_offset; }; struct extables { - struct extab_info exidx; struct extab_info extab; + struct extab_info exidx; }; struct ex_region_info { @@ -56,18 +60,25 @@ struct ex_region_info { unsigned long vm_end; struct extables tabs; + struct quadd_extabs_mmap *mmap; + + struct list_head list; }; -struct quadd_unwind_ctx { - struct ex_region_info *regions; - unsigned long ri_nr; - unsigned long ri_size; +struct regions_data { + struct ex_region_info *entries; - pid_t pid; + unsigned long curr_nr; + unsigned long size; + + struct rcu_head rcu; +}; - unsigned long pinned_pages; - unsigned long pinned_size; +struct quadd_unwind_ctx { + struct regions_data *rd; + pid_t pid; + unsigned long ex_tables_size; spinlock_t lock; }; @@ -111,26 +122,135 @@ validate_stack_addr(unsigned long addr, } static inline int -validate_pc_addr(unsigned long addr, unsigned long nbytes) +validate_mmap_addr(struct quadd_extabs_mmap *mmap, + unsigned long addr, unsigned long nbytes) { - return addr && addr < TASK_SIZE - nbytes; + struct vm_area_struct *vma = mmap->mmap_vma; + unsigned long size = vma->vm_end - vma->vm_start; + unsigned long data = (unsigned long)mmap->data; + + if (addr & 0x03) { + pr_err_once("%s: error: unaligned address: %#lx, data: %#lx-%#lx, vma: %#lx-%#lx\n", + __func__, addr, data, data + size, + vma->vm_start, vma->vm_end); + return 0; + } + + if (addr < data || addr >= data + (size - nbytes)) { + pr_err_once("%s: error: addr: %#lx, data: %#lx-%#lx, vma: %#lx-%#lx\n", + __func__, addr, data, data + size, + vma->vm_start, vma->vm_end); + return 0; + } + + return 1; } -#define read_user_data(addr, retval) \ -({ \ - long ret; \ - ret = probe_kernel_address(addr, retval); \ - if (ret) \ - ret = -QUADD_URC_EACCESS; \ - ret; \ +/* + * TBD: 
why probe_kernel_address() can lead to random crashes + * on 64-bit kernel, and replacing it to __get_user() fixed the issue. + */ +#define read_user_data(addr, retval) \ +({ \ + int ret; \ + \ + pagefault_disable(); \ + ret = __get_user(retval, addr); \ + pagefault_enable(); \ + \ + if (ret) { \ + pr_debug("%s: failed for address: %p\n", \ + __func__, addr); \ + ret = -QUADD_URC_EACCESS; \ + } \ + \ + ret; \ }) +static inline long +read_mmap_data(struct quadd_extabs_mmap *mmap, const u32 *addr, u32 *retval) +{ + if (!validate_mmap_addr(mmap, (unsigned long)addr, sizeof(u32))) + return -QUADD_URC_EACCESS; + + *retval = *addr; + return 0; +} + +static inline unsigned long +ex_addr_to_mmap_addr(unsigned long addr, + struct ex_region_info *ri, + int exidx) +{ + unsigned long offset; + struct extab_info *ei; + + ei = exidx ? &ri->tabs.exidx : &ri->tabs.extab; + offset = addr - ei->addr; + + return ei->mmap_offset + offset + (unsigned long)ri->mmap->data; +} + +static inline unsigned long +mmap_addr_to_ex_addr(unsigned long addr, + struct ex_region_info *ri, + int exidx) +{ + unsigned long offset; + struct extab_info *ei; + + ei = exidx ? &ri->tabs.exidx : &ri->tabs.extab; + offset = addr - ei->mmap_offset - (unsigned long)ri->mmap->data; + + return ei->addr + offset; +} + +static inline u32 +prel31_to_addr(const u32 *ptr) +{ + u32 value; + s32 offset; + + if (read_user_data(ptr, value)) + return 0; + + /* sign-extend to 32 bits */ + offset = (((s32)value) << 1) >> 1; + return (u32)(unsigned long)ptr + offset; +} + +static unsigned long +mmap_prel31_to_addr(const u32 *ptr, struct ex_region_info *ri, + int is_src_exidx, int is_dst_exidx, int to_mmap) +{ + u32 value, addr; + unsigned long addr_res; + s32 offset; + struct extab_info *ei_src, *ei_dst; + + ei_src = is_src_exidx ? &ri->tabs.exidx : &ri->tabs.extab; + ei_dst = is_dst_exidx ? 
&ri->tabs.exidx : &ri->tabs.extab; + + value = *ptr; + offset = (((s32)value) << 1) >> 1; + + addr = mmap_addr_to_ex_addr((unsigned long)ptr, ri, is_src_exidx); + addr += offset; + addr_res = addr; + + if (to_mmap) + addr_res = ex_addr_to_mmap_addr(addr_res, ri, is_dst_exidx); + + return addr_res; +} + static int -add_ex_region(struct ex_region_info *new_entry) +add_ex_region(struct regions_data *rd, + struct ex_region_info *new_entry) { unsigned int i_min, i_max, mid; - struct ex_region_info *array = ctx.regions; - unsigned long size = ctx.ri_nr; + struct ex_region_info *array = rd->entries; + unsigned long size = rd->curr_nr; if (!array) return 0; @@ -175,12 +295,61 @@ add_ex_region(struct ex_region_info *new_entry) } } +static int +remove_ex_region(struct regions_data *rd, + struct ex_region_info *entry) +{ + unsigned int i_min, i_max, mid; + struct ex_region_info *array = rd->entries; + unsigned long size = rd->curr_nr; + + if (!array) + return 0; + + if (size == 0) + return 0; + + if (size == 1) { + if (array[0].vm_start == entry->vm_start) + return 1; + else + return 0; + } + + if (array[0].vm_start > entry->vm_start) + return 0; + else if (array[size - 1].vm_start < entry->vm_start) + return 0; + + i_min = 0; + i_max = size; + + while (i_min < i_max) { + mid = i_min + (i_max - i_min) / 2; + + if (entry->vm_start <= array[mid].vm_start) + i_max = mid; + else + i_min = mid + 1; + } + + if (array[i_max].vm_start == entry->vm_start) { + memmove(array + i_max, + array + i_max + 1, + (size - i_max) * sizeof(*array)); + return 1; + } else { + return 0; + } +} + static struct ex_region_info * -search_ex_region(unsigned long key, struct extables *tabs) +search_ex_region(struct ex_region_info *array, + unsigned long size, + unsigned long key, + struct ex_region_info *ri) { unsigned int i_min, i_max, mid; - struct ex_region_info *array = ctx.regions; - unsigned long size = ctx.ri_nr; if (size == 0) return NULL; @@ -198,288 +367,253 @@ search_ex_region(unsigned long 
key, struct extables *tabs) } if (array[i_max].vm_start == key) { - memcpy(tabs, &array[i_max].tabs, sizeof(*tabs)); + memcpy(ri, &array[i_max], sizeof(*ri)); return &array[i_max]; } return NULL; } -static void pin_user_pages(struct extables *tabs) +static long +__search_ex_region(unsigned long key, struct ex_region_info *ri) { - long ret; - struct extab_info *ti; - unsigned long nr_pages, addr; - struct pid *pid_s; - struct task_struct *task = NULL; - struct mm_struct *mm; + struct regions_data *rd; + struct ex_region_info *ri_p = NULL; rcu_read_lock(); - pid_s = find_vpid(ctx.pid); - if (pid_s) - task = pid_task(pid_s, PIDTYPE_PID); - - rcu_read_unlock(); - - if (!task) - return; - - mm = task->mm; - if (!mm) - return; + rd = rcu_dereference(ctx.rd); + if (!rd) + goto out; - down_write(&mm->mmap_sem); + ri_p = search_ex_region(rd->entries, rd->curr_nr, key, ri); - ti = &tabs->exidx; - addr = ti->addr & PAGE_MASK; - nr_pages = GET_NR_PAGES(ti->addr, ti->length); - - ret = get_user_pages(task, mm, addr, nr_pages, 0, 0, - NULL, NULL); - if (ret < 0) { - pr_debug("%s: warning: addr/nr_pages: %#lx/%lu\n", - __func__, ti->addr, nr_pages); - goto error_out; - } - - ctx.pinned_pages += ret; - ctx.pinned_size += ti->length; +out: + rcu_read_unlock(); + return ri_p ? 
0 : -ENOENT; +} - pr_debug("%s: pin exidx: addr/nr_pages: %#lx/%lu\n", - __func__, ti->addr, nr_pages); +static struct regions_data *rd_alloc(unsigned long size) +{ + struct regions_data *rd; - ti = &tabs->extab; - addr = ti->addr & PAGE_MASK; - nr_pages = GET_NR_PAGES(ti->addr, ti->length); + rd = kzalloc(sizeof(*rd), GFP_KERNEL); + if (!rd) + return NULL; - ret = get_user_pages(task, mm, addr, nr_pages, 0, 0, - NULL, NULL); - if (ret < 0) { - pr_debug("%s: warning: addr/nr_pages: %#lx/%lu\n", - __func__, ti->addr, nr_pages); - goto error_out; + rd->entries = kzalloc(size * sizeof(*rd->entries), GFP_KERNEL); + if (!rd->entries) { + kfree(rd); + return NULL; } - ctx.pinned_pages += ret; - ctx.pinned_size += ti->length; + rd->size = size; + rd->curr_nr = 0; - pr_debug("%s: pin extab: addr/nr_pages: %#lx/%lu\n", - __func__, ti->addr, nr_pages); - -error_out: - up_write(&mm->mmap_sem); + return rd; } -static void -pin_user_pages_work(struct work_struct *w) +static void rd_free(struct regions_data *rd) { - struct extables tabs; - struct ex_region_info *ri; - struct pin_pages_work *work; - - work = container_of(w, struct pin_pages_work, work); + if (rd) + kfree(rd->entries); - spin_lock(&ctx.lock); - ri = search_ex_region(work->vm_start, &tabs); - spin_unlock(&ctx.lock); - if (ri) - pin_user_pages(&tabs); - - kfree(w); + kfree(rd); } -static int -__pin_user_pages(unsigned long vm_start) +static void rd_free_rcu(struct rcu_head *rh) { - struct pin_pages_work *work; - - work = kmalloc(sizeof(*work), GFP_ATOMIC); - if (!work) - return -ENOMEM; - - INIT_WORK(&work->work, pin_user_pages_work); - work->vm_start = vm_start; - - schedule_work(&work->work); - - return 0; + struct regions_data *rd = container_of(rh, struct regions_data, rcu); + rd_free(rd); } -int quadd_unwind_set_extab(struct quadd_extables *extabs) +int quadd_unwind_set_extab(struct quadd_extables *extabs, + struct quadd_extabs_mmap *mmap) { int err = 0; + unsigned long nr_entries, nr_added, new_size; struct 
ex_region_info ri_entry; struct extab_info *ti; + struct regions_data *rd, *rd_new; + struct ex_region_info *ex_entry; spin_lock(&ctx.lock); - if (!ctx.regions) { + rd = rcu_dereference(ctx.rd); + if (!rd) { + pr_warn("%s: warning: rd\n", __func__); + new_size = QUADD_EXTABS_SIZE; + nr_entries = 0; + } else { + new_size = rd->size; + nr_entries = rd->curr_nr; + } + + if (nr_entries >= new_size) + new_size += new_size >> 1; + + rd_new = rd_alloc(new_size); + if (IS_ERR_OR_NULL(rd_new)) { + pr_err("%s: error: rd_alloc\n", __func__); err = -ENOMEM; goto error_out; } - if (ctx.ri_nr >= ctx.ri_size) { - struct ex_region_info *new_regions; - unsigned long newlen = ctx.ri_size + (ctx.ri_size >> 1); + if (rd && nr_entries) + memcpy(rd_new->entries, rd->entries, + nr_entries * sizeof(*rd->entries)); - new_regions = krealloc(ctx.regions, newlen, GFP_KERNEL); - if (!new_regions) { - err = -ENOMEM; - goto error_out; - } - ctx.regions = new_regions; - ctx.ri_size = newlen; - } + rd_new->curr_nr = nr_entries; ri_entry.vm_start = extabs->vm_start; ri_entry.vm_end = extabs->vm_end; + ri_entry.mmap = mmap; + ti = &ri_entry.tabs.exidx; ti->addr = extabs->exidx.addr; ti->length = extabs->exidx.length; + ti->mmap_offset = extabs->reserved[QUADD_EXT_IDX_EXIDX_OFFSET]; + ctx.ex_tables_size += ti->length; ti = &ri_entry.tabs.extab; ti->addr = extabs->extab.addr; ti->length = extabs->extab.length; + ti->mmap_offset = extabs->reserved[QUADD_EXT_IDX_EXTAB_OFFSET]; + ctx.ex_tables_size += ti->length; - ctx.ri_nr += add_ex_region(&ri_entry); + nr_added = add_ex_region(rd_new, &ri_entry); + if (nr_added == 0) + goto error_free; + rd_new->curr_nr += nr_added; - spin_unlock(&ctx.lock); + ex_entry = kzalloc(sizeof(*ex_entry), GFP_KERNEL); + if (!ex_entry) { + err = -ENOMEM; + goto error_free; + } + memcpy(ex_entry, &ri_entry, sizeof(*ex_entry)); + + INIT_LIST_HEAD(&ex_entry->list); + list_add_tail(&ex_entry->list, &mmap->ex_entries); - __pin_user_pages(ri_entry.vm_start); + 
rcu_assign_pointer(ctx.rd, rd_new); + + if (rd) + call_rcu(&rd->rcu, rd_free_rcu); + + spin_unlock(&ctx.lock); return 0; +error_free: + rd_free(rd_new); error_out: spin_unlock(&ctx.lock); return err; } -static u32 -prel31_to_addr(const u32 *ptr) +static int +clean_mmap(struct regions_data *rd, struct quadd_extabs_mmap *mmap, int rm_ext) { - u32 value; - s32 offset; + int nr_removed = 0; + struct ex_region_info *entry, *next; - if (read_user_data(ptr, value)) + if (!rd || !mmap) return 0; - /* sign-extend to 32 bits */ - offset = (((s32)value) << 1) >> 1; - return (u32)(unsigned long)ptr + offset; -} + list_for_each_entry_safe(entry, next, &mmap->ex_entries, list) { + if (rm_ext) + nr_removed += remove_ex_region(rd, entry); -static const struct unwind_idx * -unwind_find_origin(const struct unwind_idx *start, - const struct unwind_idx *stop) -{ - while (start < stop) { - u32 addr_offset; - const struct unwind_idx *mid = start + ((stop - start) >> 1); - - if (read_user_data(&mid->addr_offset, addr_offset)) - return ERR_PTR(-EFAULT); - - if (addr_offset >= 0x40000000) - /* negative offset */ - start = mid + 1; - else - /* positive offset */ - stop = mid; + list_del(&entry->list); + kfree(entry); } - return stop; + return nr_removed; } -/* - * Binary search in the unwind index. The entries are - * guaranteed to be sorted in ascending order by the linker. - * - * start = first entry - * origin = first entry with positive offset (or stop if there is no such entry) - * stop - 1 = last entry - */ -static const struct unwind_idx * -search_index(u32 addr, - const struct unwind_idx *start, - const struct unwind_idx *origin, - const struct unwind_idx *stop) +void quadd_unwind_delete_mmap(struct quadd_extabs_mmap *mmap) { - u32 addr_prel31; + unsigned long nr_entries, nr_removed, new_size; + struct regions_data *rd, *rd_new; - pr_debug("%#x, %p, %p, %p\n", addr, start, origin, stop); - - /* - * only search in the section with the matching sign. 
This way the - * prel31 numbers can be compared as unsigned longs. - */ - if (addr < (u32)(unsigned long)start) - /* negative offsets: [start; origin) */ - stop = origin; - else - /* positive offsets: [origin; stop) */ - start = origin; + if (!mmap) + return; - /* prel31 for address relavive to start */ - addr_prel31 = (addr - (u32)(unsigned long)start) & 0x7fffffff; + spin_lock(&ctx.lock); - while (start < stop - 1) { - u32 addr_offset, d; + rd = rcu_dereference(ctx.rd); + if (!rd || !rd->curr_nr) + goto error_out; - const struct unwind_idx *mid = start + ((stop - start) >> 1); + nr_entries = rd->curr_nr; + new_size = min_t(unsigned long, rd->size, nr_entries); - /* - * As addr_prel31 is relative to start an offset is needed to - * make it relative to mid. - */ - if (read_user_data(&mid->addr_offset, addr_offset)) - return ERR_PTR(-EFAULT); + rd_new = rd_alloc(new_size); + if (IS_ERR_OR_NULL(rd_new)) { + pr_err("%s: error: rd_alloc\n", __func__); + goto error_out; + } + rd_new->size = new_size; + rd_new->curr_nr = nr_entries; - d = (u32)(unsigned long)mid - (u32)(unsigned long)start; + memcpy(rd_new->entries, rd->entries, + nr_entries * sizeof(*rd->entries)); - if (addr_prel31 - d < addr_offset) { - stop = mid; - } else { - /* keep addr_prel31 relative to start */ - addr_prel31 -= ((u32)(unsigned long)mid - - (u32)(unsigned long)start); - start = mid; - } - } + nr_removed = clean_mmap(rd_new, mmap, 1); + rd_new->curr_nr -= nr_removed; - if (likely(start->addr_offset <= addr_prel31)) - return start; + rcu_assign_pointer(ctx.rd, rd_new); + call_rcu(&rd->rcu, rd_free_rcu); - pr_debug("Unknown address %#x\n", addr); - return NULL; +error_out: + spin_unlock(&ctx.lock); } static const struct unwind_idx * -unwind_find_idx(struct extab_info *exidx, u32 addr) +unwind_find_idx(struct ex_region_info *ri, u32 addr) { - const struct unwind_idx *start; - const struct unwind_idx *origin; - const struct unwind_idx *stop; - const struct unwind_idx *idx = NULL; + unsigned long 
length; + u32 value; + struct unwind_idx *start; + struct unwind_idx *stop; + struct unwind_idx *mid = NULL; + length = ri->tabs.exidx.length / sizeof(*start); + + if (unlikely(!length)) + return NULL; + + start = (struct unwind_idx *)((char *)ri->mmap->data + + ri->tabs.exidx.mmap_offset); + stop = start + length - 1; - start = (const struct unwind_idx *)exidx->addr; - stop = start + exidx->length / sizeof(*start); + value = (u32)mmap_prel31_to_addr(&start->addr_offset, ri, 1, 0, 0); + if (addr < value) + return NULL; - origin = unwind_find_origin(start, stop); - if (IS_ERR(origin)) - return origin; + value = (u32)mmap_prel31_to_addr(&stop->addr_offset, ri, 1, 0, 0); + if (addr >= value) + return NULL; - idx = search_index(addr, start, origin, stop); + while (start < stop - 1) { + mid = start + ((stop - start) >> 1); - pr_debug("addr: %#x, start: %p, origin: %p, stop: %p, idx: %p\n", - addr, start, origin, stop, idx); + value = (u32)mmap_prel31_to_addr(&mid->addr_offset, + ri, 1, 0, 0); - return idx; + if (addr < value) + stop = mid; + else + start = mid; + } + + return start; } static unsigned long -unwind_get_byte(struct unwind_ctrl_block *ctrl, long *err) +unwind_get_byte(struct quadd_extabs_mmap *mmap, + struct unwind_ctrl_block *ctrl, long *err) { unsigned long ret; u32 insn_word; @@ -487,12 +621,12 @@ unwind_get_byte(struct unwind_ctrl_block *ctrl, long *err) *err = 0; if (ctrl->entries <= 0) { - pr_debug("error: corrupt unwind table\n"); + pr_err_once("%s: error: corrupt unwind table\n", __func__); *err = -QUADD_URC_TBL_IS_CORRUPT; return 0; } - *err = read_user_data(ctrl->insn, insn_word); + *err = read_mmap_data(mmap, ctrl->insn, &insn_word); if (*err < 0) return 0; @@ -511,11 +645,13 @@ unwind_get_byte(struct unwind_ctrl_block *ctrl, long *err) /* * Execute the current unwind instruction. 
*/ -static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) +static long +unwind_exec_insn(struct quadd_extabs_mmap *mmap, + struct unwind_ctrl_block *ctrl) { long err; unsigned int i; - unsigned long insn = unwind_get_byte(ctrl, &err); + unsigned long insn = unwind_get_byte(mmap, ctrl, &err); if (err < 0) return err; @@ -537,7 +673,7 @@ static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) u32 *vsp = (u32 *)(unsigned long)ctrl->vrs[SP]; int load_sp, reg = 4; - insn = (insn << 8) | unwind_get_byte(ctrl, &err); + insn = (insn << 8) | unwind_get_byte(mmap, ctrl, &err); if (err < 0) return err; @@ -600,7 +736,7 @@ static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) pr_debug("CMD_FINISH\n"); } else if (insn == 0xb1) { - unsigned long mask = unwind_get_byte(ctrl, &err); + unsigned long mask = unwind_get_byte(mmap, ctrl, &err); u32 *vsp = (u32 *)(unsigned long)ctrl->vrs[SP]; int reg = 0; @@ -629,7 +765,7 @@ static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) ctrl->vrs[SP] = (u32)(unsigned long)vsp; pr_debug("new vsp: %#x\n", ctrl->vrs[SP]); } else if (insn == 0xb2) { - unsigned long uleb128 = unwind_get_byte(ctrl, &err); + unsigned long uleb128 = unwind_get_byte(mmap, ctrl, &err); if (err < 0) return err; @@ -641,7 +777,7 @@ static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) unsigned long data, reg_from, reg_to; u32 *vsp = (u32 *)(unsigned long)ctrl->vrs[SP]; - data = unwind_get_byte(ctrl, &err); + data = unwind_get_byte(mmap, ctrl, &err); if (err < 0) return err; @@ -701,9 +837,10 @@ static long unwind_exec_insn(struct unwind_ctrl_block *ctrl) * updates the *pc and *sp with the new values. 
*/ static long -unwind_frame(struct extab_info *exidx, +unwind_frame(struct ex_region_info *ri, struct stackframe *frame, - struct vm_area_struct *vma_sp) + struct vm_area_struct *vma_sp, + unsigned int *unw_type) { unsigned long high, low; const struct unwind_idx *idx; @@ -721,7 +858,7 @@ unwind_frame(struct extab_info *exidx, pr_debug("pc: %#lx, lr: %#lx, sp:%#lx, low/high: %#lx/%#lx\n", frame->pc, frame->lr, frame->sp, low, high); - idx = unwind_find_idx(exidx, frame->pc); + idx = unwind_find_idx(ri, frame->pc); if (IS_ERR_OR_NULL(idx)) return -QUADD_URC_IDX_NOT_FOUND; @@ -734,7 +871,7 @@ unwind_frame(struct extab_info *exidx, ctrl.vrs[LR] = frame->lr; ctrl.vrs[PC] = 0; - err = read_user_data(&idx->insn, val); + err = read_mmap_data(ri->mmap, &idx->insn, &val); if (err < 0) return err; @@ -743,7 +880,8 @@ unwind_frame(struct extab_info *exidx, return -QUADD_URC_CANTUNWIND; } else if ((val & 0x80000000) == 0) { /* prel31 to the unwind table */ - ctrl.insn = (u32 *)(unsigned long)prel31_to_addr(&idx->insn); + ctrl.insn = (u32 *)(unsigned long) + mmap_prel31_to_addr(&idx->insn, ri, 1, 0, 1); if (!ctrl.insn) return -QUADD_URC_EACCESS; } else if ((val & 0xff000000) == 0x80000000) { @@ -755,7 +893,7 @@ unwind_frame(struct extab_info *exidx, return -QUADD_URC_UNSUPPORTED_PR; } - err = read_user_data(ctrl.insn, val); + err = read_mmap_data(ri->mmap, ctrl.insn, &val); if (err < 0) return err; @@ -773,7 +911,7 @@ unwind_frame(struct extab_info *exidx, } while (ctrl.entries > 0) { - err = unwind_exec_insn(&ctrl); + err = unwind_exec_insn(ri->mmap, &ctrl); if (err < 0) return err; @@ -782,12 +920,12 @@ unwind_frame(struct extab_info *exidx, return -QUADD_URC_SP_INCORRECT; } - if (ctrl.vrs[PC] == 0) + if (ctrl.vrs[PC] == 0) { ctrl.vrs[PC] = ctrl.vrs[LR]; - - /* check for infinite loop */ - if (frame->pc == ctrl.vrs[PC]) - return -QUADD_URC_FAILURE; + *unw_type = QUADD_UNW_TYPE_LR_UT; + } else { + *unw_type = QUADD_UNW_TYPE_UT; + } if (!validate_pc_addr(ctrl.vrs[PC], 
sizeof(u32))) return -QUADD_URC_PC_INCORRECT; @@ -804,67 +942,53 @@ unwind_frame(struct extab_info *exidx, static void unwind_backtrace(struct quadd_callchain *cc, - struct extab_info *exidx, - struct pt_regs *regs, + struct ex_region_info *ri, + struct stackframe *frame, struct vm_area_struct *vma_sp, struct task_struct *task) { - struct extables tabs; - struct stackframe frame; - -#ifdef CONFIG_ARM64 - frame.fp_thumb = regs->compat_usr(7); - frame.fp_arm = regs->compat_usr(11); -#else - frame.fp_thumb = regs->ARM_r7; - frame.fp_arm = regs->ARM_fp; -#endif - - frame.pc = instruction_pointer(regs); - frame.sp = quadd_user_stack_pointer(regs); - frame.lr = quadd_user_link_register(regs); + unsigned int unw_type; + struct ex_region_info ri_new; cc->unw_rc = QUADD_URC_FAILURE; pr_debug("fp_arm: %#lx, fp_thumb: %#lx, sp: %#lx, lr: %#lx, pc: %#lx\n", - frame.fp_arm, frame.fp_thumb, frame.sp, frame.lr, frame.pc); + frame->fp_arm, frame->fp_thumb, + frame->sp, frame->lr, frame->pc); pr_debug("vma_sp: %#lx - %#lx, length: %#lx\n", vma_sp->vm_start, vma_sp->vm_end, vma_sp->vm_end - vma_sp->vm_start); while (1) { long err; - unsigned long where = frame.pc; + int nr_added; + unsigned long where = frame->pc; struct vm_area_struct *vma_pc; struct mm_struct *mm = task->mm; if (!mm) break; - if (!validate_stack_addr(frame.sp, vma_sp, sizeof(u32))) { + if (!validate_stack_addr(frame->sp, vma_sp, sizeof(u32))) { cc->unw_rc = -QUADD_URC_SP_INCORRECT; break; } - vma_pc = find_vma(mm, frame.pc); + vma_pc = find_vma(mm, frame->pc); if (!vma_pc) break; - if (!is_vma_addr(exidx->addr, vma_pc, sizeof(u32))) { - struct ex_region_info *ri; - - spin_lock(&ctx.lock); - ri = search_ex_region(vma_pc->vm_start, &tabs); - spin_unlock(&ctx.lock); - if (!ri) { + if (!is_vma_addr(ri->tabs.exidx.addr, vma_pc, sizeof(u32))) { + err = __search_ex_region(vma_pc->vm_start, &ri_new); + if (err) { cc->unw_rc = QUADD_URC_TBL_NOT_EXIST; break; } - exidx = &tabs.exidx; + ri = &ri_new; } - err = 
unwind_frame(exidx, &frame, vma_sp); + err = unwind_frame(ri, frame, vma_sp, &unw_type); if (err < 0) { pr_debug("end unwind, urc: %ld\n", err); cc->unw_rc = -err; @@ -872,12 +996,15 @@ unwind_backtrace(struct quadd_callchain *cc, } pr_debug("function at [<%08lx>] from [<%08lx>]\n", - where, frame.pc); + where, frame->pc); - quadd_callchain_store(cc, frame.pc); + cc->curr_sp = frame->sp; + cc->curr_fp = frame->fp_arm; + cc->curr_pc = frame->pc; - cc->curr_sp = frame.sp; - cc->curr_fp = frame.fp_arm; + nr_added = quadd_callchain_store(cc, frame->pc, unw_type); + if (nr_added == 0) + break; } } @@ -886,14 +1013,13 @@ quadd_get_user_callchain_ut(struct pt_regs *regs, struct quadd_callchain *cc, struct task_struct *task) { - unsigned long ip, sp; + long err; + int nr_prev = cc->nr; + unsigned long ip, sp, lr; struct vm_area_struct *vma, *vma_sp; struct mm_struct *mm = task->mm; - struct ex_region_info *ri; - struct extables tabs; - - cc->unw_method = QUADD_UNW_METHOD_EHT; - cc->unw_rc = QUADD_URC_FAILURE; + struct ex_region_info ri; + struct stackframe frame; #ifdef CONFIG_ARM64 if (!compat_user_mode(regs)) { @@ -902,11 +1028,38 @@ quadd_get_user_callchain_ut(struct pt_regs *regs, } #endif + if (cc->unw_rc == QUADD_URC_LEVEL_TOO_DEEP) + return nr_prev; + + cc->unw_rc = QUADD_URC_FAILURE; + if (!regs || !mm) return 0; - ip = instruction_pointer(regs); - sp = quadd_user_stack_pointer(regs); + if (nr_prev > 0) { + ip = cc->curr_pc; + sp = cc->curr_sp; + lr = 0; + + frame.fp_thumb = 0; + frame.fp_arm = cc->curr_fp; + } else { + ip = instruction_pointer(regs); + sp = quadd_user_stack_pointer(regs); + lr = quadd_user_link_register(regs); + +#ifdef CONFIG_ARM64 + frame.fp_thumb = regs->compat_usr(7); + frame.fp_arm = regs->compat_usr(11); +#else + frame.fp_thumb = regs->ARM_r7; + frame.fp_arm = regs->ARM_fp; +#endif + } + + frame.pc = ip; + frame.sp = sp; + frame.lr = lr; vma = find_vma(mm, ip); if (!vma) @@ -916,41 +1069,85 @@ quadd_get_user_callchain_ut(struct pt_regs 
*regs,
 	if (!vma_sp)
 		return 0;
 
-	spin_lock(&ctx.lock);
-	ri = search_ex_region(vma->vm_start, &tabs);
-	spin_unlock(&ctx.lock);
-	if (!ri) {
+	err = __search_ex_region(vma->vm_start, &ri);
+	if (err) {
 		cc->unw_rc = QUADD_URC_TBL_NOT_EXIST;
 		return 0;
 	}
 
-	unwind_backtrace(cc, &tabs.exidx, regs, vma_sp, task);
+	unwind_backtrace(cc, &ri, &frame, vma_sp, task);
 
 	return cc->nr;
 }
 
-int quadd_unwind_start(struct task_struct *task)
+int
+quadd_is_ex_entry_exist(struct pt_regs *regs,
+			unsigned long addr,
+			struct task_struct *task)
 {
-	spin_lock(&ctx.lock);
+	long err;
+	u32 value;
+	const struct unwind_idx *idx;
+	struct ex_region_info ri;
+	struct vm_area_struct *vma;
+	struct mm_struct *mm = task->mm;
 
-	kfree(ctx.regions);
+	if (!regs || !mm)
+		return 0;
 
-	ctx.ri_nr = 0;
-	ctx.ri_size = 0;
+#ifdef CONFIG_ARM64
+	if (!compat_user_mode(regs))
+		return 0;
+#endif
 
-	ctx.pinned_pages = 0;
-	ctx.pinned_size = 0;
+	vma = find_vma(mm, addr);
+	if (!vma)
+		return 0;
 
-	ctx.regions = kzalloc(QUADD_EXTABS_SIZE * sizeof(*ctx.regions),
-			      GFP_KERNEL);
-	if (!ctx.regions) {
+	err = __search_ex_region(vma->vm_start, &ri);
+	if (err)
+		return 0;
+
+	idx = unwind_find_idx(&ri, addr);
+	if (IS_ERR_OR_NULL(idx))
+		return 0;
+
+	err = read_mmap_data(ri.mmap, &idx->insn, &value);
+	if (err < 0)
+		return 0;
+
+	if (value == 1)
+		return 0;
+
+	return 1;
+}
+
+int quadd_unwind_start(struct task_struct *task)
+{
+	struct regions_data *rd, *rd_old;
+
+	spin_lock(&ctx.lock);
+
+	rd_old = rcu_dereference(ctx.rd);
+	if (rd_old)
+		pr_warn("%s: warning: rd_old\n", __func__);
+
+	rd = rd_alloc(QUADD_EXTABS_SIZE);
+	if (IS_ERR_OR_NULL(rd)) {
+		pr_err("%s: error: rd_alloc\n", __func__);
 		spin_unlock(&ctx.lock);
 		return -ENOMEM;
 	}
-	ctx.ri_size = QUADD_EXTABS_SIZE;
+
+	rcu_assign_pointer(ctx.rd, rd);
+
+	if (rd_old)
+		call_rcu(&rd_old->rcu, rd_free_rcu);
 
 	ctx.pid = task->tgid;
+	ctx.ex_tables_size = 0;
+
 	spin_unlock(&ctx.lock);
 
 	return 0;
@@ -958,31 +1155,40 @@ int quadd_unwind_start(struct task_struct *task)
 
 void quadd_unwind_stop(void)
 {
+	int i;
+	unsigned long nr_entries, size;
+	struct regions_data *rd;
+	struct ex_region_info *ri;
+
 	spin_lock(&ctx.lock);
 
-	kfree(ctx.regions);
-	ctx.regions = NULL;
+	ctx.pid = 0;
+
+	rd = rcu_dereference(ctx.rd);
+	if (!rd)
+		goto out;
 
-	ctx.ri_size = 0;
-	ctx.ri_nr = 0;
+	nr_entries = rd->curr_nr;
+	size = rd->size;
 
-	ctx.pid = 0;
+	for (i = 0; i < nr_entries; i++) {
+		ri = &rd->entries[i];
+		clean_mmap(rd, ri->mmap, 0);
+	}
 
-	spin_unlock(&ctx.lock);
+	rcu_assign_pointer(ctx.rd, NULL);
+	call_rcu(&rd->rcu, rd_free_rcu);
 
-	pr_info("exception tables size: %lu bytes\n", ctx.pinned_size);
-	pr_info("pinned pages: %lu (%lu bytes)\n", ctx.pinned_pages,
-		ctx.pinned_pages * PAGE_SIZE);
+out:
+	spin_unlock(&ctx.lock);
+	pr_info("exception tables size: %lu bytes\n", ctx.ex_tables_size);
 }
 
 int quadd_unwind_init(void)
 {
-	ctx.regions = NULL;
-	ctx.ri_size = 0;
-	ctx.ri_nr = 0;
-	ctx.pid = 0;
-
 	spin_lock_init(&ctx.lock);
+	rcu_assign_pointer(ctx.rd, NULL);
+	ctx.pid = 0;
 
 	return 0;
 }
@@ -990,4 +1196,5 @@ int quadd_unwind_init(void)
 void quadd_unwind_deinit(void)
 {
 	quadd_unwind_stop();
+	rcu_barrier();
 }
diff --git a/drivers/misc/tegra-profiler/eh_unwind.h b/drivers/misc/tegra-profiler/eh_unwind.h
index 1f8b7becac7e..6723cb72680a 100644
--- a/drivers/misc/tegra-profiler/eh_unwind.h
+++ b/drivers/misc/tegra-profiler/eh_unwind.h
@@ -22,6 +22,7 @@ struct quadd_callchain;
 struct quadd_ctx;
 struct quadd_extables;
 struct task_struct;
+struct quadd_extabs_mmap;
 
 unsigned int
 quadd_get_user_callchain_ut(struct pt_regs *regs,
@@ -34,6 +35,13 @@ void quadd_unwind_deinit(void);
 int quadd_unwind_start(struct task_struct *task);
 void quadd_unwind_stop(void);
 
-int quadd_unwind_set_extab(struct quadd_extables *extabs);
+int quadd_unwind_set_extab(struct quadd_extables *extabs,
+			   struct quadd_extabs_mmap *mmap);
+void quadd_unwind_delete_mmap(struct quadd_extabs_mmap *mmap);
+
+int
+quadd_is_ex_entry_exist(struct pt_regs *regs,
+			unsigned long addr,
+			struct task_struct *task);
 
 #endif /* __QUADD_EH_UNWIND_H__ */
diff --git a/drivers/misc/tegra-profiler/hrt.c b/drivers/misc/tegra-profiler/hrt.c
index c03d53832c6b..22ba32ef37c3 100644
--- a/drivers/misc/tegra-profiler/hrt.c
+++ b/drivers/misc/tegra-profiler/hrt.c
@@ -23,9 +23,12 @@
 #include <linux/ptrace.h>
 #include <linux/interrupt.h>
 #include <linux/err.h>
+#include <linux/nsproxy.h>
+#include <clocksource/arm_arch_timer.h>
 
 #include <asm/cputype.h>
 #include <asm/irq_regs.h>
+#include <asm/arch_timer.h>
 
 #include <linux/tegra_profiler.h>
 
@@ -90,7 +93,7 @@ static void init_hrtimer(struct quadd_cpu_context *cpu_ctx)
 	cpu_ctx->hrtimer.function = hrtimer_handler;
 }
 
-u64 quadd_get_time(void)
+static inline u64 get_posix_clock_monotonic_time(void)
 {
 	struct timespec ts;
 
@@ -98,6 +101,25 @@ u64 quadd_get_time(void)
 	return timespec_to_ns(&ts);
 }
 
+static inline u64 get_arch_time(struct timecounter *tc)
+{
+	cycle_t value;
+	const struct cyclecounter *cc = tc->cc;
+
+	value = cc->read(cc);
+	return cyclecounter_cyc2ns(cc, value);
+}
+
+u64 quadd_get_time(void)
+{
+	struct timecounter *tc = hrt.tc;
+
+	if (tc)
+		return get_arch_time(tc);
+	else
+		return get_posix_clock_monotonic_time();
+}
+
 static void put_header(void)
 {
 	int nr_events = 0, max_events = QUADD_MAX_COUNTERS;
@@ -137,6 +159,8 @@ static void put_header(void)
 	hdr->reserved = 0;
 	hdr->extra_length = 0;
 
+	hdr->reserved |= hrt.unw_method << QUADD_HDR_UNW_METHOD_SHIFT;
+
 	if (pmu)
 		nr_events += pmu->get_current_events(events, max_events);
 
@@ -161,6 +185,31 @@ void quadd_put_sample(struct quadd_record_data *data,
 	atomic64_inc(&hrt.counter_samples);
 }
 
+static void
+put_sched_sample(struct task_struct *task, int is_sched_in)
+{
+	unsigned int cpu, flags;
+	struct quadd_record_data record;
+	struct quadd_sched_data *s = &record.sched;
+
+	record.record_type = QUADD_RECORD_TYPE_SCHED;
+
+	cpu = quadd_get_processor_id(NULL, &flags);
+	s->cpu = cpu;
+	s->lp_mode = (flags & QUADD_CPUMODE_TEGRA_POWER_CLUSTER_LP) ? 1 : 0;
+
+	s->sched_in = is_sched_in ? 1 : 0;
+	s->time = quadd_get_time();
+	s->pid = task->pid;
+
+	s->reserved = 0;
+
+	s->data[0] = 0;
+	s->data[1] = 0;
+
+	quadd_put_sample(&record, NULL, 0);
+}
+
 static int get_sample_data(struct quadd_sample_data *sample,
 			   struct pt_regs *regs,
 			   struct task_struct *task)
@@ -236,7 +285,7 @@ read_all_sources(struct pt_regs *regs, struct task_struct *task)
 	int i, vec_idx = 0, bt_size = 0;
 	int nr_events = 0, nr_positive_events = 0;
 	struct pt_regs *user_regs;
-	struct quadd_iovec vec[4];
+	struct quadd_iovec vec[5];
 	struct hrt_event_value events[QUADD_MAX_COUNTERS];
 	u32 events_extra[QUADD_MAX_COUNTERS];
 
@@ -253,22 +302,15 @@ read_all_sources(struct pt_regs *regs, struct task_struct *task)
 	if (atomic_read(&cpu_ctx->nr_active) == 0)
 		return;
 
-	if (!task) {
-		pid_t pid;
-		struct pid *pid_s;
-		struct quadd_thread_data *t_data;
-
-		t_data = &cpu_ctx->active_thread;
-		pid = t_data->pid;
+	if (!task)
+		task = current;
 
-		rcu_read_lock();
-		pid_s = find_vpid(pid);
-		if (pid_s)
-			task = pid_task(pid_s, PIDTYPE_PID);
+	rcu_read_lock();
+	if (!task_nsproxy(task)) {
 		rcu_read_unlock();
-		if (!task)
-			return;
+		return;
 	}
+	rcu_read_unlock();
 
 	if (ctx->pmu && ctx->pmu_info.active)
 		nr_events += read_source(ctx->pmu, regs,
@@ -300,6 +342,7 @@ read_all_sources(struct pt_regs *regs, struct task_struct *task)
 	s->reserved = 0;
 
 	if (ctx->param.backtrace) {
+		cc->unw_method = hrt.unw_method;
 		bt_size = quadd_get_user_callchain(user_regs, cc, ctx, task);
 
 		if (!bt_size && !user_mode(regs)) {
@@ -311,23 +354,26 @@ read_all_sources(struct pt_regs *regs, struct task_struct *task)
 #else
 			cc->cs_64 = 0;
 #endif
-			bt_size += quadd_callchain_store(cc, pc);
+			bt_size += quadd_callchain_store(cc, pc,
+							 QUADD_UNW_TYPE_KCTX);
 		}
 
 		if (bt_size > 0) {
 			int ip_size = cc->cs_64 ? sizeof(u64) : sizeof(u32);
+			int nr_types = DIV_ROUND_UP(bt_size, 8);
 
 			vec[vec_idx].base = cc->cs_64 ?
 				(void *)cc->ip_64 : (void *)cc->ip_32;
 			vec[vec_idx].len = bt_size * ip_size;
 			vec_idx++;
+
+			vec[vec_idx].base = cc->types;
+			vec[vec_idx].len = nr_types * sizeof(cc->types[0]);
+			vec_idx++;
 		}
 
 		extra_data |= cc->unw_method << QUADD_SED_UNW_METHOD_SHIFT;
-
-		if (cc->unw_method == QUADD_UNW_METHOD_EHT ||
-		    cc->unw_method == QUADD_UNW_METHOD_MIXED)
-			s->reserved |= cc->unw_rc << QUADD_SAMPLE_URC_SHIFT;
+		s->reserved |= cc->unw_rc << QUADD_SAMPLE_URC_SHIFT;
 	}
 
 	s->callchain_nr = bt_size;
@@ -434,6 +480,8 @@ void __quadd_task_sched_in(struct task_struct *prev,
 	 */
 
 	if (is_profile_process(task)) {
+		put_sched_sample(task, 1);
+
 		add_active_thread(cpu_ctx, task->pid, task->tgid);
 		atomic_inc(&cpu_ctx->nr_active);
 
@@ -484,6 +532,8 @@ void __quadd_task_sched_out(struct task_struct *prev,
 			if (ctx->pmu)
 				ctx->pmu->stop();
 		}
+
+		put_sched_sample(prev, 0);
 	}
 }
 
@@ -553,6 +603,15 @@ int quadd_hrt_start(void)
 		}
 	}
 
+	if (extra & QUADD_PARAM_EXTRA_BT_MIXED)
+		hrt.unw_method = QUADD_UNW_METHOD_MIXED;
+	else if (extra & QUADD_PARAM_EXTRA_BT_UNWIND_TABLES)
+		hrt.unw_method = QUADD_UNW_METHOD_EHT;
+	else if (extra & QUADD_PARAM_EXTRA_BT_FP)
+		hrt.unw_method = QUADD_UNW_METHOD_FP;
+	else
+		hrt.unw_method = QUADD_UNW_METHOD_NONE;
+
 	if (ctx->pl310)
 		ctx->pl310->start();
 
@@ -618,6 +677,7 @@ struct quadd_hrt_ctx *quadd_hrt_init(struct quadd_ctx *ctx)
 	hrt.ma_period = 0;
 
 	atomic64_set(&hrt.counter_samples, 0);
+	hrt.tc = arch_timer_get_timecounter();
 
 	hrt.cpu_ctx = alloc_percpu(struct quadd_cpu_context);
 	if (!hrt.cpu_ctx)
diff --git a/drivers/misc/tegra-profiler/hrt.h b/drivers/misc/tegra-profiler/hrt.h
index f209dc116119..b32f037dba3e 100644
--- a/drivers/misc/tegra-profiler/hrt.h
+++ b/drivers/misc/tegra-profiler/hrt.h
@@ -39,6 +39,8 @@ struct quadd_cpu_context {
 	atomic_t nr_active;
 };
 
+struct timecounter;
+
 struct quadd_hrt_ctx {
 	struct quadd_cpu_context * __percpu cpu_ctx;
 	u64 sample_period;
@@ -54,9 +56,12 @@ struct quadd_hrt_ctx {
 
 	unsigned long vm_size_prev;
 	unsigned long rss_size_prev;
+
+	struct timecounter *tc;
+	unsigned int unw_method;
 };
 
-#define QUADD_HRT_MIN_FREQ 110
+#define QUADD_HRT_MIN_FREQ 100
 
 #define QUADD_U32_MAX (~(__u32)0)
 
diff --git a/drivers/misc/tegra-profiler/main.c b/drivers/misc/tegra-profiler/main.c
index dc7fddc3e39c..555cc259a0ce 100644
--- a/drivers/misc/tegra-profiler/main.c
+++ b/drivers/misc/tegra-profiler/main.c
@@ -158,7 +158,17 @@ static inline int is_event_supported(struct source_info *si, int event)
 	return 0;
 }
 
-static int set_parameters(struct quadd_parameters *p, uid_t *debug_app_uid)
+static int
+validate_freq(unsigned int freq)
+{
+	if (capable(CAP_SYS_ADMIN))
+		return freq >= 100 && freq <= 100000;
+	else
+		return freq == 100 || freq == 1000 || freq == 10000;
+}
+
+static int
+set_parameters(struct quadd_parameters *p, uid_t *debug_app_uid)
 {
 	int i, err;
 	int pmu_events_id[QUADD_MAX_COUNTERS];
@@ -168,9 +178,10 @@ static int set_parameters(struct quadd_parameters *p, uid_t *debug_app_uid)
 	struct task_struct *task;
 	unsigned int extra;
 
-	if (ctx.param.freq != 100 && ctx.param.freq != 1000 &&
-	    ctx.param.freq != 10000)
+	if (!validate_freq(p->freq)) {
+		pr_err("%s: incorrect frequency: %u\n", __func__, p->freq);
 		return -EINVAL;
+	}
 
 	ctx.param.freq = p->freq;
 	ctx.param.ma_freq = p->ma_freq;
@@ -284,6 +295,9 @@ static int set_parameters(struct quadd_parameters *p, uid_t *debug_app_uid)
 	if (extra & QUADD_PARAM_EXTRA_BT_FP)
 		pr_info("unwinding: frame pointers\n");
 
+	if (extra & QUADD_PARAM_EXTRA_BT_MIXED)
+		pr_info("unwinding: mixed mode\n");
+
 	quadd_unwind_start(task);
 
 	pr_info("New parameters have been applied\n");
@@ -416,6 +430,10 @@ static void get_capabilities(struct quadd_comm_cap *cap)
 	extra |= QUADD_COMM_CAP_EXTRA_SUPPORT_AARCH64;
 	extra |= QUADD_COMM_CAP_EXTRA_SPECIAL_ARCH_MMAP;
 	extra |= QUADD_COMM_CAP_EXTRA_UNWIND_MIXED;
+	extra |= QUADD_COMM_CAP_EXTRA_UNW_ENTRY_TYPE;
+
+	if (ctx.hrt->tc)
+		extra |= QUADD_COMM_CAP_EXTRA_USE_ARCH_TIMER;
 
 	cap->reserved[QUADD_COMM_CAP_IDX_EXTRA] = extra;
 }
@@ -436,9 +454,16 @@ void quadd_get_state(struct quadd_module_state *state)
 }
 
 static int
-set_extab(struct quadd_extables *extabs)
+set_extab(struct quadd_extables *extabs,
+	  struct quadd_extabs_mmap *mmap)
+{
+	return quadd_unwind_set_extab(extabs, mmap);
+}
+
+static void
+delete_mmap(struct quadd_extabs_mmap *mmap)
 {
-	return quadd_unwind_set_extab(extabs);
+	quadd_unwind_delete_mmap(mmap);
 }
 
 static struct quadd_comm_control_interface control = {
@@ -448,6 +473,7 @@ static struct quadd_comm_control_interface control = {
 	.get_capabilities = get_capabilities,
 	.get_state = quadd_get_state,
 	.set_extab = set_extab,
+	.delete_mmap = delete_mmap,
 };
 
 static int __init quadd_module_init(void)
diff --git a/drivers/misc/tegra-profiler/quadd.h b/drivers/misc/tegra-profiler/quadd.h
index 9de52c773722..c25835e29f09 100644
--- a/drivers/misc/tegra-profiler/quadd.h
+++ b/drivers/misc/tegra-profiler/quadd.h
@@ -25,6 +25,7 @@ struct event_data;
 struct quadd_comm_data_interface;
 struct quadd_hrt_ctx;
 struct quadd_module_state;
+struct quadd_arch_info;
 
 struct quadd_event_source_interface {
 	int (*enable)(void);
@@ -35,6 +36,7 @@ struct quadd_event_source_interface {
 	int (*set_events)(int *events, int size);
 	int (*get_supported_events)(int *events, int max_events);
 	int (*get_current_events)(int *events, int max_events);
+	struct quadd_arch_info * (*get_arch)(void);
 };
 
 struct source_info {
diff --git a/drivers/misc/tegra-profiler/quadd_proc.c b/drivers/misc/tegra-profiler/quadd_proc.c
index d7464f18a951..12f5fc90cd91 100644
--- a/drivers/misc/tegra-profiler/quadd_proc.c
+++ b/drivers/misc/tegra-profiler/quadd_proc.c
@@ -24,6 +24,7 @@
 #include "quadd.h"
 #include "version.h"
 #include "quadd_proc.h"
+#include "arm_pmu.h"
 
 #define YES_NO(x) ((x) ? "yes" : "no")
 
@@ -58,6 +59,10 @@ static int show_capabilities(struct seq_file *f, void *offset)
 	struct quadd_comm_cap *cap = &ctx->cap;
 	struct quadd_events_cap *event = &cap->events_cap;
 	unsigned int extra = cap->reserved[QUADD_COMM_CAP_IDX_EXTRA];
+	struct quadd_arch_info *arch = NULL;
+
+	if (ctx->pmu)
+		arch = ctx->pmu->get_arch();
 
 	seq_printf(f, "pmu: %s\n", YES_NO(cap->pmu));
@@ -69,7 +74,7 @@ static int show_capabilities(struct seq_file *f, void *offset)
 	seq_printf(f, "l2 cache: %s\n", YES_NO(cap->l2_cache));
 	if (cap->l2_cache) {
-		seq_printf(f, "multiple l2 events: %s\n",
+		seq_printf(f, "multiple l2 events: %s\n",
 			   YES_NO(cap->l2_multiple_events));
 	}
 
@@ -89,6 +94,19 @@ static int show_capabilities(struct seq_file *f, void *offset)
 		   YES_NO(extra & QUADD_COMM_CAP_EXTRA_SPECIAL_ARCH_MMAP));
 	seq_printf(f, "support mixed unwinding mode: %s\n",
 		   YES_NO(extra & QUADD_COMM_CAP_EXTRA_UNWIND_MIXED));
+	seq_printf(f, "information about unwind entry: %s\n",
+		   YES_NO(extra & QUADD_COMM_CAP_EXTRA_UNW_ENTRY_TYPE));
+	seq_printf(f, "use arch timer: %s\n",
+		   YES_NO(extra & QUADD_COMM_CAP_EXTRA_USE_ARCH_TIMER));
+
+	seq_puts(f, "\n");
+
+	if (arch) {
+		seq_printf(f, "pmu arch: %s\n",
+			   arch->name);
+		seq_printf(f, "pmu arch version: %d\n",
+			   arch->ver);
+	}
 
 	seq_puts(f, "\n");
 	seq_puts(f, "Supported events:\n");
diff --git a/drivers/misc/tegra-profiler/version.h b/drivers/misc/tegra-profiler/version.h
index b44426de71c8..1225073e2219 100644
--- a/drivers/misc/tegra-profiler/version.h
+++ b/drivers/misc/tegra-profiler/version.h
@@ -18,7 +18,7 @@
 #ifndef __QUADD_VERSION_H
 #define __QUADD_VERSION_H
 
-#define QUADD_MODULE_VERSION "1.64"
+#define QUADD_MODULE_VERSION "1.75"
 #define QUADD_MODULE_BRANCH "Dev"
 
 #endif /* __QUADD_VERSION_H */
diff --git a/include/linux/tegra_profiler.h b/include/linux/tegra_profiler.h
index f1d47520cfd9..3ba50b60b342 100644
--- a/include/linux/tegra_profiler.h
+++ b/include/linux/tegra_profiler.h
@@ -19,8 +19,8 @@
 
 #include <linux/ioctl.h>
 
-#define QUADD_SAMPLES_VERSION 25
-#define QUADD_IO_VERSION 11
+#define QUADD_SAMPLES_VERSION 29
+#define QUADD_IO_VERSION 12
 
 #define QUADD_IO_VERSION_DYNAMIC_RB 5
 #define QUADD_IO_VERSION_RB_MAX_FILL_COUNT 6
@@ -29,6 +29,7 @@
 #define QUADD_IO_VERSION_GET_MMAP 9
 #define QUADD_IO_VERSION_BT_UNWIND_TABLES 10
 #define QUADD_IO_VERSION_UNWIND_MIXED 11
+#define QUADD_IO_VERSION_EXTABLES_MMAP 12
 
 #define QUADD_SAMPLE_VERSION_THUMB_MODE_FLAG 17
 #define QUADD_SAMPLE_VERSION_GROUP_SAMPLES 18
@@ -37,6 +38,10 @@
 #define QUADD_SAMPLE_VERSION_SUPPORT_IP64 23
 #define QUADD_SAMPLE_VERSION_SPECIAL_MMAP 24
 #define QUADD_SAMPLE_VERSION_UNWIND_MIXED 25
+#define QUADD_SAMPLE_VERSION_UNW_ENTRY_TYPE 26
+#define QUADD_SAMPLE_VERSION_USE_ARCH_TIMER 27
+#define QUADD_SAMPLE_VERSION_SCHED_SAMPLES 28
+#define QUADD_SAMPLE_VERSION_HDR_UNW_METHOD 29
 
 #define QUADD_MAX_COUNTERS 32
 #define QUADD_MAX_PROCESS 64
@@ -123,6 +128,7 @@ enum quadd_record_type {
 	QUADD_RECORD_TYPE_HEADER,
 	QUADD_RECORD_TYPE_POWER_RATE,
 	QUADD_RECORD_TYPE_ADDITIONAL_SAMPLE,
+	QUADD_RECORD_TYPE_SCHED,
 };
 
 enum quadd_event_source {
@@ -145,6 +151,7 @@ enum {
 	QUADD_UNW_METHOD_FP = 0,
 	QUADD_UNW_METHOD_EHT,
 	QUADD_UNW_METHOD_MIXED,
+	QUADD_UNW_METHOD_NONE,
 };
 
 #define QUADD_SAMPLE_URC_SHIFT 1
@@ -164,6 +171,9 @@ enum {
 	QUADD_URC_SPARE_ENCODING,
 	QUADD_URC_UNSUPPORTED_PR,
 	QUADD_URC_PC_INCORRECT,
+	QUADD_URC_LEVEL_TOO_DEEP,
+	QUADD_URC_FP_INCORRECT,
+	QUADD_URC_MAX,
 };
 
 #define QUADD_SED_IP64 (1 << 0)
@@ -171,6 +181,14 @@ enum {
 #define QUADD_SED_UNW_METHOD_SHIFT 1
 #define QUADD_SED_UNW_METHOD_MASK (0x07 << QUADD_SED_UNW_METHOD_SHIFT)
 
+enum {
+	QUADD_UNW_TYPE_FP = 0,
+	QUADD_UNW_TYPE_UT,
+	QUADD_UNW_TYPE_LR_FP,
+	QUADD_UNW_TYPE_LR_UT,
+	QUADD_UNW_TYPE_KCTX,
+};
+
 struct quadd_sample_data {
 	u64 ip;
 	u32 pid;
@@ -224,6 +242,18 @@ struct quadd_additional_sample {
 	u16 extra_length;
 };
 
+struct quadd_sched_data {
+	u32 pid;
+	u64 time;
+
+	u32 cpu:6,
+	    lp_mode:1,
+	    sched_in:1,
+	    reserved:24;
+
+	u32 data[2];
+};
+
 enum {
 	QM_DEBUG_SAMPLE_TYPE_SCHED_IN = 1,
 	QM_DEBUG_SAMPLE_TYPE_SCHED_OUT,
@@ -257,6 +287,9 @@ struct quadd_debug_data {
 
 #define QUADD_HEADER_MAGIC 0x1122
 
+#define QUADD_HDR_UNW_METHOD_SHIFT 0
+#define QUADD_HDR_UNW_METHOD_MASK (0x07 << QUADD_HDR_UNW_METHOD_SHIFT)
+
 struct quadd_header_data {
 	u16 magic;
 	u16 version;
@@ -288,6 +321,7 @@ struct quadd_record_data {
 		struct quadd_debug_data debug;
 		struct quadd_header_data hdr;
 		struct quadd_power_rate_data power_rate;
+		struct quadd_sched_data sched;
 		struct quadd_additional_sample additional_sample;
 	};
 } __aligned(4);
@@ -354,6 +388,8 @@ enum {
 #define QUADD_COMM_CAP_EXTRA_SUPPORT_AARCH64 (1 << 4)
 #define QUADD_COMM_CAP_EXTRA_SPECIAL_ARCH_MMAP (1 << 5)
 #define QUADD_COMM_CAP_EXTRA_UNWIND_MIXED (1 << 6)
+#define QUADD_COMM_CAP_EXTRA_UNW_ENTRY_TYPE (1 << 7)
+#define QUADD_COMM_CAP_EXTRA_USE_ARCH_TIMER (1 << 8)
 
 struct quadd_comm_cap {
 	u32 pmu:1,
@@ -401,6 +437,12 @@ struct quadd_sec_info {
 	u64 length;
 };
 
+enum {
+	QUADD_EXT_IDX_EXTAB_OFFSET = 0,
+	QUADD_EXT_IDX_EXIDX_OFFSET = 1,
+	QUADD_EXT_IDX_MMAP_VM_START = 2,
+};
+
 struct quadd_extables {
 	u64 vm_start;
 	u64 vm_end;