Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar: "As a first remark I'd like to note that the way to build perf tooling has been simplified and sped up, in the future it should be enough for you to build perf via: cd tools/perf/ make install (ie without the -j option.) The build system will figure out the number of CPUs and will do a parallel build+install. The various build system inefficiencies and breakages Linus reported against the v3.12 pull request should now be resolved - please (re-)report any remaining annoyances or bugs. Main changes on the perf kernel side: * Performance optimizations: . perf ring-buffer code optimizations, by Peter Zijlstra . perf ring-buffer code optimizations, by Oleg Nesterov . x86 NMI call-stack processing optimizations, by Peter Zijlstra . perf context-switch optimizations, by Peter Zijlstra . perf sampling speedups, by Peter Zijlstra . x86 Intel PEBS processing speedups, by Peter Zijlstra * Enhanced hardware support: . for Intel Ivy Bridge-EP uncore PMUs, by Zheng Yan . for Haswell transactions, by Andi Kleen, Peter Zijlstra * Core perf events code enhancements and fixes by Oleg Nesterov: . for uprobes, if fork() is called with pending ret-probes . for uprobes platform support code * New ABI details by Andi Kleen: . Report x86 Haswell TSX transaction abort cost as weight Main changes on the perf tooling side (some of these tooling changes utilize the above kernel side changes): * 'perf report/top' enhancements: . Convert callchain children list to rbtree, greatly reducing the time taken for callchain processing, from Namhyung Kim. . Add new COMM infrastructure, further improving histogram processing, from Frédéric Weisbecker, one fix from Namhyung Kim. . Add /proc/kcore based live-annotation improvements, including build-id cache support, multi map 'call' instruction navigation fixes, kcore address validation, objdump workarounds. From Adrian Hunter. . Show progress on histogram collapsing, that can take a long time, from Namhyung Kim. . Add --max-stack option to limit callchain stack scan in 'top' and 'report', improving callchain processing when reducing the stack depth is an option, from Waiman Long. . Add new option --ignore-vmlinux for perf top, from Willy Tarreau. * 'perf trace' enhancements: . 'perf trace' now can can use a 'perf probe' dynamic tracepoints to hook into the userspace -> kernel pathname copy so that it can map fds to pathnames without reading /proc/pid/fd/ symlinks. From Arnaldo Carvalho de Melo. . Show VFS path associated with fd in live sessions, using a 'vfs_getname' 'perf probe' created dynamic tracepoint or by looking at /proc/pid/fd, from Arnaldo Carvalho de Melo. . Add 'trace' beautifiers for lots of syscall arguments, from Arnaldo Carvalho de Melo. . Implement more compact 'trace' output by suppressing zeroed args, from Arnaldo Carvalho de Melo. . Show thread COMM by default in 'trace', from Arnaldo Carvalho de Melo. . Add option to show full timestamp in 'trace', from David Ahern. . Add 'record' command in 'trace', to record raw_syscalls:*, from David Ahern. . Add summary option to dump syscall statistics in 'trace', from David Ahern. . Improve error messages in 'trace', providing hints about system configuration steps needed for using it, from Ramkumar Ramachandra. . 'perf trace' now emits hints as to why tracing is not possible, helping the user to setup the system to allow tracing in the desired permission granularity, telling if the problem is due to debugfs not being mounted or with not enough permission for !root, /proc/sys/kernel/perf_event_paranoit value, etc. From Arnaldo Carvalho de Melo. * 'perf record' enhancements: . Check maximum frequency rate for record/top, emitting better error messages, from Jiri Olsa. . 'perf record' code cleanups, from David Ahern. . Improve write_output error message in 'perf record', from Adrian Hunter. . Allow specifying B/K/M/G unit to the --mmap-pages arguments, from Jiri Olsa. . Fix command line callchain attribute tests to handle the new -g/--call-chain semantics, from Arnaldo Carvalho de Melo. * 'perf kvm' enhancements: . Disable live kvm command if timerfd is not supported, from David Ahern. . Fix detection of non-core features, from David Ahern. * 'perf list' enhancements: . Add usage to 'perf list', from David Ahern. . Show error in 'perf list' if tracepoints not available, from Pekka Enberg. * 'perf probe' enhancements: . Support "$vars" meta argument syntax for local variables, allowing asking for all possible variables at a given probe point to be collected when it hits, from Masami Hiramatsu. * 'perf sched' enhancements: . Address the root cause of that 'perf sched' stack initialization build slowdown, by programmatically setting a big array after moving the global variable back to the stack. Fix from Adrian Hunter. * 'perf script' enhancements: . Set up output options for in-stream attributes, from Adrian Hunter. . Print addr by default for BTS in 'perf script', from Adrian Juntmer * 'perf stat' enhancements: . Improved messages when doing profiling in all or a subset of CPUs using a workload as the session delimitator, as in: 'perf stat --cpu 0,2 sleep 10s' from Arnaldo Carvalho de Melo. . Add units to nanosec-based counters in 'perf stat', from David Ahern. . Remove bogus info when using 'perf stat' -e cycles/instructions, from Ramkumar Ramachandra. * 'perf lock' enhancements: . 'perf lock' fixes and cleanups, from Davidlohr Bueso. * 'perf test' enhancements: . Fixup PERF_SAMPLE_TRANSACTION handling in sample synthesizing and 'perf test', from Adrian Hunter. . Clarify the "sample parsing" test entry, from Arnaldo Carvalho de Melo. . Consider PERF_SAMPLE_TRANSACTION in the "sample parsing" test, from Arnaldo Carvalho de Melo. . Memory leak fixes in 'perf test', from Felipe Pena. * 'perf bench' enhancements: . Change the procps visible command-name of invididual benchmark tests plus cleanups, from Ingo Molnar. * Generic perf tooling infrastructure/plumbing changes: . Separating data file properties from session, code reorganization from Jiri Olsa. . Fix version when building out of tree, as when using one of these: $ make help | grep perf perf-tar-src-pkg - Build perf-3.12.0.tar source tarball perf-targz-src-pkg - Build perf-3.12.0.tar.gz source tarball perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball perf-tarxz-src-pkg - Build perf-3.12.0.tar.xz source tarball $ from David Ahern. . Enhance option parse error message, showing just the help lines of the options affected, from Namhyung Kim. . libtraceevent updates from upstream trace-cmd repo, from Steven Rostedt. . Always use perf_evsel__set_sample_bit to set sample_type, from Adrian Hunter. . Memory and mmap leak fixes from Chenggang Qin. . Assorted build fixes for from David Ahern and Jiri Olsa. . Speed up and prettify the build system, from Ingo Molnar. . Implement addr2line directly using libbfd, from Roberto Vitillo. . Separate the GTK support in a separate libperf-gtk.so DSO, that is only loaded when --gtk is specified, from Namhyung Kim. . perf bash completion fixes and improvements from Ramkumar Ramachandra. . Support for Openembedded/Yocto -dbg packages, from Ricardo Ribalda Delgado. And lots and lots of other fixes and code reorganizations that did not make it into the list, see the shortlog, diffstat and the Git log for details!" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (300 commits) uprobes: Fix the memory out of bound overwrite in copy_insn() uprobes: Fix the wrong usage of current->utask in uprobe_copy_process() perf tools: Remove unneeded include perf record: Remove post_processing_offset variable perf record: Remove advance_output function perf record: Refactor feature handling into a separate function perf trace: Don't relookup fields by name in each sample perf tools: Fix version when building out of tree perf evsel: Ditch evsel->handler.data field uprobes: Export write_opcode() as uprobe_write_opcode() uprobes: Introduce arch_uprobe->ixol uprobes: Kill module_init() and module_exit() uprobes: Move function declarations out of arch perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes perf: Factor out strncpy() in perf_event_mmap_event() tools/perf: Add required memory barriers perf: Fix arch_perf_out_copy_user default perf: Update a stale comment perf: Optimize perf_output_begin() -- address calculation ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2013-11-12 10:06:34 +0900
committer: Linus Torvalds <torvalds@linux-foundation.org> 2013-11-12 10:06:34 +0900
commit: ad5d69899e52792671c1aa6c7360464c7edfe09c (patch)
tree: 21833c1fdab4b3cf791d4fdc86dd578e4a620514 /tools/perf/builtin-bench.c
parent: ef1417a5a6a400dbc1a2f44da716ab146a29ddc4 (diff)
parent: caea6cf52139116e43e615d87fcbf9823e197fdf (diff)
1 files changed, 125 insertions, 120 deletions
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 77298bf892b8..e47f90cc7b98 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -1,21 +1,18 @@
 /*
- *
  * builtin-bench.c
  *
- * General benchmarking subsystem provided by perf
+ * General benchmarking collections provided by perf
  *
  * Copyright (C) 2009, Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
- *
  */
 
 /*
+ * Available benchmark collection list:
  *
- * Available subsystem list:
- *  sched ... scheduler and IPC mechanism
+ *  sched ... scheduler and IPC performance
  *  mem   ... memory access performance
- *
+ *  numa  ... NUMA scheduling and MM performance
  */
-
 #include "perf.h"
 #include "util/util.h"
 #include "util/parse-options.h"
@@ -25,112 +22,92 @@
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <sys/prctl.h>
 
-struct bench_suite {
-	const char *name;
-	const char *summary;
-	int (*fn)(int, const char **, const char *);
+typedef int (*bench_fn_t)(int argc, const char **argv, const char *prefix);
+
+struct bench {
+	const char	*name;
+	const char	*summary;
+	bench_fn_t	fn;
 };
-						\
-/* sentinel: easy for help */
-#define suite_all { "all", "Test all benchmark suites", NULL }
-
-#ifdef LIBNUMA_SUPPORT
-static struct bench_suite numa_suites[] = {
-	{ "mem",
-	  "Benchmark for NUMA workloads",
-	  bench_numa },
-	suite_all,
-	{ NULL,
-	  NULL,
-	  NULL                  }
+
+#ifdef HAVE_LIBNUMA_SUPPORT
+static struct bench numa_benchmarks[] = {
+	{ "mem",	"Benchmark for NUMA workloads",			bench_numa		},
+	{ "all",	"Test all NUMA benchmarks",			NULL			},
+	{ NULL,		NULL,						NULL			}
 };
 #endif
 
-static struct bench_suite sched_suites[] = {
-	{ "messaging",
-	  "Benchmark for scheduler and IPC mechanisms",
-	  bench_sched_messaging },
-	{ "pipe",
-	  "Flood of communication over pipe() between two processes",
-	  bench_sched_pipe      },
-	suite_all,
-	{ NULL,
-	  NULL,
-	  NULL                  }
+static struct bench sched_benchmarks[] = {
+	{ "messaging",	"Benchmark for scheduling and IPC",		bench_sched_messaging	},
+	{ "pipe",	"Benchmark for pipe() between two processes",	bench_sched_pipe	},
+	{ "all",	"Test all scheduler benchmarks",		NULL			},
+	{ NULL,		NULL,						NULL			}
 };
 
-static struct bench_suite mem_suites[] = {
-	{ "memcpy",
-	  "Simple memory copy in various ways",
-	  bench_mem_memcpy },
-	{ "memset",
-	  "Simple memory set in various ways",
-	  bench_mem_memset },
-	suite_all,
-	{ NULL,
-	  NULL,
-	  NULL             }
+static struct bench mem_benchmarks[] = {
+	{ "memcpy",	"Benchmark for memcpy()",			bench_mem_memcpy	},
+	{ "memset",	"Benchmark for memset() tests",			bench_mem_memset	},
+	{ "all",	"Test all memory benchmarks",			NULL			},
+	{ NULL,		NULL,						NULL			}
 };
 
-struct bench_subsys {
-	const char *name;
-	const char *summary;
-	struct bench_suite *suites;
+struct collection {
+	const char	*name;
+	const char	*summary;
+	struct bench	*benchmarks;
 };
 
-static struct bench_subsys subsystems[] = {
-#ifdef LIBNUMA_SUPPORT
-	{ "numa",
-	  "NUMA scheduling and MM behavior",
-	  numa_suites },
+static struct collection collections[] = {
+	{ "sched",	"Scheduler and IPC benchmarks",		sched_benchmarks	},
+	{ "mem",	"Memory access benchmarks",			mem_benchmarks		},
+#ifdef HAVE_LIBNUMA_SUPPORT
+	{ "numa",	"NUMA scheduling and MM benchmarks",		numa_benchmarks		},
 #endif
-	{ "sched",
-	  "scheduler and IPC mechanism",
-	  sched_suites },
-	{ "mem",
-	  "memory access performance",
-	  mem_suites },
-	{ "all",		/* sentinel: easy for help */
-	  "all benchmark subsystem",
-	  NULL },
-	{ NULL,
-	  NULL,
-	  NULL       }
+	{ "all",	"All benchmarks",				NULL			},
+	{ NULL,		NULL,						NULL			}
 };
 
-static void dump_suites(int subsys_index)
+/* Iterate over all benchmark collections: */
+#define for_each_collection(coll) \
+	for (coll = collections; coll->name; coll++)
+
+/* Iterate over all benchmarks within a collection: */
+#define for_each_bench(coll, bench) \
+	for (bench = coll->benchmarks; bench->name; bench++)
+
+static void dump_benchmarks(struct collection *coll)
 {
-	int i;
+	struct bench *bench;
 
-	printf("# List of available suites for %s...\n\n",
-	       subsystems[subsys_index].name);
+	printf("\n        # List of available benchmarks for collection '%s':\n\n", coll->name);
 
-	for (i = 0; subsystems[subsys_index].suites[i].name; i++)
-		printf("%14s: %s\n",
-		       subsystems[subsys_index].suites[i].name,
-		       subsystems[subsys_index].suites[i].summary);
+	for_each_bench(coll, bench)
+		printf("%14s: %s\n", bench->name, bench->summary);
 
 	printf("\n");
-	return;
 }
 
 static const char *bench_format_str;
+
+/* Output/formatting style, exported to benchmark modules: */
 int bench_format = BENCH_FORMAT_DEFAULT;
 
 static const struct option bench_options[] = {
-	OPT_STRING('f', "format", &bench_format_str, "default",
-		    "Specify format style"),
+	OPT_STRING('f', "format", &bench_format_str, "default", "Specify format style"),
 	OPT_END()
 };
 
 static const char * const bench_usage[] = {
-	"perf bench [<common options>] <subsystem> <suite> [<options>]",
+	"perf bench [<common options>] <collection> <benchmark> [<options>]",
 	NULL
 };
 
 static void print_usage(void)
 {
+	struct collection *coll;
 	int i;
 
 	printf("Usage: \n");
@@ -138,11 +115,10 @@ static void print_usage(void)
 		printf("\t%s\n", bench_usage[i]);
 	printf("\n");
 
-	printf("# List of available subsystems...\n\n");
+	printf("        # List of all available benchmark collections:\n\n");
 
-	for (i = 0; subsystems[i].name; i++)
-		printf("%14s: %s\n",
-		       subsystems[i].name, subsystems[i].summary);
+	for_each_collection(coll)
+		printf("%14s: %s\n", coll->name, coll->summary);
 	printf("\n");
 }
 
@@ -159,44 +135,74 @@ static int bench_str2int(const char *str)
 	return BENCH_FORMAT_UNKNOWN;
 }
 
-static void all_suite(struct bench_subsys *subsys)	  /* FROM HERE */
+/*
+ * Run a specific benchmark but first rename the running task's ->comm[]
+ * to something meaningful:
+ */
+static int run_bench(const char *coll_name, const char *bench_name, bench_fn_t fn,
+		     int argc, const char **argv, const char *prefix)
 {
-	int i;
+	int size;
+	char *name;
+	int ret;
+
+	size = strlen(coll_name) + 1 + strlen(bench_name) + 1;
+
+	name = zalloc(size);
+	BUG_ON(!name);
+
+	scnprintf(name, size, "%s-%s", coll_name, bench_name);
+
+	prctl(PR_SET_NAME, name);
+	argv[0] = name;
+
+	ret = fn(argc, argv, prefix);
+
+	free(name);
+
+	return ret;
+}
+
+static void run_collection(struct collection *coll)
+{
+	struct bench *bench;
 	const char *argv[2];
-	struct bench_suite *suites = subsys->suites;
 
 	argv[1] = NULL;
 	/*
 	 * TODO:
-	 * preparing preset parameters for
+	 *
+	 * Preparing preset parameters for
 	 * embedded, ordinary PC, HPC, etc...
-	 * will be helpful
+	 * would be helpful.
 	 */
-	for (i = 0; suites[i].fn; i++) {
-		printf("# Running %s/%s benchmark...\n",
-		       subsys->name,
-		       suites[i].name);
+	for_each_bench(coll, bench) {
+		if (!bench->fn)
+			break;
+		printf("# Running %s/%s benchmark...\n", coll->name, bench->name);
 		fflush(stdout);
 
-		argv[1] = suites[i].name;
-		suites[i].fn(1, argv, NULL);
+		argv[1] = bench->name;
+		run_bench(coll->name, bench->name, bench->fn, 1, argv, NULL);
 		printf("\n");
 	}
 }
 
-static void all_subsystem(void)
+static void run_all_collections(void)
 {
-	int i;
-	for (i = 0; subsystems[i].suites; i++)
-		all_suite(&subsystems[i]);
+	struct collection *coll;
+
+	for_each_collection(coll)
+		run_collection(coll);
 }
 
 int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
 {
-	int i, j, status = 0;
+	struct collection *coll;
+	int ret = 0;
 
 	if (argc < 2) {
-		/* No subsystem specified. */
+		/* No collection specified. */
 		print_usage();
 		goto end;
 	}
@@ -206,7 +212,7 @@ int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
 
 	bench_format = bench_str2int(bench_format_str);
 	if (bench_format == BENCH_FORMAT_UNKNOWN) {
-		printf("Unknown format descriptor:%s\n", bench_format_str);
+		printf("Unknown format descriptor: '%s'\n", bench_format_str);
 		goto end;
 	}
 
@@ -216,52 +222,51 @@ int cmd_bench(int argc, const char **argv, const char *prefix __maybe_unused)
 	}
 
 	if (!strcmp(argv[0], "all")) {
-		all_subsystem();
+		run_all_collections();
 		goto end;
 	}
 
-	for (i = 0; subsystems[i].name; i++) {
-		if (strcmp(subsystems[i].name, argv[0]))
+	for_each_collection(coll) {
+		struct bench *bench;
+
+		if (strcmp(coll->name, argv[0]))
 			continue;
 
 		if (argc < 2) {
-			/* No suite specified. */
-			dump_suites(i);
+			/* No bench specified. */
+			dump_benchmarks(coll);
 			goto end;
 		}
 
 		if (!strcmp(argv[1], "all")) {
-			all_suite(&subsystems[i]);
+			run_collection(coll);
 			goto end;
 		}
 
-		for (j = 0; subsystems[i].suites[j].name; j++) {
-			if (strcmp(subsystems[i].suites[j].name, argv[1]))
+		for_each_bench(coll, bench) {
+			if (strcmp(bench->name, argv[1]))
 				continue;
 
 			if (bench_format == BENCH_FORMAT_DEFAULT)
-				printf("# Running %s/%s benchmark...\n",
-				       subsystems[i].name,
-				       subsystems[i].suites[j].name);
+				printf("# Running '%s/%s' benchmark:\n", coll->name, bench->name);
 			fflush(stdout);
-			status = subsystems[i].suites[j].fn(argc - 1,
-							    argv + 1, prefix);
+			ret = run_bench(coll->name, bench->name, bench->fn, argc-1, argv+1, prefix);
 			goto end;
 		}
 
 		if (!strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
-			dump_suites(i);
+			dump_benchmarks(coll);
 			goto end;
 		}
 
-		printf("Unknown suite:%s for %s\n", argv[1], argv[0]);
-		status = 1;
+		printf("Unknown benchmark: '%s' for collection '%s'\n", argv[1], argv[0]);
+		ret = 1;
 		goto end;
 	}
 
-	printf("Unknown subsystem:%s\n", argv[0]);
-	status = 1;
+	printf("Unknown collection: '%s'\n", argv[0]);
+	ret = 1;
 
 end:
-	return status;
+	return ret;
 }
author	Linus Torvalds <torvalds@linux-foundation.org>	2013-11-12 10:06:34 +0900
committer	Linus Torvalds <torvalds@linux-foundation.org>	2013-11-12 10:06:34 +0900
commit	ad5d69899e52792671c1aa6c7360464c7edfe09c (patch)
tree	21833c1fdab4b3cf791d4fdc86dd578e4a620514 /tools/perf/builtin-bench.c
parent	ef1417a5a6a400dbc1a2f44da716ab146a29ddc4 (diff)
parent	caea6cf52139116e43e615d87fcbf9823e197fdf (diff)