In the perf manual, I found two metrics for obtaining LLC misses: PERF_COUNT_HW_CACHE_MISSES: Cache misses. Usually this indicates Last Level Cache misses;
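PERF_COUNT_HW_INSTRUCTIONS PERF_COUNT_HW_CACHE_REFERENCES PERF_COUNT_HW_CACHE_MISSES The defining feature of these events is that they must be monitored by the hardware PMU. Take PERF_COUNT_HW_INSTRUCTIONS: to know how many instructions were executed during a run, you have to rely on the hardware. 3.2.2 software Software-related events. Typical events include: PERF_COUNT_SW_PAGE_FAULTS PER...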
PERF_COUNT_HW_CACHE_REFERENCES = 2, PERF_COUNT_HW_CACHE_MISSES = 3, PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4, PERF_COUNT_HW_BRANCH_MISSES = 5, PERF_COUNT_HW_BUS_CYCLES = 6, }; When we reach the actual function x86_pmu_hw_config, we find, surprisingly, that here there is also event->hw.config = ARCH...
PERF_COUNT_HW_CACHE_REFERENCES = 2, PERF_COUNT_HW_CACHE_MISSES = 3, PERF_COUNT_HW_BRANCH_INSTRUCTIONS = 4, PERF_COUNT_HW_BRANCH_MISSES = 5, PERF_COUNT_HW_BUS_CYCLES = 6, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND = 7, PERF_COUNT_HW_STALLED_CYCLES_BACKEND = 8, PERF_COUNT_HW_REF_CPU_CYCLES = 9...
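To relate these enum values to the event names accepted by `perf stat -e`, a small lookup can help. This is only a sketch, assuming a Linux system with the kernel headers installed; the function name `hw_event_name` is hypothetical and not part of any perf API.

```c
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a perf API): map a generic hardware event id
 * to the name the perf tool uses for it. Returns NULL for unknown ids. */
const char *hw_event_name(uint64_t config)
{
    switch (config) {
    case PERF_COUNT_HW_CPU_CYCLES:              return "cpu-cycles";
    case PERF_COUNT_HW_INSTRUCTIONS:            return "instructions";
    case PERF_COUNT_HW_CACHE_REFERENCES:        return "cache-references";
    case PERF_COUNT_HW_CACHE_MISSES:            return "cache-misses";
    case PERF_COUNT_HW_BRANCH_INSTRUCTIONS:     return "branch-instructions";
    case PERF_COUNT_HW_BRANCH_MISSES:           return "branch-misses";
    case PERF_COUNT_HW_BUS_CYCLES:              return "bus-cycles";
    case PERF_COUNT_HW_STALLED_CYCLES_FRONTEND: return "stalled-cycles-frontend";
    case PERF_COUNT_HW_STALLED_CYCLES_BACKEND:  return "stalled-cycles-backend";
    case PERF_COUNT_HW_REF_CPU_CYCLES:          return "ref-cycles";
    default:                                    return NULL;
    }
}
```

For example, `hw_event_name(3)` yields the name of the event whose id is 3 in the enum above, PERF_COUNT_HW_CACHE_MISSES.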
You can use the perf command together with specific hardware events to collect cache-miss and cycle information for the CPU. For example, to collect cache misses for...
PERF_COUNT_HW_CACHE_MISSES PERF_COUNT_HW_BRANCH_INSTRUCTIONS PERF_COUNT_HW_BRANCH_MISSES PERF_COUNT_HW_BUS_CYCLES PERF_COUNT_HW_STALLED_CYCLES_FRONTEND (since Linux 3.0) PERF_COUNT_HW_STALLED_CYCLES_BACKEND (since Linux 3.0) PERF_COUNT_HW_REF_CPU_CYCLES (since Linux 3.3) ...
perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_CACHE_MISSES, ...}, 0, -1, -1, PERF_FLAG_FD_CLOEXEC) = 6
perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_BRANCH_INSTRUCT...
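The trace above shows one perf_event_open call per hardware event. A minimal sketch of making the same call from C follows, assuming Linux with the kernel headers installed. The helper names `make_hw_attr` and `open_hw_counter` are hypothetical, and the exclude_* settings are an illustrative choice, not something taken from the trace. glibc provides no wrapper for perf_event_open, so the call goes through syscall(2).

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: describe one generic hardware event,
 * e.g. PERF_COUNT_HW_CACHE_MISSES. */
struct perf_event_attr make_hw_attr(uint64_t config)
{
    struct perf_event_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);   /* lets the kernel detect the ABI version */
    attr.config = config;
    attr.disabled = 1;          /* start stopped; enable explicitly later */
    attr.exclude_kernel = 1;    /* assumption: count user space only */
    attr.exclude_hv = 1;
    return attr;
}

/* Hypothetical helper: open the counter for the calling thread (pid 0)
 * on any CPU (cpu -1), with no event group (group_fd -1), matching the
 * arguments in the trace. Returns a file descriptor, or -1 with errno set. */
int open_hw_counter(uint64_t config)
{
    struct perf_event_attr attr = make_hw_attr(config);

    return (int)syscall(SYS_perf_event_open, &attr, (long)0, (long)-1,
                        (long)-1, (unsigned long)PERF_FLAG_FD_CLOEXEC);
}
```

With a valid descriptor, counting is typically started with `ioctl(fd, PERF_EVENT_IOC_ENABLE, 0)` and the 64-bit count is read with `read(2)`. The open can fail without sufficient privileges, depending on the system's perf_event_paranoid setting.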
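hw/hardware: lists the supported hardware events, e.g.: perf list hardware
sw/software: lists the supported software events: perf list sw
cache/hwcache: lists the hardware cache events: perf list cache
pmu: lists the supported PMU events: perf list pmu
tracepoint: lists all supported tracepoints; this list is quite large: perf list tracepoint ...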
Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. PERF_COUNT_HW_CACHE_MISSES Cache misses. Usually this indicates Last Level Cache misses; ...
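Given counts for both events, a miss ratio can be derived from them. The sketch below assumes the two values were read over the same interval; the function name `cache_miss_ratio` is hypothetical. Because what each event counts depends on the CPU, as the descriptions above note, the result is best treated as indicative rather than an exact LLC miss rate.

```c
#include <stdint.h>

/* Hypothetical helper: fraction of counted cache references that missed.
 * Returns 0.0 when no references were counted, to avoid dividing by zero. */
double cache_miss_ratio(uint64_t references, uint64_t misses)
{
    if (references == 0)
        return 0.0;
    return (double)misses / (double)references;
}
```

For instance, 50 misses out of 200 references gives a ratio of 0.25.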
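Cache-misses: the number of cache misses. perf-top: for a given performance event (CPU cycles by default), shows the functions or instructions that consume the most. perf top [-e <EVENT> | --event=EVENT] [<options>] It is used to analyze in real time how hot each function is for a given performance event, so hot functions can be located quickly, including application functions, module functions, and kernel functions, and even individual hot instructions.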