Finding CPU cycles with PMU registers
Anubhav Sharma
anubhav at cse.iitb.ac.in
Thu Apr 14 11:47:10 EDT 2016
Hello,
I'm trying to learn how the Performance Monitoring Unit (PMU) works in
Intel Core x86 systems. For example, I want to create a kernel module
that counts the number of CPU cycles of a process, the PID of which is
provided as a parameter. The same thing can be done by using perf:
perf stat -e cycles ./testProgram
The module init function has the following program flow:
1. I register an NMI handler function 'pmc_handler' to detect when a
counter overflows.
apic_write(APIC_LVTPC, APIC_DM_NMI);
register_nmi_handler(NMI_LOCAL, pmc_handler, 0, "perf_handler");
2. For every cpu, I write 0x0 to various MSRs.
for_each_online_cpu(cpu)
{
wrmsr64_safe_on_cpu(cpu, IA32_PERF_GLOBAL_CTRL, 0x0);
wrmsr64_safe_on_cpu(cpu, IA32_PERF_GLOBAL_OVF_CTRL, 0x0);
wrmsr64_safe_on_cpu(cpu, IA32_PERF_FIXED_CTR_CTRL, 0x0);
wrmsr64_safe_on_cpu(cpu, IA32_PMC0, 0x0);
wrmsr64_safe_on_cpu(cpu, IA32_PERFEVTSEL0, 0x0);
}
3. For every online cpu:
a. Set bits 0 and 62 of IA32_PERF_GLOBAL_OVF_CTRL to set the
overflow bits.
wrmsr64_safe_on_cpu(cpu, IA32_PERF_GLOBAL_OVF_CTRL, (0x1 << 0)
| ((u64) 0x1 << 62));
b. Program IA32_PERFEVTSEL0 so that it counts unhalted CPU cycles
in both user and OS mode, with the counter and its overflow interrupt
enabled.
wrmsr64_safe_on_cpu(cpu, IA32_PERFEVTSEL0, (u64) INST_UNHALTED
| INT_ENABLE | COUNTER_ENABLE | USR_MODE | OS_MODE);
c. Write (u64)-999 to the IA32_PMC0 counter.
wrmsr64_safe_on_cpu(cpu, IA32_PMC0, counterVal);
I'm not sure how step 3c works. Presumably it is done so that the
counter overflows more often. I could really use some explanation of
this part.
Since the module is supposed to count CPU cycles of a particular
process, we need to know when the process is running and when it is
waiting.
To get that information, I have added a hook in the __schedule()
function in kernel/sched/core.c. To know when a process terminates, I've
added a hook in the do_exit() function in kernel/exit.c.
Now whenever our process is scheduled onto a CPU, I enable the PMU
counters by writing 0x1 to IA32_PERF_GLOBAL_CTRL on every cpu. Similarly,
whenever our process is scheduled out, I disable the PMU counters by
writing 0x0 to IA32_PERF_GLOBAL_CTRL. This setup should let me count the
CPU cycles of only the process we are interested in.
At this point we have a system where the counters are enabled whenever
our process is running on a CPU. Also, whenever the counter overflows,
the pmc_handler function is called (because that's what we set up in
Step 1).
In the pmc_handler function, I do the following:
1. Disable the counters temporarily before reading from them.
wrmsr64_safe_on_cpu(cpu, IA32_PERF_GLOBAL_CTRL, 0x0);
2. Increase total count by the value of the counter.
totalCount += read_msrs_on_cpu(cpu, IA32_PMC0);
3. Enable counters again.
wrmsr64_safe_on_cpu(cpu, IA32_PERF_GLOBAL_CTRL, 0x1);
When the process terminates, i.e. when the exit hook is called for our
process, I output the value of totalCount.
This setup seems correct to me, but the value of totalCount and the
value that perf gives me are vastly different (56,000 [my module] vs.
700,000 [perf]).
(For perf I'm reading r003c, which is the same as INST_UNHALTED (0x003c)
as per the Intel Core performance manual.)
Can anyone please help me with this? I've been stuck at this for quite
some time. I suspect there could be a conceptual flaw in the whole setup.
Thanks and regards,
Anubhav Sharma
(Kernel Novice)