Current and correct CPU clock and asm("cpuid")
Peter Senna Tschudin
peter.senna at gmail.com
Mon Oct 3 14:57:27 EDT 2011
Hi Peter,
Thanks for the reply. I've realized that I don't need to convert the
raw tick counts into something like seconds, because I'm only
interested in comparing them with each other.
Is it safe to say that if I skip the division by CPU_THOUSAND_HZ, I
get the number of clock cycles that were "spent" between the calls to
getticks() (including some for getticks() itself)?
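
To make the question concrete, here is a minimal sketch of what I mean
by comparing raw tick deltas. It assumes x86 and GCC-style inline asm;
min_ticks() and busy_loop() are names I made up for illustration. One
caveat I'm aware of: on CPUs with a constant-rate TSC, the delta counts
TSC ticks at a fixed reference rate, which is not necessarily the same
as core clock cycles when the core frequency changes.

```c
typedef unsigned long long ticks;

/* Read the 64-bit time stamp counter (same idea as getticks() in cycle.h). */
static inline ticks getticks(void)
{
    unsigned a, d;
    __asm__ volatile("rdtsc" : "=a"(a), "=d"(d));
    return ((ticks)a) | (((ticks)d) << 32);
}

/* Hypothetical workload, only here so there is something to time. */
static void busy_loop(void)
{
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < 1000; i++)
        sink += i;
}

/*
 * Hypothetical helper: call fn() `runs` times and return the smallest
 * raw tick delta observed. No division by a clock rate is done, so the
 * result is only meaningful when compared with other deltas taken on
 * the same machine. The delta includes the cost of getticks() itself.
 */
static ticks min_ticks(void (*fn)(void), int runs)
{
    ticks best = (ticks)-1;
    for (int i = 0; i < runs; i++) {
        ticks begin = getticks();
        fn();
        ticks end = getticks();
        if (end - begin < best)
            best = end - begin;
    }
    return best;
}
```

Taking the minimum over several runs is just one way to reduce noise
from interrupts and cache misses; calling min_ticks(busy_loop, 100) for
two different workloads gives two numbers that can be compared directly.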
Please see below.
Thank you!
Peter
On Mon, Oct 3, 2011 at 1:17 PM, Peter Teoh <htmldeveloper at gmail.com> wrote:
> Why not put a sleep(1) here, like this:
>
>> ticks tickBegin, tickEnd;
>> tickBegin = getticks();
>>
>
> sleep(1);
>
>>
>> tickEnd = getticks();
>> double time = (tickEnd-tickBegin)/CPU_THOUSAND_HZ;
>>
> Then you know that it is reading the TSC values for 1 sec. By
> running the same program on different systems you will get different
> "time" values, and you then divide by that value for THAT system, so
> that running the same program on any system eventually gives you the
> same difference of ticks, which in our present case is "1".
> After this "normalization", you can measure any timing difference on
> your system, and the maximum achievable resolution is of course
> 1 sec. Is that what you wanted?
That sounds like a great idea, but:
- Could dynamic clock rates and multiple CPU cores interfere with your
proposal?
- How precisely does sleep() sleep for 1 second?
- I hope that your proposal defeats the CPU's out-of-order execution,
so that the instructions run in the order we expect
(tickBegin -> sleep -> tickEnd). How can we be sure that the
instructions ran in the correct order?
>
> BTW, modern OSes do not use the TSC any more, but yes, your assembly
> can still access and read the TSC. The OS usually reads from the HPET
> (which is how sleep(1) calculates the time difference), and to read
> the HPET here is a link:
>
> http://www.fftw.org/cycle.h
Looking at cycle.h, I found this familiar code (it starts on line 216):
/*----------------------------------------------------------------*/
/*
 * X86-64 cycle counter
 */
static __inline__ ticks getticks(void)
{
     unsigned a, d;
     asm volatile("rdtsc" : "=a" (a), "=d" (d));
     return ((ticks)a) | (((ticks)d) << 32);
}
The code in cycle.h is so similar to the code I was using that I guess
both were written by the same author. I got the code I'm using from
the paper at:
http://people.virginia.edu/~chg5w/page3/assets/MeasuringUnix.pdf
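
The one difference is that the cycle.h version has no cpuid before the
rdtsc. For comparison, here is a minimal sketch of how I understand the
cpuid variant should be written with GCC inline asm. The function name
getticks_serialized() is my own. cpuid overwrites eax, ebx, ecx and
edx, so the compiler has to be told about all four, which a bare
asm("cpuid") does not do. As far as I can tell, cpuid only serializes
the instruction stream; it does nothing to keep the process on the same
CPU.

```c
typedef unsigned long long ticks;

/*
 * Hypothetical variant of getticks(): execute cpuid (a serializing
 * instruction) and then rdtsc inside a single asm statement, so the
 * compiler cannot place anything between the two.
 */
static inline ticks getticks_serialized(void)
{
    unsigned lo = 0;   /* in: cpuid leaf 0, out: low 32 bits of the TSC */
    unsigned hi;       /* out: high 32 bits of the TSC */
    unsigned c = 0;    /* ecx is read and overwritten by cpuid */

    __asm__ volatile("cpuid\n\t"
                     "rdtsc"
                     : "+a"(lo), "=d"(hi), "+c"(c)
                     :
                     : "ebx", "memory");

    return ((ticks)lo) | (((ticks)hi) << 32);
}
```

The "memory" clobber is there so the compiler does not move loads or
stores of the code being timed across the measurement point. The cost
is that cpuid itself takes a variable number of cycles, which ends up
in the measured delta.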
>
> And query the OS via:
>
> cat /sys/devices/system/clocksource/clocksource0/*
> hpet acpi_pm
> hpet
>
> and u can see from above that "tsc" is missing from my system.
> (linux kernel is 2.6.35-22)
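
For my tests it would be handy to do the same check from C. A minimal
sketch, assuming the two files matched by that glob are
available_clocksource and current_clocksource; read_first_line() and
has_clocksource() are hypothetical helper names.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of `path` into buf and strip the trailing
 * newline. Returns 0 on success, -1 if the file cannot be read.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/*
 * Return 1 if `name` appears as a whole space-separated word in
 * `list` (e.g. the contents of available_clocksource), else 0.
 */
static int has_clocksource(const char *list, const char *name)
{
    size_t n = strlen(name);
    if (n == 0)
        return 0;
    for (const char *p = list; (p = strstr(p, name)) != NULL; p += n) {
        int starts = (p == list) || (p[-1] == ' ');
        int ends = (p[n] == '\0') || (p[n] == ' ');
        if (starts && ends)
            return 1;
    }
    return 0;
}
```

With the output you pasted, has_clocksource("hpet acpi_pm", "tsc")
returns 0, which matches what you describe. On a real system the list
would come from read_first_line() on
/sys/devices/system/clocksource/clocksource0/available_clocksource.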
>
> For TSC, I am not sure what the highest resolution you can reach is,
> but in a modern SoC chip with a 600 MHz core speed (speaking of PowerPC
> http://en.wikipedia.org/wiki/PowerPC_e500), the fastest execution is
> 600 million instructions per second, assuming one instruction
> per clock. With this kind of speed, TSC is very bad for measuring
> time differences.
This is my mistake. I did not tell you that my tests will run only on
the x86 architecture.
>
> On Mon, Oct 3, 2011 at 9:27 AM, Peter Senna Tschudin
> <peter.senna at gmail.com> wrote:
>> Dear list members,
>>
>> I'm following:
>>
>> http://people.virginia.edu/~chg5w/page3/assets/MeasuringUnix.pdf
>>
>> And I'm trying to measure the execution time of simple operations with RDTSC.
>>
>> See the code below:
>>
>> #include <stdio.h>
>> #define CPU_THOUSAND_HZ 800000
>> typedef unsigned long long ticks;
>> static __inline__ ticks getticks(void) {
>> unsigned a, d;
>> asm("cpuid");
>> asm volatile("rdtsc" : "=a" (a), "=d" (d));
>> return (((ticks)a) | (((ticks)d) << 32));
>> }
>>
>> int main(void) {
>> ticks tickBegin, tickEnd;
>> tickBegin = getticks();
>>
>> // code to time
>>
>> tickEnd = getticks();
>> /* cast first so the division is done in floating point */
>> double time = (double)(tickEnd-tickBegin)/CPU_THOUSAND_HZ;
>>
>> printf ("%e\n", time);
>> }
>>
>> How can the C code detect the correct value for CPU_THOUSAND_HZ? The
>> problems I see are:
>> - The information must be collected for the CPU that will run
>> the process. On Core i7 processors, different cores can run at
>> different clock speeds at the same time.
>> - If the clock changes while the process is running, what should
>> it do? When is the best time to collect the clock speed?
>>
>> The authors of the paper are not sure about the effects of
>> "asm("cpuid");". Does it ensure that the entire process will run on
>> the same CPU, and does it serialize execution, avoiding out-of-order
>> execution by the CPU?
>>
>> Thank you very much! :-)
>>
>> Peter
>>
>>
>> --
>> Peter Senna Tschudin
>> peter.senna at gmail.com
>> gpg id: 48274C36
>>
>> _______________________________________________
>> Kernelnewbies mailing list
>> Kernelnewbies at kernelnewbies.org
>> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>>
>
>
>
> --
> Regards,
> Peter Teoh
>
--
Peter Senna Tschudin
peter.senna at gmail.com
gpg id: 48274C36