benchmarking - Measuring processor ticks in C


I wanted to calculate the difference in execution time when executing the same code inside a function. To my surprise, however, the clock difference is sometimes 0 when I use clock()/clock_t for the start and stop timer. Does this mean that clock()/clock_t does not actually return the number of clicks the processor spent on the task?

After a bit of searching, it seemed to me that clock_gettime() would return more fine-grained results. And indeed it does, but instead I end up with an arbitrary number of nano(?)seconds. It gives a hint of the difference in execution time, but it's hardly accurate as to how many clicks the difference amounts to. What would I have to do to find this out?

    #include <math.h>
    #include <stdio.h>
    #include <time.h>

    #define M_PI_DOUBLE (M_PI * 2)

    void rotatetest(const float *x, const float *c, float *result)
    {
        float rotationfraction = *x / *c;
        *result = M_PI_DOUBLE * rotationfraction;
    }

    int main()
    {
        int i;
        long test_total = 0;
        int test_count = 1000000;
        struct timespec test_time_begin;
        struct timespec test_time_end;

        float r = 50.f;
        float c = 2 * M_PI * r;
        float x = 3.f;
        float result_inline = 0.f;
        float result_function = 0.f;

        for (i = 0; i < test_count; i++) {
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_begin);
            float rotationfraction = x / c;
            result_inline = M_PI_DOUBLE * rotationfraction;
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_end);
            test_total += test_time_end.tv_nsec - test_time_begin.tv_nsec;
        }

        printf("inline clocks %li, avg %f (result %f)\n", test_total, test_total / (float)test_count, result_inline);

        for (i = 0; i < test_count; i++) {
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_begin);
            rotatetest(&x, &c, &result_function);
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &test_time_end);
            test_total += test_time_end.tv_nsec - test_time_begin.tv_nsec;
        }

        printf("function clocks %li, avg %f (result %f)\n", test_total, test_total / (float)test_count, result_inline);

        return 0;
    }

I am using gcc version 4.8.4 on Linux 3.13.0-37-generic (Linux Mint 16).

First of all: as mentioned in the comments, clocking single runs of execution one after the other will do you no good. If everything goes downhill, the call for getting the time might take longer than the actual execution of the operation.

Please clock multiple runs of the operation (including a warm-up phase so that everything is swapped in) and calculate the average running times.

clock() isn't guaranteed to be monotonic. It also isn't the number of processor clicks (whatever you define this to be) the program has run. The best way to describe the result from clock() is probably "a best effort estimation of the time any one of the CPUs has spent on calculation for the current process". For benchmarking purposes clock() is thus mostly useless.

As per the specification:

The clock() function returns the implementation's best approximation to the processor time used by the process since the beginning of an implementation-dependent time related only to the process invocation.

And additionally:

To determine the time in seconds, the value returned by clock() should be divided by the value of the macro CLOCKS_PER_SEC.

So, if you call clock() more often than its resolution allows, you are out of luck.

For profiling/benchmarking, you should, if possible, use one of the performance clocks that are available on modern hardware. The prime candidates are probably hardware counters such as the TSC.

Edit: The question now references CLOCK_PROCESS_CPUTIME_ID, which is Linux' way of exposing the TSC.

Whether any of these clocks is available depends on the hardware and is operating system specific.

