Friday, June 8, 2012

High performance function/scope timers, and some more macro love (Mac OS X)

Looking for a Windows version of this timer? It's in the following post, click here to go to it.

In my last post I talked about the fact that macros aren't always evil, and that reminded me of a function/scope timer I made for LEGO Universe, which took advantage of some fun macros. So today I'll show you how to create that timer.

This timer outputs its elapsed time whenever it leaves scope, so by using one at the beginning of your functions your can output how long a function call takes. Or you can use one anywhere else where a scope if defined.

Below is the main.cpp:

// main.cpp
#include <iostream>
#include <math.h>

#include "PerfTimer.h"

void someFcn()
    // someFcn timer is now in scope
    for (unsigned int i = 0; i < 10000000; ++i )
        int j = sqrt(5);

int main(int argc, const char * argv[])
        // MyCustomScope timer is now in scope

        // someFcn timer created by 'CreateFcnTimer' is now out of scope
    // MyCustomScope timer is now out of scope
    return 0;

Inside of someFcn()you see a simple macro call: CreateFcnTimer; 
This macro creates a timer for us that will automatically output the name of the function and the time it took to execute it, it prints this information when the timer goes out of scope, which is immediately after the function is done executing.

Inside of main()you see a set of brackets used to create a scope, and within that scope we call a macro named ScopeTimer, and this time we give it whatever name we'd like, in this case it's "MyCustomScope". This timer goes out of scope when the scope that we defined with the brackets ends. You can use this technique to subdivide larger sections of code and time how long they take to execute.

The output from this program, when I ran it on my iMac was:

someFcn: 101.953
MyCustomScope: 102.08

The output is in milliseconds, but the timer is accurate down to 1 microsecond (1/1000th of a millisecond). You can see here that someFcn() was called within the unnamed scope we created in main(), and the timer within someFcn went out of scope after someFcn() finished executing, and then "MyCustomScope" went out of scope soon after. The results here show us that it takes about 102.08 milliseconds to loop 10 million times and create an integer on the stack, set it to the square root of 5, increment it, then remove it from the stack, during each loop. The results from the scope timer show that someFcn()'s code took up most of the time, and the remaining 0.127 milliseconds was for the call to someFcn(), the function timer within, and then exit someFcn(). We can deduce this because the 102.08 milliseconds recorded by the function timer began after the function timer was already created and the call to someFcn had been made and was already in progress. 0.127 seems like a lot, most of that time is in the creation of the timer within someFcn and the outputting of timer info to the console, not for the function call. If function calls took 0.127 milliseconds then computers would be performing like they were about 35 years ago.

Here's the version of the performance timer that will run on Mac OSX. In my next post I will post the code for Windows. I'm not trying to draw this out into two posts, it's just really late and I haven't written and tested the Windows version yet (the difference will be only in the timer class internally used).

// PerfTimer.h
#pragma once

#include <iostream>
#include <mach/mach.h>
#include <mach/mach_time.h>
#include <unistd.h>

class PerfTimer
    PerfTimer(const char* timerName)
        m_name = std::string(timerName);
        // Start the timer
        m_startTime = mach_absolute_time();
        uint64_t endTime = mach_absolute_time();
        uint64_t elapsed = (endTime - m_startTime);
        static mach_timebase_info_data_t sTimebaseInfo;
        if ( sTimebaseInfo.denom == 0 )
        uint64_t elapsed_nano = elapsed * sTimebaseInfo.numer / sTimebaseInfo.denom;
        // Output timer duration in milliseconds
        std::cout << m_name.c_str() << ": " << elapsed_nano / 1000000.0 << " milliseconds" << std::endl;
    uint64_t m_startTime;
    std::string m_name;


    #define CreateFcnTimer PerfTimer __func__timer(__func__)

    // The 'x' in 'xtimer' is replaced by whatever string is passed into the scope timer
    // but the quotations are removed. So if the calling code is ScopeTimer("stuff")
    // then 'xtimer' becomes 'timerstuff'. This makes sure every timer name is unique
    // within the scoe it is used.
    #define ScopeTimer(x) PerfTimer xtimer(x)

    // The timers can be shut off in shipping builds by replacing their standard
    // macros with these.
    #define CreateFcnTimer ((void)0)
    #define ScopeTimer(x) ((void)0)

First I'll explain how the timer itself works, then how the macros work. When the timer is constructed it records the name passed into the constructor so we can output it later, and then records the exact system time. Then, when it destructs it queries the system time, and then outputs the difference between the time it was constructed and destructed. The timer is basically a simple object that records when it is created and when it leaves scope or is destructed. The CreateFcnTimer macro works by creating a PerfTimer instance, and it names the variable the function name followed by the word 'timer'. This ensures that the name of the timer variable is fairly unique so it's unlikely to have collisions with other times you might use within the same scope. Also, the name of the function is the name passed into the constructor, and when the timer destructs it's that name that will output next to the time elapsed. The MyCustomScope macro is similar to CreateFcnTimer except that it does not use the function name automatically so you can name it whatever you choose. Similarly to CreateFcnTimer, it uses the name you pass into the macro to generate the name of the timer instance variable, so you would only have a conflict when using this macro if you happened to create another timer with the exact same name in the same scope.

And if you notice, I've used the technique from my previous post to ensure that these timers will not run on a shipping build, which would slow down your program a good amount with the creation of the timers and especially the console output of them.

No comments:

Post a Comment