Friday, June 8, 2012

Performance function/scope timers, for Windows

In my last post I showed you have to create an easy-to-use timer for Mac OSX programs that would calculate the time it took a function or scope to execute. Here's the Windows version I promised.

Here's the main.cpp:
// main.cpp
#include <iostream>

#include "PerfTimer.h"

void someFcn()
{
    CreateFcnTimer;
    
    for (unsigned int i = 0; i < 10000000; ++i )
    {
        int j = 5;
        ++j;
    }
}

int main(int argc, const char * argv[])
{
    {
        ScopeTimer("MyCustomScope");
        
        someFcn();
    }
    
    return 0;
}

And here's the timer class itself.
// PerfTimer.h
#pragma once
 
#include <iostream>
#include <windows.h>
 
class PerfTimer
{
public:
    PerfTimer(const char* timerName)
    {
        m_name = std::string(timerName);

 QueryPerformanceFrequency(&m_frequency);
         
        // Start the timer
        QueryPerformanceCounter(&m_startTime);
    }
     
    ~PerfTimer()
    {
        LARGE_INTEGER endTime;
 QueryPerformanceCounter(&endTime);
         
        // Output timer duration in milliseconds
        std::cout << m_name.c_str() << ": " << ((endTime.QuadPart - m_startTime.QuadPart) * 1000.0 / m_frequency.QuadPart) << std::endl;
    }
     
private:
    LARGE_INTEGER m_startTime;
    LARGE_INTEGER m_frequency;
    std::string m_name;
};
 
#ifndef SHIPPING_BUILD
 
    #define CreateFcnTimer PerfTimer __FUNCTION__timer(__FUNCTION__)
 
    // The 'x' in 'xtimer' is replaced by whatever string is passed into the scope timer
    // but the quotations are removed. So if the calling code is ScopeTimer("stuff")
    // then 'xtimer' becomes 'timerstuff'. This makes sure every timer name is unique
    // within the scoe it is used.
    #define ScopeTimer(x) PerfTimer xtimer(x)
 
#else
    // The timers can be shut off in shipping builds by replacing their standard
    // macros with these.
    #define CreateFcnTimer ((void)0)
    #define ScopeTimer(x) ((void)0)
 
#endif

The only difference between this Windows version and the Mac version is that you need to include Windows.h, and you have to query the frequency using QueryPerformanceFrequency, and factor the frequency into your calculations.

High performance function/scope timers, and some more macro love (Mac OS X)

Looking for a Windows version of this timer? It's in the following post, click here to go to it.

In my last post I talked about the fact that macros aren't always evil, and that reminded me of a function/scope timer I made for LEGO Universe, which took advantage of some fun macros. So today I'll show you how to create that timer.

This timer outputs its elapsed time whenever it leaves scope, so by using one at the beginning of your functions your can output how long a function call takes. Or you can use one anywhere else where a scope if defined.

Below is the main.cpp:

// main.cpp
#include <iostream>
#include <math.h>

#include "PerfTimer.h"

void someFcn()
{
    CreateFcnTimer;
    // someFcn timer is now in scope
    
    for (unsigned int i = 0; i < 10000000; ++i )
    {
        int j = sqrt(5);
        ++j;
    }
}

int main(int argc, const char * argv[])
{
    {
        ScopeTimer("MyCustomScope");
        // MyCustomScope timer is now in scope

        someFcn();
        // someFcn timer created by 'CreateFcnTimer' is now out of scope
    }
    // MyCustomScope timer is now out of scope
    
    return 0;
}

Inside of someFcn()you see a simple macro call: CreateFcnTimer; 
This macro creates a timer for us that will automatically output the name of the function and the time it took to execute it, it prints this information when the timer goes out of scope, which is immediately after the function is done executing.

Inside of main()you see a set of brackets used to create a scope, and within that scope we call a macro named ScopeTimer, and this time we give it whatever name we'd like, in this case it's "MyCustomScope". This timer goes out of scope when the scope that we defined with the brackets ends. You can use this technique to subdivide larger sections of code and time how long they take to execute.

The output from this program, when I ran it on my iMac was:

someFcn: 101.953
MyCustomScope: 102.08

The output is in milliseconds, but the timer is accurate down to 1 microsecond (1/1000th of a millisecond). You can see here that someFcn() was called within the unnamed scope we created in main(), and the timer within someFcn went out of scope after someFcn() finished executing, and then "MyCustomScope" went out of scope soon after. The results here show us that it takes about 102.08 milliseconds to loop 10 million times and create an integer on the stack, set it to the square root of 5, increment it, then remove it from the stack, during each loop. The results from the scope timer show that someFcn()'s code took up most of the time, and the remaining 0.127 milliseconds was for the call to someFcn(), the function timer within, and then exit someFcn(). We can deduce this because the 102.08 milliseconds recorded by the function timer began after the function timer was already created and the call to someFcn had been made and was already in progress. 0.127 seems like a lot, most of that time is in the creation of the timer within someFcn and the outputting of timer info to the console, not for the function call. If function calls took 0.127 milliseconds then computers would be performing like they were about 35 years ago.

Here's the version of the performance timer that will run on Mac OSX. In my next post I will post the code for Windows. I'm not trying to draw this out into two posts, it's just really late and I haven't written and tested the Windows version yet (the difference will be only in the timer class internally used).


// PerfTimer.h
#pragma once

#include <iostream>
#include <mach/mach.h>
#include <mach/mach_time.h>
#include <unistd.h>

class PerfTimer
{
public:
    PerfTimer(const char* timerName)
    {
        m_name = std::string(timerName);
        
        // Start the timer
        m_startTime = mach_absolute_time();
    }
    
    ~PerfTimer()
    {
        uint64_t endTime = mach_absolute_time();
        uint64_t elapsed = (endTime - m_startTime);
    
        static mach_timebase_info_data_t sTimebaseInfo;
        if ( sTimebaseInfo.denom == 0 )
        {
            mach_timebase_info(&sTimebaseInfo);
        }
        
        uint64_t elapsed_nano = elapsed * sTimebaseInfo.numer / sTimebaseInfo.denom;
        
        // Output timer duration in milliseconds
        std::cout << m_name.c_str() << ": " << elapsed_nano / 1000000.0 << " milliseconds" << std::endl;
    }
    
private:
    uint64_t m_startTime;
    std::string m_name;
};

#ifndef SHIPPING_BUILD

    #define CreateFcnTimer PerfTimer __func__timer(__func__)

    // The 'x' in 'xtimer' is replaced by whatever string is passed into the scope timer
    // but the quotations are removed. So if the calling code is ScopeTimer("stuff")
    // then 'xtimer' becomes 'timerstuff'. This makes sure every timer name is unique
    // within the scoe it is used.
    #define ScopeTimer(x) PerfTimer xtimer(x)

#else
    // The timers can be shut off in shipping builds by replacing their standard
    // macros with these.
    #define CreateFcnTimer ((void)0)
    #define ScopeTimer(x) ((void)0)

#endif
First I'll explain how the timer itself works, then how the macros work. When the timer is constructed it records the name passed into the constructor so we can output it later, and then records the exact system time. Then, when it destructs it queries the system time, and then outputs the difference between the time it was constructed and destructed. The timer is basically a simple object that records when it is created and when it leaves scope or is destructed. The CreateFcnTimer macro works by creating a PerfTimer instance, and it names the variable the function name followed by the word 'timer'. This ensures that the name of the timer variable is fairly unique so it's unlikely to have collisions with other times you might use within the same scope. Also, the name of the function is the name passed into the constructor, and when the timer destructs it's that name that will output next to the time elapsed. The MyCustomScope macro is similar to CreateFcnTimer except that it does not use the function name automatically so you can name it whatever you choose. Similarly to CreateFcnTimer, it uses the name you pass into the macro to generate the name of the timer instance variable, so you would only have a conflict when using this macro if you happened to create another timer with the exact same name in the same scope.

And if you notice, I've used the technique from my previous post to ensure that these timers will not run on a shipping build, which would slow down your program a good amount with the creation of the timers and especially the console output of them.

Saturday, June 2, 2012

C++ Macros aren't completely evil, I promise

As a C++ programmer you'll undoubtedly hear of many macro horror stories. Macros can be easily abused, can bloat your program, and can be a debugging nightmare. But, macros are not completely evil, when used properly they can be clean and useful.

The other day I was updating a logging system. For those that don't know, a logging system (or logger) is usually a system created along side a program or engine that allows programmers to log messages, or errors, and to stream their output in different ways, like to a debug console, or to a file, or even an email in some rare cases (like a critical program failing in a bad way and a live product team needs to know immediately). Anyway, I was updating a logging system, and I needed two things that it did not yet offer:
  • I needed the parameters being logged to have no expense within the shipped product (the product that was in the customer's hands). 
    • Without macros there are ways people try to limit logging commands:
      • Multiple function definitions for your logging commands, the shipped version of these do nothing when called.
        #define SHIPPING_BUILD
        #include <iostream>
        
        


namespace logging


        {
        

#ifndef SHIPPING_BUILD
            void print( int i )


            {


                std::cout << i;


            }
        

#else


            void print ( int i ) { }
        

#endif
        

}
        
        


int expensiveFcnReturningAnInt()


        {
            // The compiler would actually optimize this so it would


            // be extremely cheap to call, but you get the idea, this


            // function would be something you don't want to happen


            // in a shipping build if used only for logging.

 
            return 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1;


        }
        
        


int main(int argc, const char * argv[])


        {


            logging::print( expensiveFcnReturningAnInt() );


        
            return 0;


        }
      • In the above example the expense of the logging itself is removed, there is nothing logged to the debug console, or written to a log file. But the main issue here, is that expensiveFcnReturningAnInt() was still called, which is expensive, completely unnecessary, and useless
    • With macros you can affect the code at compile time. In this example the logging statement will no longer even exist in the program for shipping builds.
      • #define SHIPPING_BUILD
        #include <iostream>
        
        #ifndef SHIPPING_BUILD
            #define INTERNAL_PRINT(x) logging::print(x)
        #else
            #define INTERNAL_PRINT(x) ((void)0)
        #endif
        
        namespace logging
        {
            void print( int i )
            {
                std::cout << i;
            }
        }
        
        int expensiveFcnReturningAnInt()
        {
            // The compiler would actually optimize this so it would
            // be extremely cheap to call, but you get the idea, this
            // function would be something you don't want to happen
            // in a shipping build if used only for logging.
            return 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1;
        }
        
        int main(int argc, const char * argv[])
        {
            INTERNAL_PRINT( expensiveFcnReturningAnInt() );
            
            return 0;
        }
        
    • In this sample, the call inside of main() to INTERNAL_PRINT is completely optimized out of the program when SHIPPING_BUILD is defined.
  • I needed a simple, one-line logging command that accepted a stream of information, just like an std::stringstream.
    • Without a macro, you would be forced to make an entire class to try and get you close to this functionality, and even then you might not get it. 
    • With a macro you can have exactly what you want:
    • #define SHIPPING_BUILD
      #include <iostream>
      
      #ifndef SHIPPING_BUILD
      #include <sstream>
      
      #define PRINT_STREAM(x) \
          { \
              std::stringstream strStream; \
              strStream << x; \
              logging::print(strStream.str().c_str()); \
          } 
      #else         
          #define PRINT_STREAM(x) ((void)0) 
      #endif
      
      namespace logging
      {
          void print( const char* message )
          {
              std::cout << message;
          }
      }
      
      int expensiveFcnReturningAnInt()
      {
          // The compiler would actually optimize this so it would
          // be extremely cheap to call, but you get the idea, this
          // function would be something you don't want to happen
          // in a shipping build if used only for logging.
          return 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1;
      }
      
      int main(int argc, const char * argv[])
      {
          PRINT_STREAM( "Expensive fcn result is: " << expensiveFcnReturningAnInt() );
          
          return 0;
      }
I would recommend splitting your logging stuff into its own header, or class even. The examples above have everything in the same file/snippet to keep things easier to read for this blog.