Friday, April 20, 2012

C++11: Strongly-typed enum

Enums have always been a useful feature of C++, allowing the programmer to easily make a list of related values with names that are easy to understand. In fact, enums are at the heart of the messaging system I talk about in this post. However, the classic C++ enum has several issues that make it less useful than it could be, which is why C++11 introduces a strongly-typed enum, called an enum class.

1.) Implicit conversion to an integer.
Enums are not type-safe. They do prevent you from directly assigning a value of one enum type to a variable of another, but nothing stops an enum value from being implicitly converted to an integer, which means values from unrelated enums can be compared with one another.

enum PrimaryColor
{
    Red = 0,
    Blue,
    Yellow
};

enum FavoriteColor
{
    Green = 0,
    Orange,
    Purple
};

PrimaryColor pC = Red;
FavoriteColor fC = Orange;

pC = fC;    // Error, you cannot assign one enum value directly to another.

pC = Green; // Same error

bool primaryColorIsGreater = (pC >= Green);   // Bad! This is allowed, but probably isn't intended.

Notice on the final line we are able to make a comparison between a primary color and a favorite color. Green isn't even a primary color, so this is probably not intended.

If you want something type safe, the enum class comes to the rescue!

enum class PrimaryColor
{
    Red = 0,
    Green,
    Blue
};

enum class FavoriteColor
{
    Green = 0, 
    Orange,
    Purple
};

PrimaryColor pC = PrimaryColor::Red;            // Enumerators of an enum class must be qualified
FavoriteColor fC = FavoriteColor::Orange;

pC = fC;    // Error, you cannot assign one enum value directly to another.

pC = FavoriteColor::Green; // Same error

bool primaryColorIsGreater = (pC >= FavoriteColor::Green);  // Error, you cannot directly compare values from different enums.
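
To make the difference concrete, here is a minimal sketch. The names OldColor, NewColor, and toInt are made up for illustration and are not part of the examples above. It checks at compile time that a classic enum converts to int implicitly while an enum class does not, and shows the explicit static_cast you need when you really do want the number.

```cpp
#include <type_traits>

// Hypothetical names, for illustration only.
enum OldColor { OldRed, OldGreen };      // classic enum
enum class NewColor { Red, Green };      // C++11 enum class

// A classic enum silently turns into an int; an enum class does not.
static_assert(std::is_convertible<OldColor, int>::value,
              "a classic enum converts to int implicitly");
static_assert(!std::is_convertible<NewColor, int>::value,
              "an enum class does not convert to int implicitly");

// When the integer value is really needed, the conversion has to be spelled out.
constexpr int toInt(NewColor color)
{
    return static_cast<int>(color);
}
```

The cast is a little more typing, but it makes every place where an enum is treated as a number easy to spot.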


2.) Scope
Enums are not strongly scoped. The enumerators of an enum are placed in the same scope as the enum itself, so they can collide with other names in that scope. For example, what if in the example above we wanted FavoriteColor to have some of the same colors as PrimaryColor?

enum PrimaryColor
{
    Red = 0,
    Blue,
    Yellow
};

enum FavoriteColor
{
    Green = 0,
    Red,   // Error, 'Red' is already defined in the PrimaryColor enum
    Blue   // Error, 'Blue' is already defined in the PrimaryColor enum
};

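We're not able to do this with standard enums because they're not strongly scoped. The new enum class, however, will let us do this, because each enumerator lives inside its enum and is referred to by its qualified name, such as PrimaryColor::Red.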

enum class PrimaryColor
{
    Red = 0,
    Green,
    Blue
};

enum class FavoriteColor
{
    Green = 0, 
    Red,    // Ok
    Blue    // Ok
};
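
As a rough sketch of how the two enums can then be used side by side, the snippet below repeats the two definitions and adds a pair of overloaded helper functions. The isRed helpers are hypothetical, added only to show that the qualified names never get mixed up.

```cpp
enum class PrimaryColor { Red = 0, Green, Blue };
enum class FavoriteColor { Green = 0, Red, Blue };

// Hypothetical helpers: the same enumerator name is used in both enums,
// and the qualified name says which one is meant.
constexpr bool isRed(PrimaryColor color)
{
    return color == PrimaryColor::Red;
}

constexpr bool isRed(FavoriteColor color)
{
    return color == FavoriteColor::Red;
}
```

Note that PrimaryColor::Red is 0 while FavoriteColor::Red is 1, and neither one interferes with the other.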

3.) Inability to specify underlying type
The underlying type of an enum is not portable, because different compilers will use different underlying types for an enum. For example, if you're using an enum directly in a packet of information, the sender and receiver may have a different perception of what size that enum value takes.

enum Version
{
    Version1 = 1,
    Version2 = 2,
    Version3 = 3
};

struct Packet
{
    Version version;     // Bad! This size can vary by implementation.
    
    // More data here
};

You can work around this, but it's not ideal (hence calling it a 'workaround'):
struct Packet
{
    unsigned char version;     // This works, but requires casting
    
    // More data here
};

The workaround is ugly. We shouldn't have to store a version number as a char, require the user on the other end to understand what type of data is stored in that char, and force them to cast it back to an integer (or unsigned integer). The enum class solves this issue for us by allowing us to specify the underlying type of the enum, so the compiler no longer gets to pick one for us.

enum class Version : unsigned
{
    Version1 = 1,
    Version2 = 2,
    Version3 = 3
};

struct Packet
{
    Version version;    // This is now safe to do, we know that 'version' is an unsigned int.
    
    // More data here
};
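
One caveat: unsigned int itself can differ in size between platforms. For data that actually goes over the wire, a fixed-width type is probably the safer choice. The sketch below is one way that might look, assuming a one-byte version field; the toWire and fromWire helpers are hypothetical names, not anything from a real protocol.

```cpp
#include <cstdint>
#include <type_traits>

// Assumption: the version field is exactly one byte on the wire.
enum class Version : std::uint8_t
{
    Version1 = 1,
    Version2 = 2,
    Version3 = 3
};

struct Packet
{
    Version version;    // Always one byte, whatever the compiler

    // More data here
};

// Both checks are done by the compiler, so a mismatch fails the build.
static_assert(std::is_same<std::underlying_type<Version>::type, std::uint8_t>::value,
              "Version is stored as std::uint8_t");
static_assert(sizeof(Version) == 1, "Version occupies a single byte");

// Hypothetical helpers for converting to and from the raw byte.
constexpr std::uint8_t toWire(Version version)
{
    return static_cast<std::uint8_t>(version);
}

constexpr Version fromWire(std::uint8_t raw)
{
    return static_cast<Version>(raw);
}
```

Note that fromWire does not validate the byte it is given, so a real receiver would still want to check that the value is one of the known versions.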

Also, because the underlying type of a standard enum differs by implementation, relying on its values being signed or unsigned can be unsafe. Take, for example, the enum below:

enum MyEnum
{
   Value1 = 1,
   Value2 = 2,
   ValueBig = 0xFFFFFFF0U
};

Note that the last value has been explicitly written as an unsigned int. Because different compilers implement enums differently, the resulting value of ValueBig also differs depending on what you're compiling with. This means that your code is no longer portable, and will only work as intended in some compilers. For compilers that treat enums as unsigned, ValueBig will be 4294967280; for those that treat it as signed, ValueBig will be -16. Even worse, there are compilers that treat ValueBig as 4294967280, but when comparing it against -1 will tell you that ValueBig is less than -1. The enum class solves this problem by allowing the programmer to specify the type. If we want 'ValueBig' to be 0xFFFFFFF0U then we just make sure that our enum class is specified to be an unsigned int:

enum class MyEnum : unsigned
{
   Value1 = 1,
   Value2 = 2,
   ValueBig = 0xFFFFFFF0U   // ValueBig is now guaranteed to be 4294967280
};
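
If you'd rather have the compiler confirm this than take my word for it, something like the following should do. It repeats the enum above, assumes unsigned int is at least 32 bits wide (true on the common desktop platforms), and uses a made-up helper called toUnsigned.

```cpp
enum class MyEnum : unsigned
{
   Value1 = 1,
   Value2 = 2,
   ValueBig = 0xFFFFFFF0U
};

// Hypothetical helper: an enum class needs an explicit cast to become a number.
constexpr unsigned toUnsigned(MyEnum value)
{
    return static_cast<unsigned>(value);
}

// Assumption: unsigned int has at least 32 bits.
static_assert(toUnsigned(MyEnum::ValueBig) == 4294967280U,
              "ValueBig keeps its unsigned value");

// Values of the same enum class can still be compared with each other.
static_assert(MyEnum::ValueBig > MyEnum::Value2,
              "ValueBig compares as the largest value");
```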

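In Visual Studio 11 the enum class is signed by default. That matches the C++11 standard, which makes int the underlying type of an enum class whenever you don't specify one, and that should prevent issues like what occurred with the basic enum. The catch is that 0xFFFFFFF0U doesn't fit in an int, so a strictly conforming compiler is expected to reject the example below rather than quietly turn the value into -16.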

enum class MyEnum     // We do not specify the underlying type
{
   Value1 = 1,
   Value2 = 2,
   ValueBig = 0xFFFFFFF0U   // ValueBig in VS11 is -16
};
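
The default underlying type is easy to check with the type traits that ship with C++11. This is only a small sketch, and the Plain and Small enums are invented for the occasion.

```cpp
#include <type_traits>

// Hypothetical enums, for illustration only.
enum class Plain { First, Second };                  // no underlying type given
enum class Small : unsigned char { First, Second };  // underlying type given

// With no type specified, an enum class is stored as an int.
static_assert(std::is_same<std::underlying_type<Plain>::type, int>::value,
              "an enum class defaults to int");

// With a type specified, that type is used instead.
static_assert(std::is_same<std::underlying_type<Small>::type, unsigned char>::value,
              "Small is stored as unsigned char");
```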

Some IDEs are helpful and will show you what your values are going to be so there are no surprises.

Sources:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2347.pdf

Thursday, April 19, 2012

C++11: Range-based for-loops and auto

One of the downsides to using C++ in recent years has been how verbose it is compared to some other languages, such as Java, C#, or Scala, just to name a few. One of the goals of the new C++11 standard was to allow for code that is more terse, and I must say, I love it.

One of the things about using the STL in C++ that has always bothered me is how bulky it is. Take, for example, iterating over a vector of strings using the standard C++03 for-loop.

std::vector<std::string> words;
words.push_back("Hello");
words.push_back("World");
words.push_back("...and good morning!");

std::vector<std::string>::const_iterator wordItr;
for (wordItr = words.begin(); wordItr != words.end(); ++wordItr)
{
    std::cout << (*wordItr).c_str() << " ";
}

That's a lot of code just to loop through the words and print each one. C++11 gives the old auto keyword a new meaning: a variable declared with auto takes its type from whatever it is initialized with, making the code less verbose when the type would be especially long, such as std::vector<std::string>::const_iterator. Let's see if we can clean up the code a bit using the auto keyword.

std::vector<std::string> words;
words.push_back("Hello");
words.push_back("World");
words.push_back("...and good morning!");

for (auto word = words.cbegin(); word != words.cend(); ++word)
{
   std::cout << (*word).c_str() << " ";
}
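
If you're wondering what type auto actually picked, you can ask the compiler. Here's a quick sketch; the Words alias and the countNonEmpty function are just names I made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

typedef std::vector<std::string> Words;   // hypothetical alias, to keep the lines short

// Counts the non-empty strings using the same auto-based loop as above.
inline std::size_t countNonEmpty(const Words& words)
{
    std::size_t count = 0;
    for (auto word = words.cbegin(); word != words.cend(); ++word)
    {
        // auto deduced exactly the type we would otherwise have typed out.
        static_assert(std::is_same<decltype(word), Words::const_iterator>::value,
                      "auto deduces const_iterator from cbegin()");
        if (!word->empty())
        {
            ++count;
        }
    }
    assert(count <= words.size());   // can never count more words than there are
    return count;
}
```

The static_assert costs nothing at runtime; if auto had deduced anything else, the code simply wouldn't compile.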

That's a fair amount better. We didn't have to spend an entire line of code just creating our iterator variable; auto saved so much room we were able to do it inline within the for-loop. However, we still have to initialize to cbegin(), compare against cend(), and increment the iterator. Those things are so standardized, it would be great if the for-loop could take care of that for us. And it can! C++11 introduces range-based for-loops to do just that.

std::vector<std::string> words;
words.push_back("Hello");
words.push_back("World");
words.push_back("...and good morning!");

for (const auto& word : words)   // const reference, so each string isn't copied
{
    std::cout << word.c_str() << " ";
}

auto takes care of the type for us, and the range-based for-loop takes care of the initialization, comparison, and incrementing of the iterator, which we never even see. On top of that, the range-based for-loop gives us the value pointed to by the iterator, so we don't have to dereference anything to get the string. This means instead of (*word).c_str() we can just do word.c_str().
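
One thing to watch out for is whether the loop variable is a copy or a reference. With plain auto, each element is copied into the loop variable; const auto& reads the element in place, and auto& lets the loop change it. Here's a small sketch of both reference forms. The shout, join, and example functions are hypothetical, written only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// auto& refers to the element itself, so the vector is modified in place.
inline void shout(std::vector<std::string>& words)
{
    for (auto& word : words)
    {
        word += "!";
    }
}

// const auto& reads each element without copying it.
inline std::string join(const std::vector<std::string>& words)
{
    std::string result;
    for (const auto& word : words)
    {
        result += word;
        result += ' ';
    }
    return result;
}

// Usage sketch.
inline void example()
{
    std::vector<std::string> words;
    words.push_back("Hello");
    words.push_back("World");

    shout(words);
    assert(join(words) == "Hello! World! ");
}
```

Had shout used plain auto instead of auto&, it would have appended to the copies and left the vector untouched.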

Also, for those of you who use Visual Studio, the good news is that Visual Studio 11 will have support for range-based for-loops. It was reported in September that it would not have support for it, but after people complained for months about it, it looks like Microsoft decided to put some more work into their compiler. If you would like to try it out in Visual Studio the VS11 Beta is out and supports it: http://blogs.msdn.com/b/vcblog/archive/2012/02/29/10272778.aspx

I have noticed, however, that the IDE doesn't properly recognize it and will give you warnings if you hover over parts of the for-loop, but it does compile and run. I'm sure this is something they'll have fixed in the final version. Regardless, VS11 does not support nearly as much of C++11 as other compilers like GCC, which are free, and I'm not sure I could restrict myself to a compiler that would restrict my learning and development as a programmer. But I sure do love the Visual Studio IDE, and I'm going to miss that.