union for uint32_t and uint8_t[4] undefined behavior?

[Edit: see my edited section below; I’m no longer sure whether this is undefined behavior. I’ll leave most of my answer as it is until I can confirm further.] Yes, this is undefined behavior. The C++ Standard, section 9.5, paragraph 1, states:

In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [ Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence (9.2), and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members; see 9.2. — end note ]

This means that only the most recently written member can validly be read (reading any of the others is technically undefined behavior). Only one member of the union can be active at any time, not two.
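To make the idea of an active member concrete, here is a minimal sketch. The union layout is an assumption reconstructed from the member names addr32 and addr8 used in this discussion, and the type name `Address` and the function name are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed shape of the union from the question; only the member names
// addr32 and addr8 come from the discussion, the rest is hypothetical.
union Address {
    std::uint32_t addr32;
    std::uint8_t  addr8[4];
};

// Writes four bytes through addr8 and reads them back through the same
// member, which is the access pattern the quoted paragraph allows.
inline std::uint8_t write_and_read_same_member() {
    Address a = {};          // initializes the first member, addr32
    a.addr8[0] = 192;        // writing through addr8 makes it the active member
    a.addr8[1] = 168;
    a.addr8[2] = 0;
    a.addr8[3] = 1;
    // Reading a.addr32 at this point would be a read of an inactive member,
    // which is the questionable access, so this sketch deliberately avoids it.
    assert(a.addr8[0] == 192);
    return a.addr8[3];
}
```

Reading back through the member that was written last is always fine; the open question in this answer is only about reading the other one.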

You might ask why. Consider your example. C++ does not mandate the endianness of addr32: it could be big-endian, little-endian, or middle-endian. If you write to addr8 and then read from addr32, C++ cannot guarantee which value you’ll get, because the result depends on the byte order. On one computer it could be one value, and on another it could be a different value. Hence, doing so (that is, writing to one member and reading a different one) is undefined behavior.
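The byte-order dependence can be shown without touching a union at all. The sketch below is not from the original question: the helper names `load_native` and `load_little_endian` are hypothetical, and it uses `std::memcpy` as a well-defined way to reinterpret the bytes, contrasted with shift-based assembly that gives the same answer on every platform.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copies the four bytes into a uint32_t. The copy itself is well defined,
// but the resulting value still depends on the byte order of the machine.
inline std::uint32_t load_native(const std::uint8_t (&bytes)[4]) {
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Assembles the value with shifts, treating bytes[0] as the least
// significant byte. The result does not depend on the host byte order.
inline std::uint32_t load_little_endian(const std::uint8_t (&bytes)[4]) {
    return  static_cast<std::uint32_t>(bytes[0])
         | (static_cast<std::uint32_t>(bytes[1]) << 8)
         | (static_cast<std::uint32_t>(bytes[2]) << 16)
         | (static_cast<std::uint32_t>(bytes[3]) << 24);
}

inline void demo_byte_order() {
    const std::uint8_t bytes[4] = {0x01, 0x02, 0x03, 0x04};
    // Same value everywhere, because the byte order is spelled out in code.
    assert(load_little_endian(bytes) == 0x04030201u);
    // 0x04030201 on a little-endian host, 0x01020304 on a big-endian host.
    const std::uint32_t native = load_native(bytes);
    (void)native;
}
```

If a specific byte order is what you actually need, the shift-based version states that intent directly instead of inheriting whatever the hardware does.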

Edit: For those wondering what “active” means, the MSDN documentation on Unions states:

The active member of a union is the one whose value was most recently set, and only that member has a valid value.

Edit 2: I had always thought the behavior of doing this was undefined, but I’m less sure after R. Martinho Fernandes’s comments and answer and after re-reading the quote from MSDN. The value is certainly unspecified/undefined, but I’m not sure the behavior is. The two are different things: an undefined value means you might get different results back, while undefined behavior means your system might crash. I’m going to consider this further and talk with others I know to see if I can find a more explicit answer.

I do think it’s safe to say, however, that in general reading an inactive member in a union can be undefined behavior (except for the special note in the Standard, of course), but I don’t know if it always is (i.e. there may be some exceptions beyond the special note in the section of the C++ Standard I’ve quoted).
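For reference, the special note quoted from the Standard covers a case like the following sketch. The types `Circle`, `Point`, and `Shape` and the function `kind_of` are hypothetical, invented purely to illustrate a common initial sequence.

```cpp
#include <cassert>

// Hypothetical standard-layout structs that both begin with an int.
struct Circle { int kind; double radius; };
struct Point  { int kind; float x; float y; };

// The leading `int kind` is the common initial sequence of both structs.
union Shape {
    Circle circle;
    Point  point;
};

inline int kind_of(const Shape& s) {
    // Permitted by the special guarantee even when `point` is the active
    // member, because only the common initial sequence is inspected.
    return s.circle.kind;
}

inline void demo_common_initial_sequence() {
    Shape s;
    s.point = Point{2, 1.0f, 2.0f};   // `point` becomes the active member
    assert(kind_of(s) == 2);          // read through `circle` is allowed here
}
```

Note that the guarantee stops at the shared prefix: reading `s.circle.radius` while `point` is active falls outside it.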

Originally posted 2013-11-10 00:13:25.