Undefined behavior at its best: is it a boundary violation, bad pointer arithmetic, or just ignored aliasing?

I’ve been working with C99 for some weeks now, focusing on undefined behaviour. I wanted to test some strange code while still trying to respect the rules. The result was this code:

(Please forgive the variable names; I had eaten a clown.)

#include <stdio.h>
#include <stdlib.h>

int main(int arg, char** argv)
{

    unsigned int uiDiffOfVars;
    int LegalPointerCast1, LegalPointerCast2, signedIntToRespectTheRules;
    char StartVar; // Only used to have an address from which we can move on
    char *TheAccesingPointer;
    int iTargetOfPointeracces;

    iTargetOfPointeracces= 0x55555555;

    TheAccesingPointer = (char *) &StartVar;
    LegalPointerCast2 = (int) &StartVar;
    LegalPointerCast1 = (int) &iTargetOfPointeracces;

    if ((0x80000000 & LegalPointerCast2) != (0x80000000 & LegalPointerCast1))
    {
        // As I'm not sure how
        // "A pointer is converted to other than an integer or pointer type (6.5.4)."
        // treats unsigned integers, I'm checking it this way.
        printf ("try it on next machine!\r\n");
        return 1;
    }

    if ((0x80000000 & LegalPointerCast2) == 0)
        uiDiffOfVars = abs (LegalPointerCast1) - abs (LegalPointerCast2);
    else
        uiDiffOfVars = abs (LegalPointerCast2) - abs (LegalPointerCast1);

    LegalPointerCast2 = (int) TheAccesingPointer;
    signedIntToRespectTheRules = abs ((int) uiDiffOfVars);
    TheAccesingPointer = (char *)(LegalPointerCast2 + signedIntToRespectTheRules);

    printf ("%c\r\n", *TheAccesingPointer); // Will the output be a 'U'?

    return 0;
}

So this code is undefined behavior at its best. I get different results, even though I’m neither accessing any memory area that I don’t own nor reading any uninitialized memory (afaik).

The first critical rule was that I’m not allowed to add to or subtract from a pointer in a way that makes it leave its array bounds. But I’m allowed to cast a pointer to an integer, and with that integer I can calculate as I want, can’t I?

My second assumption was that, since I’m allowed to assign a valid address to a pointer, assigning this calculated address to a pointer is a valid operation. Since I’m working with a char pointer, there is also no violation of the strict aliasing rules, as a char* is allowed to alias anything.

So which rule is broken, such that this causes UB?

Are single variables also to be understood as “arrays”, so that I’m breaking this rule?

— Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that does not point into, or just beyond, the same array object (6.5.6).

If so, am I also allowed to do this?

int var;
int *ptr;
ptr = &var;
ptr = ptr + 1;

Because the result is almost certainly undefined behavior. Compiled with MSVC2010 it prints the expected “U”, but on FreeBSD with clang and gcc I get, depending on the optimization level, pretty funny and different results each time (which in my eyes shouldn’t happen as long as the behavior is defined).

So, any ideas what is causing these nasal dragons?

  1. You are basically running into 6.3.2.3 (Pointers), paragraph 5, in the conversion from int to char* in the assignment to TheAccesingPointer:

    An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

    The use of all the abs calls makes what happens very dependent on the actual implementation. Basically it will only work if iTargetOfPointeracces has a higher address than StartVar. If you remove all occurrences of abs, I think you will get 'U' on most if not all architectures and with most if not all compilers.

    Ironically, this is not undefined behavior but implementation-defined behavior. But when you don’t get 'U', TheAccesingPointer is not pointing to an entity of the referenced type; most likely it is not pointing to an entity at all.

    If it is not pointing to an entity then (of course) you will run into undefined behavior when dereferencing it in the printf, following 6.5.3.2, paragraph 4:

    The unary * operator denotes indirection. If the operand points to a function, the result is a function designator; if it points to an object, the result is an lvalue designating the object. If the operand has type ‘‘pointer to type’’, the result has type ‘‘type’’. If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.

    Let’s work through two scenarios in which all addresses on the stack have bit 31 set, which is quite common under Linux.

    Scenario A: Suppose &StartVar < &iTargetOfPointeracces. Because bit 31 is set, the else branch is taken, so

      abs(LegalPointerCast2) - abs(LegalPointerCast1)
    = LegalPointerCast1 - LegalPointerCast2 (by both < 0)
    = (char*)(&iTargetOfPointeracces) - (char*)(&StartVar)
    > 0 (by &StartVar < &iTargetOfPointeracces)
    So uiDiffOfVars = (char*)(&iTargetOfPointeracces) - (char*)(&StartVar)
    and signedIntToRespectTheRules = uiDiffOfVars (by (int)uiDiffOfVars > 0)
    thus TheAccesingPointer
    = (char *)(&StartVar + (char*)(&iTargetOfPointeracces) - (char*)(&StartVar))
    = (char*)(&iTargetOfPointeracces)

    So in this scenario you will get 'U'.

    Scenario B: Suppose &StartVar > &iTargetOfPointeracces. The else branch is taken again, so

      abs(LegalPointerCast2) - abs(LegalPointerCast1)
    = LegalPointerCast1 - LegalPointerCast2 (by both < 0)
    = (char*)(&iTargetOfPointeracces) - (char*)(&StartVar)
    < 0 (by &StartVar > &iTargetOfPointeracces)
    So uiDiffOfVars = (char*)(&iTargetOfPointeracces) - (char*)(&StartVar), wrapped around as an unsigned value
    and signedIntToRespectTheRules = (char*)(&StartVar) - (char*)(&iTargetOfPointeracces) (by (int)uiDiffOfVars < 0, so abs flips the sign)
    thus TheAccesingPointer
    = (char *)(&StartVar + (char*)(&StartVar) - (char*)(&iTargetOfPointeracces))
    = (char *)(2*(char*)&StartVar - (char*)(&iTargetOfPointeracces))

    In this scenario it is very unlikely that TheAccesingPointer points to any entity, so dereferencing it triggers undefined behavior. So my point is that the calculation of TheAccesingPointer is implementation-defined, and the calculations above are what very commonly happens. If the computed pointer does not point to iTargetOfPointeracces, as in scenario B, undefined behavior is triggered.

    Different optimization levels may result in a different order of StartVar and iTargetOfPointeracces on the stack, and that may explain the different results for different optimization levels.

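    A single variable is not an array, but for pointer arithmetic C99 6.5.6, paragraph 7, treats a pointer to it like a pointer to the first element of an array of length one. So ptr = ptr + 1 is allowed as long as the result is not dereferenced.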

Originally posted 2013-11-10 00:10:39.