Testing initialization safety of final fields-Collection of common programming errors

12 years ago

admin

A better understanding of why this test does not fail comes from looking at what actually happens when a constructor is invoked. The JVM is a stack-based machine, and TestClass.f = new TestClass(); consists of four actions. First, the new instruction is executed; like malloc in C/C++, it allocates memory and places a reference to it on top of the stack. Then the reference is duplicated for the constructor call. The constructor is in fact like any other instance method, and it is invoked with the duplicated reference. Only after that is the reference stored in the method frame or in a field, where it becomes accessible from anywhere else. Before that last step, the reference exists only on top of the creating thread’s stack, and nobody else can see it.

So it makes no difference what kind of field you are working with: both will be initialized if TestClass.f != null. You may read the x and y fields from different objects, but this will not result in y = 0. For more information, see the JVM Specification and the Stack-oriented programming language article.
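The four steps can be made concrete with the bytecode that javac typically emits for such an assignment. The following is only a sketch: the original TestClass is not shown here, so the fields x and y, their values, and the helper publishAndSum are assumptions for illustration, and the bytecode in the comments is the usual shape printed by javap -c with constant-pool indices left out.

```java
// Hypothetical stand-in for the TestClass discussed above;
// the fields x and y and their values are assumptions for illustration.
class TestClass {
    static TestClass f;

    final int x;
    int y;

    TestClass() {
        x = 3;
        y = 4;
    }

    // Hypothetical helper: publishes a new instance and reads both fields back.
    static int publishAndSum() {
        // Typical bytecode for the next statement (as shown by `javap -c`):
        //   new           TestClass          // 1. allocate, push the reference
        //   dup                              // 2. duplicate the reference
        //   invokespecial TestClass.<init>   // 3. run the constructor on one copy
        //   putstatic     TestClass.f        // 4. store the other copy in the field
        f = new TestClass();
        return f.x + f.y;
    }

    public static void main(String[] args) {
        System.out.println(publishAndSum()); // prints 7
    }
}
```

Running javap -c TestClass on the compiled class should show the same new, dup, invokespecial, putstatic sequence inside publishAndSum.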

UPD: One important thing I forgot to mention. Under the Java memory model there is no way to see a partially initialized object, provided, of course, that you do not publish this from inside the constructor.

JLS:

An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object’s final fields.
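The self-publication caveat is easy to reproduce even in a single thread. In the sketch below, the class LeakyPoint and all of its members are made-up names for illustration: the constructor stores this in a static field before assigning its final field, so a read through the leaked reference sees the default value.

```java
// Hypothetical example: the class and member names are made up for illustration.
class LeakyPoint {
    static LeakyPoint leaked;            // reference published from inside the constructor
    static int seenDuringConstruction;   // value of x observed before the constructor finished

    final int x;

    LeakyPoint() {
        leaked = this;                     // self-publication: `this` escapes early
        seenDuringConstruction = peek();   // reads x through the leaked reference
        x = 42;                            // the final field is assigned only now
    }

    static int peek() {
        return leaked.x;
    }

    public static void main(String[] args) {
        LeakyPoint p = new LeakyPoint();
        System.out.println(seenDuringConstruction); // prints 0
        System.out.println(p.x);                    // prints 42
    }
}
```

Once the reference escapes like this, the object is visible before it is "completely initialized" in the sense of the quote, so the final-field guarantee no longer covers readers that go through the leaked reference.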

JLS:

There is a happens-before edge from the end of a constructor of an object to the start of a finalizer for that object.

Broader explanation of this point of view:

It turns out that the end of an object’s constructor happens-before the execution of its finalize method. In practice, what this means is that any writes that occur in the constructor must be finished and visible to any reads of the same variable in the finalizer, just as if those variables were volatile.

UPD: That was the theory, let’s turn to practice.

Consider the following code, with simple non-final variables:

public class Test {

    int myVariable1;
    int myVariable2;

    Test() {
        myVariable1 = 32;
        myVariable2 = 64;
    }

    public static void main(String args[]) throws Exception {
        Test t = new Test();
        System.out.println(t.myVariable1 + t.myVariable2);
    }
}

The following command displays the machine instructions generated by the JVM; how to set it up is described in a wiki:

java.exe -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -Xcomp -XX:PrintAssemblyOptions=hsdis-print-bytes -XX:CompileCommand=print,*Test.main Test

Its output:

...
0x0263885d: movl   $0x20,0x8(%eax)    ;...c7400820 000000
                                    ;*putfield myVariable1
                                    ; - Test::<init>@7 (line 12)
                                    ; - Test::main@4 (line 17)
0x02638864: movl   $0x40,0xc(%eax)    ;...c7400c40 000000
                                    ;*putfield myVariable2
                                    ; - Test::<init>@13 (line 13)
                                    ; - Test::main@4 (line 17)
0x0263886b: nopl   0x0(%eax,%eax,1)   ;...0f1f4400 00
...

The field assignments are followed by a NOPL instruction; one of its purposes is to prevent instruction reordering.

Why does this happen? According to the specification, finalization happens after the constructor returns, so the GC thread can’t see a partially initialized object. At the CPU level, the GC thread is no different from any other thread. If such guarantees are provided to the GC, then they are provided to any other thread. This is the most obvious way to satisfy such a restriction.

Results:

1) The constructor is not synchronized; synchronization is done by other instructions.

2) The assignment to the object’s reference can’t happen before the constructor returns.
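
These results can be checked with a small stress test. The sketch below is not the test from the question: the class name PublicationStress, the field values, and the iteration counts are all assumptions. A writer thread keeps publishing new instances through a plain (non-volatile) static field while a reader thread counts how often it sees a field still at its default value.

```java
// Hypothetical stress test; the class name, field values and iteration
// counts are assumptions chosen for illustration.
class PublicationStress {
    static PublicationStress f;   // deliberately not volatile: published through a data race

    final int x;
    int y;

    PublicationStress() {
        x = 3;
        y = 4;
    }

    // Returns {times final x was seen as 0, times non-final y was seen as 0}.
    static int[] run(int iterations) {
        f = null;
        int[] zeros = new int[2];   // written only by the reader, read after join()
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                f = new PublicationStress();
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                PublicationStress local = f;   // one racy read of the reference
                if (local != null) {
                    if (local.x == 0) zeros[0]++;
                    if (local.y == 0) zeros[1]++;
                }
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the threads", e);
        }
        return zeros;
    }

    public static void main(String[] args) {
        int[] zeros = run(1_000_000);
        System.out.println("final x seen as 0: " + zeros[0]);
        System.out.println("non-final y seen as 0: " + zeros[1]);
    }
}
```

The JLS quote above guarantees that the first counter stays at zero. The argument in this answer predicts the same for the second counter; the sketch only lets you check that on your own JVM and hardware.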