If you examine the
// The next five lines appear in five
// separate appropriately named files.
interface I1 { }
interface I2 { }
interface I3 { }
class I1I2 implements I1, I2 { } // sometimes, it doesn't implement I2
class I1I3I2 implements I1, I3, I2 { }
class tricky {
I2 i2;
static I2 si2;
boolean isI2(Object x) {
return (x == null) || (x instanceof I2);
}
void expectsI2(I2 x) {
String result = isI2(x) ? "pass" : "fail";
System.err.println(result);
}
I2 cloaksI2(boolean b, I1I2 x, I1I3I2 y) {
I2 i2 = b ? (I2) x : (I2) y;
this.i2 = i2;
si2 = i2;
expectsI2(i2);
return i2;
}
public static void main(String args[]) {
tricky t = new tricky();
t.expectsI2(t.cloaksI2(true, new I1I2(), new I1I3I2()));
t.expectsI2(t.i2);
t.expectsI2(si2);
}
}
expectsI2
method and isI2
method,
it would seem that this code could never, ever print "fail";
either it passes, or dies with some sort of an exception.
In particular, for it to print "fail", the static type
correctness of parameter x
to expectsI2
cannot hold.
As it happens, Sun's javac and java fail to detect
an incompatible class change, and "fail" can be obtained
without any exceptions being thrown, and the cause
appears to be incorrect (with respect to both old and
new VM specs) bytecodes.
The relevant (allegeded to be defective) bytecodes are:
According to the new spec, these should not pass verification.
Method I2 cloaksI2(boolean, I1I2, I1I3I2)
0 iload_1
1 ifeq 8
4 aload_2
5 goto 9
8 aload_3
9 astore 4
11 aload_0
12 aload 4
14 putfield #18
The merge at 9, storing into local 4 should yield Object
(or perhaps I1
) according to 4.9.2, fourth paragraph
before end:
To merge two operand stacks, the number of values on each stack must be identical. The types of values on the stacks must also be identical, except that differently typed reference values may appear at corresponding places on the two stacks. In this case, the merged operand stack contains a reference to an instance of the first common superclass of the two types. Such a reference type always exists because the type Object is a superclass of all class and interface types. If the operand stacks cannot be merged, verification of the method fails.
Object
does not implement I2
, and the instructions at 14,
19, 25 and 30 require I2
(at least, depending upon the
definition of "consistent"). See any of the invoke
instructions, for instance invokeinterface
, second
paragraph of the description. For a less ambiguous
requirement of I2
, see the description of areturn
, first
sentence of description, or the description of putfield
,
or the description of putstatic
. Astoundingly, putfield
and putstatic
can store type-incorrect data into the heap.
According to the description of the data flow analyzer in section 4.9.2, because these instructions do not see appropriately typed operands, verification must fail.
2. Model the effect of the instruction on the operand stack and local variable array by doing the following:However, using jdk1.2rc2, jdk1.2, and jdk1.2.1 "java -verify", it does not.If the instruction uses values from the operand stack, ensure that there are a sufficient number of values on the stack and that the top values on the stack are of an appropriate type. Otherwise, verification fails.
If the instruction uses a local variable, ensure that the specified local variable contains a value of the appropriate type. Otherwise, verification fails.
Now, it happens that as long as "correctly" (i.e., correct, ignoring this defect) compiled bytecodes from consistent Java source files are processed, everything turns out ok. Even though the bytecode-inferred type (using the type system and merge operations in the VM spec) is wrong, the actual type is ok. However, this lack of verification means that it is possible to construct an inconsistent set of classes whose inconsistency remains undetected, with other effects that are somewhat surprising.
If I1I2
is modified to not implement I2
and recompiled in
isolation, then the program can be verified and rerun
yielding "fail, fail, fail, fail" output, which
demonstrates that the base case of type analysis from
Section 4.9.2
Next, a data-flow analyzer is initialized. For the first instruction of the method, the local variables that represent parameters initially contain values of the types indicated by the method's type descriptor; the operand stack is empty.can be invalidated w/o any errors being thrown. This should raise a red flag with students of proof by induction.
This also means that, running Java bytecodes under Sun's
VM, a static type of "interface anything" is a mere
decoration; a return value or parameter value or field
with bytecode-static type I2
need not actually
implement I2
.
There are several possible "fixes" to this problem, including at least these:
Object
(or
so it seems from the outside) and the only time
the interface type is actually checked is before
an invokeinterface
. This has the unfortunate
side effect of not correctly implementing Java,
See the JLS, page 29, paragraph 4, sentence 4 --
with this "fix", it is possible for an interface-typed
variable from Java source to not contain
a value of that type. In addition, this probably
hinders Java performance in any application whose
performance is not determined by compilation time.
It's also necessary to re-prove type correctness
and security of the VM, since the base case of the
old proof is now false.
Object[][]
results in three
triples in a list. The
merge operation is accomplished simply by
walking the two lists of triple to be merged,
perform the appropriate class and set-of-interface
merges at each item on the list, and truncating
the result list to the shortest of the two inputs.
This argues that changing the verifier to use a tighter static analysis (fix 2b) or changing the interpreter to check all uses (fix 2a, slightly less strict) will not break compatibility with any already-compiled programs. Note, too, that all of the test cases we've constructed to demonstrate this problem contain actual incompatible class change errors (JLS 13.4.4, 13.5.2), and require the use of a value from the incompatibly changed type in a context where the actual type is incompatible with the signature type of a method or field.
In addition, because the peculiar merge is so rare, it should be perfectly acceptable to "optimize" interface operations with a simple extension to the verifier in a system implementing 2a. Very often, when two types are merged, one is a supertype of the other. In that case, there is no need to consider the set of interfaces, because it is automatically implied by the class portion of the resulting merge. Similarly, when two interface types are merged, if one is derived from the other, the result is again consistent, and the resulting single interface type can be treated as the "class". A single bit can record this consistent state, and any subsequent interface-requiring uses (storing fields, returning values, passing parameters, invoking a method) where the interface set is consistent with the class type can be statically checked against the interfaces implemented by the class.
When the two types merged are not consistent, subsequent uses will check dynamically. This can be accomplished with the same sort of internal bytecode rewriting described in Chapter 9 of the first edition of the VM specification.