RednaxelaFX

The static type hole introduced by array covariance

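In the previous post, "An example that fails Java bytecode verification", we saw that the JVM verifies the .class files it loads in order to guarantee type safety. But there is one kind of error in Java that neither the compiler nor the JVM's bytecode verifier can detect, and that only shows up when the code actually runs: the hole in the static type system caused by array covariance.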

As in the previous post, let's use ASM to generate the bytecode:
import java.io.FileOutputStream;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class TestASM implements Opcodes {
    public static void main(String[] args) throws Exception {
        ClassWriter cw = new ClassWriter(0);
        cw.visit(
            V1_5,               // class format version
            ACC_PUBLIC,         // class modifiers
            "TestVerification", // class name (internal form)
            null,               // generic signature
            "java/lang/Object", // super class name (internal form)
            new String[] { }    // implemented interfaces
        );
        
        MethodVisitor mv = cw.visitMethod(
            ACC_PUBLIC + ACC_STATIC,  // access modifiers
            "main",                   // method name
            "([Ljava/lang/String;)V", // method descriptor
            null,                     // generic signature
            null                      // exceptions
        );
        mv.visitCode();
        mv.visitInsn(ICONST_1);
        mv.visitTypeInsn(ANEWARRAY, "java/lang/Float");
        mv.visitTypeInsn(CHECKCAST, "[Ljava/lang/Object;");
        mv.visitVarInsn(ASTORE, 0);
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitLdcInsn("a string");
        mv.visitInsn(AASTORE);
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitInsn(AALOAD);
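        // Note: the real descriptor of Object.toString is "()Ljava/lang/String;" (and a POP would follow).
        // The "()V" used here still passes verification and is never resolved, because AASTORE throws first.
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/Object", "toString", "()V");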
        mv.visitInsn(RETURN);
        mv.visitMaxs(3, 1);
        mv.visitEnd(); // end method
        cw.visitEnd(); // end class
        
        byte[] clz = cw.toByteArray();
        // try-with-resources closes the stream even if write() throws
        try (FileOutputStream out = new FileOutputStream("TestVerification.class")) {
            out.write(clz);
        }
    }
}


Disassembling the generated class gives:
public class TestVerification extends java.lang.Object
  minor version: 0
  major version: 49
  Constant pool:
const #1 = Asciz        TestVerification;
const #2 = class        #1;     //  TestVerification
const #3 = Asciz        java/lang/Object;
const #4 = class        #3;     //  java/lang/Object
const #5 = Asciz        main;
const #6 = Asciz        ([Ljava/lang/String;)V;
const #7 = Asciz        java/lang/Float;
const #8 = class        #7;     //  java/lang/Float
const #9 = Asciz        [Ljava/lang/Object;;
const #10 = class       #9;     //  "[Ljava/lang/Object;"
const #11 = Asciz       a string;
const #12 = String      #11;    //  a string
const #13 = Asciz       toString;
const #14 = Asciz       ()V;
const #15 = NameAndType #13:#14;//  toString:()V
const #16 = Method      #4.#15; //  java/lang/Object.toString:()V
const #17 = Asciz       Code;

{
public static void main(java.lang.String[]);
  Code:
   Stack=3, Locals=1, Args_size=1
   0:   iconst_1
   1:   anewarray       #8; //class java/lang/Float
   4:   checkcast       #10; //class "[Ljava/lang/Object;"
   7:   astore_0
   8:   aload_0
   9:   iconst_0
   10:  ldc     #12; //String a string
   12:  aastore
   13:  aload_0
   14:  iconst_0
   15:  aaload
   16:  invokevirtual   #16; //Method java/lang/Object.toString:()V
   19:  return

}


This time the code can actually be expressed directly in Java source as well, namely:
public class TestVerification {
    public static void main(String[] args) {
        Object[] array = (Object[]) new Float[1];
        array[0] = "a string"; // the problem is here
        array[0].toString();
    }
}

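It compiles without any problem. The code fully conforms to the Java language specification and also satisfies the type requirements of the JVM's static verification, so load-time verification passes as well.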

But when we run it...
Exception in thread "main" java.lang.ArrayStoreException: java.lang.String
        at TestVerification.main(Unknown Source)

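Obviously we cannot store a String object into a Float[]. But because Java arrays are covariant, Java's static type system lets us write this, and the mistake only surfaces as an exception at run time.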
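To make the behavior easier to poke at, here is a minimal sketch in plain Java. The class name `CovariantStoreDemo` and the helper `tryStore` are my own inventions for illustration; they are not part of the generated class above. The sketch stores two different values through an `Object[]` reference that really points at a `Float[]`, and reports what happens.

```java
class CovariantStoreDemo {
    /** Hypothetical helper: tries array[0] = value and reports the outcome as text. */
    static String tryStore(Object[] array, Object value) {
        try {
            array[0] = value; // compiles to aastore, which is checked at run time
            return "ok";
        } catch (ArrayStoreException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Object[] floats = new Float[1]; // allowed: Float[] is a subtype of Object[]

        System.out.println(tryStore(floats, 1.5f));       // prints "ok"
        System.out.println(tryStore(floats, "a string")); // reports an ArrayStoreException

        // The array remembers its real element type, whatever the static type says.
        System.out.println(floats.getClass().getComponentType()); // prints "class java.lang.Float"
    }
}
```

Both calls to `tryStore` look identical to the compiler; only the run-time type of the array tells them apart.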

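.NET unfortunately imitated this feature of Java and also made arrays covariant. As a result the CLI, just like the JVM (JVM: aastore; CLI: stelem), must perform a dynamic type check on array stores at run time. That is naturally not great for performance, and it also makes the VM implementation more complex... sigh.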
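What that run-time check amounts to can be approximated in plain Java. The following is only a sketch under my own assumptions: `checkedStore` is a hypothetical helper, and a real VM does this inside its implementation of aastore rather than through reflection.

```java
class StoreCheckSketch {
    /** Hypothetical helper that spells out the check a reference-array store has to make. */
    static void checkedStore(Object[] array, int index, Object value) {
        // The run-time element type, not the static type of the reference.
        Class<?> componentType = array.getClass().getComponentType();

        // null fits into any reference array; anything else must be an
        // instance of the array's actual element type.
        if (value != null && !componentType.isInstance(value)) {
            throw new ArrayStoreException(value.getClass().getName());
        }
        array[index] = value; // the VM performs its own check again here
    }

    public static void main(String[] args) {
        Object[] floats = new Float[1];
        checkedStore(floats, 0, 2.0f);       // fine
        checkedStore(floats, 0, null);       // fine
        checkedStore(floats, 0, "a string"); // throws ArrayStoreException
    }
}
```

The point of the sketch is that the check needs the type of the value and the type of the array at the moment of the store, which is exactly the information a static verifier does not have.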

The second-to-last paragraph on page 289 of the reprint edition of "Virtual Machines: Versatile Platforms for Systems and Processes" says:
Quote
Hence, if an object is accessed, the field information for the access can also be checked statically (there is an exception for arrays, given in the next paragraph).

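In the paragraph that follows, however, the book only mentions the dynamic bounds check on array accesses and says nothing about the static type hole introduced by covariance. I think the covariance problem deserved a mention here. After all, array length is not part of Java's static types, so checking it can only be left to run time (although a VM can eliminate many array bounds checks and null checks through data-flow analysis). Type covariance, on the other hand, is part of the static type system, yet it has a hole, so a run-time check is still required, and that is the annoying part.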
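The difference between the two kinds of check can be seen side by side in a small sketch. The names `LengthVsTypeDemo` and `describe` are made up for this illustration.

```java
class LengthVsTypeDemo {
    /** Hypothetical helper: runs an action and reports "ok" or the exception's simple name. */
    static String describe(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Object[] array = new Float[1];

        // Length is not part of the type: arrays of different lengths share one class.
        System.out.println(new Float[1].getClass() == new Float[2].getClass()); // prints "true"

        // Bounds check: the static types never claimed anything about the index.
        System.out.println(describe(() -> { array[1] = 1.0f; })); // prints "ArrayIndexOutOfBoundsException"

        // Store check: the static types did accept this, and were wrong.
        System.out.println(describe(() -> { array[0] = "a string"; })); // prints "ArrayStoreException"
    }
}
```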

Here is what Martin Odersky had to say about Java's array covariance in a recent interview:
Martin Odersky wrote
Bill Venners: You said you found it frustrating at times to have the constraints of needing to be backwards compatible with Java. Can you give some specific examples of things you couldn't do when you were trying to live within those constraints, which you were then able to do when you changed to doing something that's binary but not source compatible?

Martin Odersky: In the generics design, there were a lot of very, very hard constraints. The strongest constraint, the most difficult to cope with, was that it had to be fully backwards compatible with ungenerified Java. The story was the collections library had just shipped with 1.2, and Sun was not prepared to ship a completely new collections library just because generics came about. So instead it had to just work completely transparently.

That's why there were a number of fairly ugly things. You always had to have ungenerified types with generified types, the so called raw types. Also you couldn't change what arrays were doing so you had unchecked warnings. Most importantly you couldn't do a lot of the things you wanted to do with arrays, like generate an array with a type parameter T, an array of something where you didn't know the type. You couldn't do that. Later in Scala we actually found out how to do that, but that was possible only because we could drop in Scala the requirement that arrays are covariant.

Bill Venners: Can you elaborate on the problem with Java's covariant arrays?

Martin Odersky: When Java first shipped, Bill Joy and James Gosling and the other members of the Java team thought that Java should have generics, only they didn't have the time to do a good job designing it in. So because there would be no generics in Java, at least initially, they felt that arrays had to be covariant. That means an array of String is a subtype of array of Object, for example. The reason for that was they wanted to be able to write, say, a “generic” sort method that took an array of Object and a comparator and that would sort this array of Object. And then let you pass an array of String to it. It turns out that this thing is type unsound in general. That's why you can get an array store exception in Java. And it actually also turns out that this very same thing blocks a decent implementation of generics for arrays. That's why arrays in Java generics don't work at all. You can't have an array of list of string, it's impossible. You're forced to do the ugly raw type, just an array of list, forever. So it was sort of like an original sin. They did something very quickly and thought it was a quick hack. But it actually ruined every design decision later on. So in order not to fall into the same trap again, we had to break off and say, now we will not be upwards compatible with Java, there are some things we want to do differently.
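
The two points in the quote, the pre-generics "generic" sort and the friction between arrays and generics, can be illustrated with a short sketch. This is my own example rather than anything from the interview; the class name `CovarianceVsGenericsDemo` and its `sort` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CovarianceVsGenericsDemo {
    /** A pre-generics style "generic" sort: it accepts any Object[] plus a comparator. */
    static void sort(Object[] items, Comparator<Object> order) {
        Arrays.sort(items, order);
    }

    public static void main(String[] args) {
        String[] names = { "pizza", "gj", "scala" };

        // Covariance is what lets a String[] be passed where Object[] is expected.
        sort(names, (a, b) -> a.toString().compareTo(b.toString()));
        System.out.println(Arrays.toString(names)); // prints "[gj, pizza, scala]"

        List<String> strings = new ArrayList<>();
        // Generic types are invariant, so the analogous assignment is rejected statically:
        // List<Object> objects = strings;             // does not compile
        // And arrays of a parameterized type cannot be created directly:
        // List<String>[] lists = new List<String>[1]; // does not compile

        // An array of an unbounded wildcard type is allowed.
        List<?>[] lists = new List<?>[1];
        lists[0] = strings;
        System.out.println(lists[0].isEmpty()); // prints "true"
    }
}
```

The commented-out lines are the interesting ones: with generics the compiler refuses up front what arrays only refuse at run time.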


P.S. Readers who don't know what covariance is can read the entry on Wikipedia.

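P.P.S. For readers who don't know Martin Odersky, take note: as soon as you use Java 5 generics, your code carries his fingerprints. He designed the Pizza language and later took part in the design of GJ (Generic Java), which became the foundation of generics in Java 5. Martin also designed Scala (surely far more people know Scala than know Pizza...).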
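Comments
#3 Saito 2009-05-05  
RednaxelaFX wrote

Saito wrote: Could you come over and take a look at something.. and help clear up a question
http://www.iteye.com/topic/378747
OK, replied. Honestly, everyone gets muddled fairly often when observing behavior. When I wrote the previous post just now I got muddled and forgot to write the return; it didn't affect the conclusion, but it still wasn't great. Carefulness is a really hard habit to build... at least for me XD

   
   Heh. Thanks again. 
#2 RednaxelaFX 2009-05-05  
Saito wrote
Could you come over and take a look at something.. and help clear up a question
http://www.iteye.com/topic/378747

OK, replied. Honestly, everyone gets muddled fairly often when observing behavior. When I wrote the previous post just now I got muddled and forgot to write the return; it didn't affect the conclusion, but it still wasn't great. Carefulness is a really hard habit to build... at least for me XD
#1 Saito 2009-05-05  
Could you come over and take a look at something.. and help clear up a question.  


          http://www.iteye.com/topic/378747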