The Structure of the Java Virtual Machine: Run-Time Data Areas

Official documentation: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html

I. The Big Picture

The Java Virtual Machine Specification describes the structure of the virtual machine under the following headings:

  1. The class File Format
  2. Data Types
  3. Primitive Types and Values
  4. Reference Types and Values
  5. Run-Time Data Areas
  6. Frames
  7. Representation of Objects
  8. Floating-Point Arithmetic
  9. Special Methods
  10. Exceptions
  11. Instruction Set Summary
  12. Class Libraries
  13. Public Design, Private Implementation

An implementation that follows these rules can correctly execute programs written in any language that compiles to JVM class files. These notes dig into only the most important parts of the specification, with the aim of understanding how the Java Virtual Machine works at a lower level.

II. Run-Time Data Areas

The specification describes the run-time data areas of the Java Virtual Machine as follows:

The Java Virtual Machine defines various run-time data areas that are used during execution of a program. Some of these data areas are created on Java Virtual Machine start-up and are destroyed only when the Java Virtual Machine exits. Other data areas are per thread. Per-thread data areas are created when a thread is created and destroyed when the thread exits.

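In other words:

The Java Virtual Machine defines several data areas that it uses while a program runs. Some are created when the virtual machine starts and are destroyed only when it exits. The others belong to a single thread: they are created with the thread and destroyed when the thread exits.

So the run-time data areas used during program execution fall into two groups:

  1. Shared run-time data areas: created when the virtual machine starts, destroyed when it exits
  2. Per-thread run-time data areas: created when the thread is created, destroyed when the thread exits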

Before going deeper, look at the table of contents that the official documentation gives for the run-time data areas:

[Image image-20210220154923409: the specification's table of contents for §2.5, Run-Time Data Areas]

So we now know roughly what the run-time data areas consist of:

1. The pc Register (program counter register)

From the specification:

The Java Virtual Machine can support many threads of execution at once (JLS §17). Each Java Virtual Machine thread has its own pc (program counter) register. At any point, each Java Virtual Machine thread is executing the code of a single method, namely the current method (§2.6) for that thread. If that method is not native, the pc register contains the address of the Java Virtual Machine instruction currently being executed. If the method currently being executed by the thread is native, the value of the Java Virtual Machine’s pc register is undefined. The Java Virtual Machine’s pc register is wide enough to hold a returnAddress or a native pointer on the specific platform.

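In other words:

The Java Virtual Machine can run many threads at once, and each thread has its own pc (program counter) register. At any moment a thread is executing the code of exactly one method, its current method. If that method is not native, the pc register holds the address of the Java Virtual Machine instruction being executed; if it is native, the value of the pc register is undefined. The register is wide enough to hold a returnAddress or a native pointer on the platform in question.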

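Summary: the pc register is one of the per-thread run-time data areas, so its lifetime is tied to its thread. Each thread's pc register records the address of the Java Virtual Machine instruction currently being executed; if the thread is executing a native method, the value is undefined.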
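The role of a program counter is easier to see in a toy interpreter loop. The sketch below is not how any real JVM is written: `ToyPcDemo` and its three opcodes are invented for illustration. It only shows the idea that `pc` always holds the position of the instruction being executed, and that each running interpreter (here, each call on each thread) has its own copy.

```java
// Toy stack machine, for illustration only. The opcodes are hypothetical
// and unrelated to real JVM bytecodes.
final class ToyPcDemo {
    static final int PUSH = 0; // PUSH <value>: push the next code word
    static final int ADD = 1;  // ADD: pop two values, push their sum
    static final int HALT = 2; // HALT: stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0; // next free slot on the operand stack
        int pc = 0; // index of the instruction currently being executed
        while (true) {
            int op = code[pc];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc + 1];
                    pc += 2; // skip the opcode and its operand
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    pc += 1;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalStateException("unknown opcode " + op + " at pc=" + pc);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {PUSH, 2, PUSH, 3, ADD, HALT};
        System.out.println(run(program));
    }
}
```

Because `pc` is a local variable of `run`, two threads calling `run` at the same time each advance their own counter, which mirrors the per-thread nature of the real pc register.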

2. Java Virtual Machine Stacks

From the specification:

Each Java Virtual Machine thread has a private Java Virtual Machine stack, created at the same time as the thread. A Java Virtual Machine stack stores frames (§2.6). A Java Virtual Machine stack is analogous to the stack of a conventional language such as C: it holds local variables and partial results, and plays a part in method invocation and return. Because the Java Virtual Machine stack is never manipulated directly except to push and pop frames, frames may be heap allocated. The memory for a Java Virtual Machine stack does not need to be contiguous.

In the First Edition of The Java® Virtual Machine Specification, the Java Virtual Machine stack was known as the Java stack.

This specification permits Java Virtual Machine stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. If the Java Virtual Machine stacks are of a fixed size, the size of each Java Virtual Machine stack may be chosen independently when that stack is created.

A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of Java Virtual Machine stacks, as well as, in the case of dynamically expanding or contracting Java Virtual Machine stacks, control over the maximum and minimum sizes.

The following exceptional conditions are associated with Java Virtual Machine stacks:

  • If the computation in a thread requires a larger Java Virtual Machine stack than is permitted, the Java Virtual Machine throws a StackOverflowError.
  • If Java Virtual Machine stacks can be dynamically expanded, and expansion is attempted but insufficient memory can be made available to effect the expansion, or if insufficient memory can be made available to create the initial Java Virtual Machine stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.

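In other words:

Each Java Virtual Machine thread has a private Java Virtual Machine stack, created together with the thread, and that stack stores frames. It resembles the stack of a conventional language such as C: it holds local variables and partial results, and it takes part in method invocation and return. Because the stack is never manipulated directly except to push and pop frames, frames may be heap allocated, and the stack's memory does not need to be contiguous.

In the first edition of the specification, the Java Virtual Machine stack was called the Java stack.

The specification allows Java Virtual Machine stacks either to have a fixed size or to expand and contract dynamically as the computation requires. With fixed-size stacks, the size of each stack may be chosen independently when it is created.

An implementation may let the programmer or the user control the initial stack size and, for dynamically sized stacks, the maximum and minimum sizes.

Two exceptional conditions are associated with Java Virtual Machine stacks:

  1. If the computation in a thread needs a larger Java Virtual Machine stack than is permitted, the Java Virtual Machine throws a StackOverflowError.

  2. If stacks can be expanded dynamically and an expansion is attempted but not enough memory is available, or if there is not enough memory to create the initial stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.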

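Summary: the Java Virtual Machine stack is also one of the per-thread run-time data areas, with a lifetime tied to its thread. In the first edition of the specification it was called the Java stack. It stores frames, which hold a thread's local variables and partial results, and it takes part in method invocation and return. Its memory does not need to be contiguous. It may have a fixed size or expand and contract dynamically.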
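The StackOverflowError condition can be observed with unbounded recursion. The sketch below is illustrative only: `StackDepthDemo` and its method names are made up for this example, and the depth reached depends on the JVM, its settings, and the size of each frame. The four-argument `Thread` constructor accepts a requested stack size, which its Javadoc describes as a suggestion that a virtual machine is free to ignore.

```java
// Illustrative probe: recurse until the JVM refuses to push another frame.
final class StackDepthDemo {
    private int depth;

    private void recurse() {
        depth++;
        recurse(); // every call pushes a new frame onto this thread's JVM stack
    }

    // Returns how many frames were pushed on the calling thread before overflow.
    static int measureDepth() {
        StackDepthDemo probe = new StackDepthDemo();
        try {
            probe.recurse();
        } catch (StackOverflowError expected) {
            // The stack has unwound back to this frame, so it is safe to continue.
        }
        return probe.depth;
    }

    // Runs the same probe on a new thread created with a requested stack size.
    static int measureDepthOnThread(long stackSizeBytes) {
        int[] result = new int[1];
        Thread t = new Thread(null, () -> result[0] = measureDepth(), "depth-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("main thread depth: " + measureDepth());
        System.out.println("512 KiB request:   " + measureDepthOnThread(512L * 1024));
        System.out.println("8 MiB request:     " + measureDepthOnThread(8L * 1024 * 1024));
    }
}
```

On implementations that honor the requested size, the larger request usually reaches a greater depth, but the exact numbers vary from run to run and should not be relied on.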

3. Heap

From the specification:

The Java Virtual Machine has a heap that is shared among all Java Virtual Machine threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated.

The heap is created on virtual machine start-up. Heap storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated. The Java Virtual Machine assumes no particular type of automatic storage management system, and the storage management technique may be chosen according to the implementor’s system requirements. The heap may be of a fixed size or may be expanded as required by the computation and may be contracted if a larger heap becomes unnecessary. The memory for the heap does not need to be contiguous.

A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the heap, as well as, if the heap can be dynamically expanded or contracted, control over the maximum and minimum heap size.

The following exceptional condition is associated with the heap:

  • If a computation requires more heap than can be made available by the automatic storage management system, the Java Virtual Machine throws an OutOfMemoryError.

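In other words:

The Java Virtual Machine has one heap, shared among all of its threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated.

The heap is created when the virtual machine starts. Heap storage for objects is reclaimed by an automatic storage management system (a garbage collector); objects are never explicitly deallocated. The specification assumes no particular kind of storage management, and the technique may be chosen to suit the implementor's system requirements. The heap may have a fixed size, may be expanded as the computation requires, and may be contracted when a larger heap is no longer needed. Its memory does not need to be contiguous.

An implementation may let the programmer or the user control the initial size of the heap and, if the heap can be expanded or contracted dynamically, its maximum and minimum sizes.

One exceptional condition is associated with the heap:

If a computation requires more heap than the automatic storage management system can make available, the Java Virtual Machine throws an OutOfMemoryError.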

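Summary: the heap is one of the shared run-time data areas. Its lifetime is tied to the virtual machine, and all threads share it. Its main job is to supply memory for all class instances and arrays, and it is the area that garbage collection reclaims. The heap may have a fixed size or expand and contract dynamically. The specification does not prescribe a garbage collector: the implementor chooses the storage management technique that suits the system's requirements.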
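Two properties of the heap can be checked from ordinary Java code: its current size figures, and the fact that an object allocated by one thread is the same object seen by another. The following is a minimal sketch using the standard `java.lang.management` API; the class name `HeapUsageDemo` and its methods are invented for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapUsageDemo {
    // Snapshot of the heap as reported by the running JVM (init, used, committed, max).
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // The array is allocated by the caller's thread and read by a worker thread:
    // both threads reach the same heap object through the shared reference.
    static long sumOnAnotherThread(int[] data) {
        long[] sum = new long[1];
        Thread worker = new Thread(() -> {
            for (int value : data) {
                sum[0] += value;
            }
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println("heap: " + heapUsage());
        System.out.println("sum computed by worker: " + sumOnAnotherThread(new int[]{1, 2, 3}));
    }
}
```

`MemoryUsage.getMax()` may return -1 when the maximum is undefined, so code that reads it should handle that case.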

4. Method Area

From the specification:

The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the “text” segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods (§2.9) used in class and instance initialization and interface initialization.

The method area is created on virtual machine start-up. Although the method area is logically part of the heap, simple implementations may choose not to either garbage collect or compact it. This specification does not mandate the location of the method area or the policies used to manage compiled code. The method area may be of a fixed size or may be expanded as required by the computation and may be contracted if a larger method area becomes unnecessary. The memory for the method area does not need to be contiguous.

A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the method area, as well as, in the case of a varying-size method area, control over the maximum and minimum method area size.

The following exceptional condition is associated with the method area:

  • If memory in the method area cannot be made available to satisfy an allocation request, the Java Virtual Machine throws an OutOfMemoryError.

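In other words:

The Java Virtual Machine has a method area that is shared among all of its threads. It is comparable to the storage area for compiled code in a conventional language, or to the "text" segment of an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and in interface initialization.

The method area is created when the virtual machine starts. Although it is logically part of the heap, a simple implementation may choose not to garbage collect or compact it. The specification does not mandate where the method area is located or how compiled code is managed. The method area may have a fixed size, may be expanded as the computation requires, and may be contracted when a larger method area is no longer needed. Its memory does not need to be contiguous.

An implementation may let the programmer or the user control the initial size of the method area and, if its size can vary, the maximum and minimum sizes.

One exceptional condition is associated with the method area:

If memory in the method area cannot be made available to satisfy an allocation request, the Java Virtual Machine throws an OutOfMemoryError.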

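Summary: the method area is one of the shared run-time data areas. Its lifetime is tied to the virtual machine, and all threads share it. The specification calls it logically part of the heap but does not mandate where it lives, so it is best understood as a logical concept rather than a fixed memory region. Because it mostly holds per-class data such as the run-time constant pool, field and method data, and method code, a simple implementation may choose not to garbage collect or compact it, which is presumably why the specification lists it as a separate area. Its memory does not need to be contiguous, and an implementation may let the user control its size.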

5. Run-Time Constant Pool

From the specification:

A run-time constant pool is a per-class or per-interface run-time representation of the constant_pool table in a class file (§4.4). It contains several kinds of constants, ranging from numeric literals known at compile-time to method and field references that must be resolved at run-time. The run-time constant pool serves a function similar to that of a symbol table for a conventional programming language, although it contains a wider range of data than a typical symbol table.

Each run-time constant pool is allocated from the Java Virtual Machine’s method area (§2.5.4). The run-time constant pool for a class or interface is constructed when the class or interface is created (§5.3) by the Java Virtual Machine.

The following exceptional condition is associated with the construction of the run-time constant pool for a class or interface:

  • When creating a class or interface, if the construction of the run-time constant pool requires more memory than can be made available in the method area of the Java Virtual Machine, the Java Virtual Machine throws an OutOfMemoryError.

See §5 (Loading, Linking, and Initializing) for information about the construction of the run-time constant pool.

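In other words:

A run-time constant pool is the per-class or per-interface run-time representation of the constant_pool table in a class file. It holds several kinds of constants, from numeric literals known at compile time to method and field references that must be resolved at run time. It plays a role similar to a symbol table in a conventional programming language, although it holds a wider range of data than a typical symbol table.

Each run-time constant pool is allocated from the method area. The run-time constant pool for a class or interface is constructed when the Java Virtual Machine creates that class or interface.

One exceptional condition is associated with constructing the run-time constant pool for a class or interface:

When a class or interface is created, if constructing its run-time constant pool requires more memory than the method area can make available, the Java Virtual Machine throws an OutOfMemoryError.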

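Summary: each run-time constant pool is allocated from the method area, and the method area is logically part of the heap, so the run-time constant pool is likewise a logical concept. It lives in a shared area that all threads can reach. Unlike the method area itself, though, it exists per class or interface: it is constructed when the Java Virtual Machine creates that class or interface, not at virtual machine start-up.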
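One visible effect of the constant pool is the behavior of string literals. A literal is stored as a constant in the class file, and the Java Language Specification guarantees that equal string literals refer to the same interned `String` instance. The sketch below demonstrates this with reference comparisons; the class name `ConstantPoolDemo` is invented for the example, and `==` on strings is used here deliberately to compare identity, not content.

```java
final class ConstantPoolDemo {
    // Two equal literals resolve to one interned String instance.
    static boolean literalsShareOneInstance() {
        String a = "jvm";
        String b = "jvm";
        return a == b;
    }

    // new String(...) always creates a separate object on the heap.
    static boolean newStringIsDistinct() {
        String literal = "jvm";
        String copy = new String("jvm");
        return copy != literal && copy.equals(literal);
    }

    // intern() returns the canonical instance, which is the one the literal refers to.
    static boolean internReturnsPooledInstance() {
        String literal = "jvm";
        String copy = new String("jvm");
        return copy.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneInstance());    // true
        System.out.println(newStringIsDistinct());         // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```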

6. Native Method Stacks

From the specification:

An implementation of the Java Virtual Machine may use conventional stacks, colloquially called “C stacks,” to support native methods (methods written in a language other than the Java programming language). Native method stacks may also be used by the implementation of an interpreter for the Java Virtual Machine’s instruction set in a language such as C. Java Virtual Machine implementations that cannot load native methods and that do not themselves rely on conventional stacks need not supply native method stacks. If supplied, native method stacks are typically allocated per thread when each thread is created.

This specification permits native method stacks either to be of a fixed size or to dynamically expand and contract as required by the computation. If the native method stacks are of a fixed size, the size of each native method stack may be chosen independently when that stack is created.

A Java Virtual Machine implementation may provide the programmer or the user control over the initial size of the native method stacks, as well as, in the case of varying-size native method stacks, control over the maximum and minimum method stack sizes.

The following exceptional conditions are associated with native method stacks:

  • If the computation in a thread requires a larger native method stack than is permitted, the Java Virtual Machine throws a StackOverflowError.
  • If native method stacks can be dynamically expanded and native method stack expansion is attempted but insufficient memory can be made available, or if insufficient memory can be made available to create the initial native method stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.

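In other words:

A Java Virtual Machine implementation may use conventional stacks, colloquially called "C stacks", to support native methods (methods written in a language other than the Java programming language). Native method stacks may also be used by an interpreter for the Java Virtual Machine's instruction set when that interpreter is itself written in a language such as C. An implementation that cannot load native methods and does not itself rely on conventional stacks need not supply native method stacks. If they are supplied, they are typically allocated per thread when each thread is created.

The specification allows native method stacks either to have a fixed size or to expand and contract dynamically as the computation requires. With fixed-size stacks, the size of each native method stack may be chosen independently when it is created.

An implementation may let the programmer or the user control the initial size of the native method stacks and, if their size can vary, the maximum and minimum sizes.

Two exceptional conditions are associated with native method stacks:

If the computation in a thread needs a larger native method stack than is permitted, the Java Virtual Machine throws a StackOverflowError.

If native method stacks can be expanded dynamically and an expansion is attempted but not enough memory is available, or if there is not enough memory to create the initial native method stack for a new thread, the Java Virtual Machine throws an OutOfMemoryError.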

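Summary: native method stacks exist to support native methods. Not every JVM implementation needs them: an implementation that cannot load native methods and does not itself rely on conventional stacks has no reason to provide them. When they are provided, they are typically allocated per thread when the thread is created, so each thread has its own, and they belong to the per-thread (non-shared) run-time data areas.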
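Writing a native method requires a compiled native library, which is out of scope here, but it is easy to check which existing methods are declared `native`. The sketch below uses reflection for that; `NativeMethodDemo` is a name invented for this example. It assumes an OpenJDK-derived class library, where `Object.hashCode()` is declared `native`; other class libraries may declare it differently.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class NativeMethodDemo {
    // Reports whether the named method is declared with the native modifier.
    static boolean isNative(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            Method method = owner.getDeclaredMethod(name, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + owner.getName() + "." + name, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: OpenJDK-derived JDK, where Object.hashCode() is native.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        System.out.println("String.length native?   " + isNative(String.class, "length"));
    }
}
```

Calls to methods reported as native are the ones that an implementation with separate native method stacks would run on those stacks.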


Based on the official specification, the logical relationship between the run-time data areas is as follows:

image-20210220155858053

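Note: in practice the HotSpot virtual machine does not distinguish between Java Virtual Machine stacks and native method stacks; it uses a single stack for both.

III. Run-Time Data Areas in Java 6/7/8 Compared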

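  1. In Java 6: run-time constant pool -(belongs to)-> method area -(belongs to)-> heap. In terms of the generational layout of the heap, the method area is implemented by the permanent generation. Direct memory is used by NIO. The Java heap holds object instances, while the method area holds class information, the run-time constant pool, static variables, and the string constant pool.

  2. In Java 7: run-time constant pool -(belongs to)-> method area -(belongs to)-> heap. The method area is still implemented by the permanent generation, and direct memory is still used by NIO. The Java heap now holds object instances, static variables, and the string constant pool, while the method area holds class information and the run-time constant pool.

  3. In Java 8: the permanent generation is removed and replaced by Metaspace, which lives in native memory, so the method area now lives in native memory as well. Native memory is therefore split between the direct memory used by NIO and Metaspace. The Java heap holds object instances, static variables, and the string constant pool. The method area holds class information and the run-time constant pool.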

image-20210220160037168

image-20210220160054581

The images above are from the web.
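To see how a running JVM actually divides its memory, the standard `java.lang.management` API can list the memory pools it reports, grouped into heap and non-heap. The sketch below is only a probe: `MemoryPoolDemo` is a name invented for this example, and pool names are implementation-specific. On recent HotSpot builds a non-heap pool named "Metaspace" typically appears, but that is an observation about one implementation, not something the specification promises.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

final class MemoryPoolDemo {
    // Names of the memory pools of the given type, as reported by this JVM.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

Running this on different JDK versions is a quick way to compare the layouts described in the list above against what a given implementation reports.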