
How JVM works?

 

The JVM (Java Virtual Machine) is a core part of the Java platform.



The gap between human to machine

Before looking at how the JVM works, it helps to understand why Java, C, and C# are called high-level languages.

What these languages have in common is that they are human-readable but not machine-readable. Before a machine can run an application written in a high-level language, the source code must be converted into a format that the computer (the CPU) understands, no matter which language you use.



//People: it's easy, just multiplication between x and y.  
int x = 1;  
int y = 2;  
int z;  
z = x*y;  
  
//Machine: sorry, I cannot understand your code.

 


Computers understand only one language: machine code. Machine code is a sequence of binary digits, each either 1 or 0. Because each CPU manufacturer decides what each bit pattern means, it is extremely important to convert the source code into the proper machine code for the target CPU. Otherwise the application will not run as expected, because the CPU cannot interpret it correctly (I call this a compatibility issue).


Who can compensate the gap? 

To bridge the gap between human and machine, in other words to convert high-level source code into machine code, we need a "translator". Such a translator is called a compiler.


As mentioned earlier, there are many different CPUs, so we need separate compilers for different hardware. For example, the same C code has to be compiled with an Apple Macintosh compatible compiler in order to run on Apple computers; if users also want to run the same code on Windows on the Intel platform, they need another C compiler for Windows.


Simply put, a compiler converts a source code file (which is a plain text file) into an executable file that can be run on the host computer. In practice, the process is more complex than that.


Below is an example of how a C compiler works (for details, see http://www.codeproject.com/Articles/1825/The-Common-Language-Runtime-CLR-and-Java-Runtime-E#_interpreters):









What are interpreters?

Interpreters look similar to compilers but are different: they sit at the other extreme of how programming languages are run. Pure interpreters do not do translation work the way compilers do.


Interpreters take code written in a high-level language and execute it statement by statement, so pure interpreters have no chance to do any code optimization. They also cannot check the syntax of the whole program in advance the way compilers do.


Examples of pure interpreters are some scripting languages that interact with operating systems. The shell scripts in Linux, the Batch files (.bat) and command files (.cmd) in Windows are all examples of pure interpreted languages.
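
To make the statement-by-statement idea concrete, here is a minimal sketch of a pure interpreter written in Java. The class name `TinyLineInterpreter` and the toy statement syntax (`name = operand * operand`) are assumptions made up for this illustration; real shells and batch interpreters are far more elaborate.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TinyLineInterpreter {
    // Variables live in a simple name -> value table.
    private final Map<String, Integer> vars = new HashMap<>();

    // Decode and execute one statement of the hypothetical form
    // "name = operand" or "name = operand * operand".
    public void execute(String line) {
        String[] sides = line.split("=");
        if (sides.length != 2) {
            throw new IllegalArgumentException("cannot understand: " + line);
        }
        String target = sides[0].trim();
        int result = 1;
        for (String factor : sides[1].trim().split("\\*")) {
            result *= valueOf(factor.trim());
        }
        vars.put(target, result);
    }

    // An operand is either a variable defined earlier or an integer literal.
    private int valueOf(String token) {
        if (vars.containsKey(token)) {
            return vars.get(token);
        }
        return Integer.parseInt(token);
    }

    // Statements run strictly in order; a mistake in the last line is only
    // discovered after all earlier lines have already been executed.
    public Map<String, Integer> run(List<String> program) {
        for (String line : program) {
            execute(line);
        }
        return vars;
    }

    public static void main(String[] args) {
        TinyLineInterpreter interpreter = new TinyLineInterpreter();
        Map<String, Integer> result =
                interpreter.run(Arrays.asList("x = 1", "y = 2", "z = x * y"));
        System.out.println("z = " + result.get("z"));
    }
}
```

Note that the text of each line is decoded at the moment it is executed, and nothing looks at the program as a whole beforehand. That is why this style leaves no room for up-front syntax checking or optimization.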


Below is a figure that shows how pure interpreters work:



What is the hybrid approach?

But most popular modern languages are not based on pure interpreters; they are either compiled (like C and C++) or use a hybrid approach (like Java).


Below is a figure that shows how the hybrid compiler-interpreter works:



As the diagrams above show, today's popular interpreted languages are not purely interpreted. They use compilation to produce an intermediate code (e.g. Microsoft's Intermediate Language, MSIL, or Sun's Java byte code). It is this intermediate language that the interpreter works on, not the original high-level source code. This approach avoids many of the problems inherent in purely interpreted languages and gives many of the advantages of fully compiled languages.
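
The hybrid idea can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the `Op` instruction set and the `TinyHybrid` class are invented here and are not real Java byte code or MSIL. The source text is compiled once into intermediate instructions, and the interpreter loop then works only on those instructions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TinyHybrid {
    // Hypothetical intermediate instruction set; not real Java byte code.
    enum Op { PUSH, MUL, ADD }

    static final class Instr {
        final Op op;
        final int operand; // only meaningful for PUSH
        Instr(Op op, int operand) { this.op = op; this.operand = operand; }
    }

    // "Compile" step: runs once, before execution. It parses a sum of
    // products such as "2 * 3 + 4" and emits stack-machine instructions.
    static List<Instr> compile(String source) {
        List<Instr> code = new ArrayList<>();
        String[] terms = source.split("\\+");
        for (int t = 0; t < terms.length; t++) {
            String[] factors = terms[t].split("\\*");
            for (int f = 0; f < factors.length; f++) {
                // Malformed numbers are rejected here, before anything runs.
                code.add(new Instr(Op.PUSH, Integer.parseInt(factors[f].trim())));
                if (f > 0) code.add(new Instr(Op.MUL, 0));
            }
            if (t > 0) code.add(new Instr(Op.ADD, 0));
        }
        return code;
    }

    // "Interpret" step: sees only the intermediate code, never the source text.
    static int run(List<Instr> code) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Instr instr : code) {
            switch (instr.op) {
                case PUSH: stack.push(instr.operand); break;
                case MUL:  stack.push(stack.pop() * stack.pop()); break;
                case ADD:  stack.push(stack.pop() + stack.pop()); break;
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        List<Instr> code = compile("2 * 3 + 4");
        System.out.println(run(code));
    }
}
```

Because parsing happens in `compile`, errors in the source surface before execution starts, and the run-time loop no longer has to decode text. Those are two of the advantages the hybrid approach borrows from compiled languages.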

 

 

 

The execution mechanism of compiled and interpreted language:

A compiler does this conversion offline and in one go (as discussed in the "Who can compensate the gap?" section), whereas an interpreter does the conversion one program statement at a time.


A compiled program runs in a fetch-execute cycle, whereas an interpreted program runs in a decode-fetch-execute cycle. The decoding is done by the interpreter, whereas the fetch and execute operations are done by the CPU. In an interpreter the bottleneck is the decoding phase, and hence an interpreted program may be 30-100% slower than a compiled program.


Below are two figures that illustrate the flow of execution of compilers (first figure) and interpreters (second figure):






It is evident from the above flowcharts that an interpreted program has the overhead of decoding each statement one by one; thus in an interpreted program the bottleneck is the decoding process.


Both the compiled and the interpreted approach have their own advantages and disadvantages; the details follow in the next section. Readers must NOTE THAT both approaches eventually convert the source code to machine language, but the processes differ.





Compare and contrast compiled and interpreted languages (extremely important for linking the concepts of compiler and interpreter with the next section, which discusses Java platform independence, the JIT compiler, and the .NET IL compiler):



Languages can be developed as fully compiled, purely interpreted, or hybrid compiled-interpreted. As a matter of fact, most current programming languages have both compiled and interpreted versions available.

Both compiled and interpreted approaches have their advantages and disadvantages. Let's start with the compiled languages.


Compiled languages (Sample: C and C++)

  1. One of the biggest advantages of compiled languages is their execution speed. A program written in C/C++ runs 30-70% faster than an equivalent program written in Java.
  2. Compiled code also takes less memory as compared to an interpreted program.
  3. On the down side - a compiler is much more difficult to write than an interpreter.
  4. A compiler does not provide much help in debugging a program. How many times have you hit a null pointer error (typically a crash such as a segmentation fault) in your C code and spent hours trying to figure out where in your source code it occurred? (Maybe this is why debugging a C program is such annoying work!)
  5. The executable Compiled code is much bigger in size than an equivalent interpreted code e.g. a C/C++ .exe file is much bigger than an equivalent Java .class file
  6. Compiled programs are targeted towards a particular platform and hence are platform dependent.
  7. Compiled programs do not allow security to be implemented within the code, e.g. a compiled program can access any area of the memory and can do whatever it wants with your PC (most viruses are written in compiled languages).
  8. Due to loose security and platform dependence - a compiled language is not particularly suited to be used to develop Internet or web-based applications.

Interpreted languages

  1. Interpreted languages provide excellent debugging support. A Java programmer spends only a few minutes fixing a "Null pointer exception", because the Java runtime not only specifies the nature of the exception but also gives the exact line number and function call sequence (the famous stack trace information) where the exception occurred. This facility is something that a compiled language can never provide.
  2. Another advantage is that interpreters are much easier to build than compilers.
  3. One of the biggest advantages of interpreters is that they make platform independence possible.
  4. Interpreted languages also allow a high degree of security, something badly needed for an Internet application.
  5. Intermediate language code is much smaller than compiled executable code.
  6. Platform independence and tight security are the two most important factors that make an interpreted language ideally suited for Internet and web-based applications.
  7. Interpreted languages have some serious drawbacks. Interpreted applications take up more memory and CPU resources, because in order to run a program written in an interpreted language, the corresponding interpreter must be run first. Interpreters are sophisticated, intelligent, and resource-hungry programs, and they take up a lot of CPU cycles and RAM.
  8. Due to the decode-fetch-execute cycle of interpreted applications, they are much slower than compiled programs.
  9. Interpreters also do a lot of code optimization and security violation checking at run time; these extra steps take up even more resources and further slow the application down.
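
The stack trace mentioned in point 1 can be inspected directly from Java code. The following is a small sketch; the class and method names (`StackTraceDemo`, `lengthOf`, `process`, `trigger`) are made up for the illustration, while `getStackTrace()` and `StackTraceElement` are standard library APIs.

```java
public class StackTraceDemo {
    static int lengthOf(String text) {
        return text.length(); // throws NullPointerException when text is null
    }

    static int process(String text) {
        return lengthOf(text) + 1;
    }

    // Triggers the exception and returns it so the caller can inspect it.
    static NullPointerException trigger() {
        try {
            process(null);
            throw new IllegalStateException("expected a NullPointerException");
        } catch (NullPointerException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        NullPointerException e = trigger();
        // Each frame records the class, the method, and the source line number.
        for (StackTraceElement frame : e.getStackTrace()) {
            System.out.println(frame.getClassName() + "." + frame.getMethodName()
                    + " (line " + frame.getLineNumber() + ")");
        }
    }
}
```

The first frame printed is the method where the exception was thrown (`lengthOf`), followed by its callers in order, which is the "function call sequence" described above.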



Platform dependence issues for compiled languages:

As explained above, after the compiler compiles the source code to .obj code, a linker converts it to executable code. Both the .obj code and the executable code are machine and platform dependent.

In brief, C and C++ are platform dependent, which is one of their shortcomings.



How about Java?

To develop a Java application, there is one package you must install on your computer: the JDK (Java Development Kit). Like the SDK (Software Development Kit) of other languages, the JDK is a comprehensive set of software that includes all the bits and pieces required for developing Java applications.



JDK includes:

  • JVM (Java Virtual Machine)
  • JRE (Java Runtime Environment)  - Note that JVM is actually a part of JRE.
  • Java packages and framework classes
  • Javac (compiler)
  • Java debugger.


After completing the application, the programmer uses the compiler to compile the source code (.java) and produce the class file (.class). The class file is an intermediate Java byte code file.


The byte code file is the key, because it is machine-independent intermediate code that can be executed on any computer with a JRE installed.
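
The compile step can also be driven from Java itself through the standard `javax.tools` API, which makes it easy to look at the resulting file. The sketch below is an illustration only: the class name `CompileToClassFile` and the tiny `Hello` source are invented here, and it assumes it runs on a JDK (on a bare JRE `ToolProvider.getSystemJavaCompiler()` returns null). It compiles a source file and reads the first four bytes of the .class file, which for every valid class file are the magic number 0xCAFEBABE.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class CompileToClassFile {
    // Writes a tiny source file, compiles it, and returns the first four
    // bytes of the resulting .class file as an int.
    static int compileAndReadMagic() {
        try {
            Path dir = Files.createTempDirectory("jvm-demo");
            Path source = dir.resolve("Hello.java");
            String code = "public class Hello { public static int twice(int x) { return 2 * x; } }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // The compiler API is only present in a JDK.
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("no system Java compiler; a JDK is required");
            }
            int exitCode = javac.run(null, null, null, source.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // Without a -d option, javac writes Hello.class next to Hello.java.
            try (InputStream in = Files.newInputStream(dir.resolve("Hello.class"));
                 DataInputStream data = new DataInputStream(in)) {
                return data.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.printf("magic number: 0x%08X%n", compileAndReadMagic());
    }
}
```

The same .class file, byte for byte, can be copied to a machine with a different operating system or CPU and run there by that machine's JVM.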


What makes Java platform independent is the UBIQUITY of the JRE. JREs are available for most commercial and popular platforms. Programmers write the code once and the same program runs on any platform.


Note that the JDK must be compatible with the platform, which means that different platforms need different JDKs installed. See the figure below:


 

 

What is JVM? (extract from web, see the resource at reference section)

Before I discuss the JVM in details, let me clarify a few related terms.

  • Java Development Kit (JDK): This includes ALL the basic Java framework packages, a compiler (javac), JRE, a JVM, debugger etc. in short all you need to develop, debug, compile and run our Java program.
  • Java Runtime Environment (JRE): This is a subset of the JDK. It does not include a debugger, compiler, and framework classes. This includes the bare minimum that a computer needs in order to run a .class file (mainly JVM and essential APIs).
  • Java Virtual Machine (JVM): The JVM is a part of the JRE. The .class file is passed to the JVM, which then runs the program. The JRE ensures that the code does not violate any of the security restrictions. Remember that the byte code (.class file) is not run directly on the host machine; it needs to be converted to the host machine's language. This conversion is done by the JVM. While converting, the JVM enforces security and may also optimize the code. There are many commercial JVMs available on the market; different JVMs have different capabilities and varying degrees of performance. In order to produce efficient code with minimum delay, a JVM needs a great amount of intelligence built into it, which also makes the JVM larger. Remember that for a Java program to run, the JVM must be loaded into memory, and a large JVM obviously needs far more computer resources than a compact one. So there has to be a fine balance between the size of a JVM and its capabilities. This is why a Java program is always 30-70% slower than an equivalent C++ program.
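
Since different JVMs exist for different platforms, it can be handy to ask the running JVM what it is. The following sketch uses only standard system properties (`java.vm.name`, `java.vm.version`, `java.vm.vendor`, `os.name`, `os.arch`) and `Runtime.maxMemory()`; the class name `JvmInfo` is just a label chosen for this example.

```java
public class JvmInfo {
    // Builds a short description of the JVM and host running this code.
    static String describe() {
        Runtime runtime = Runtime.getRuntime();
        return "JVM: " + System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + " by " + System.getProperty("java.vm.vendor")
                + "\nHost: " + System.getProperty("os.name")
                + " / " + System.getProperty("os.arch")
                + "\nMax heap (MiB): " + runtime.maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running the same compiled class on two different machines will typically print different JVM and host details, even though the byte code being executed is identical.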

The initial JVMs were extremely slow and resource hungry because they actually interpreted the byte code. In recent years, many efficient JVMs have surfaced. These JVMs use different compilation techniques to produce efficient machine code in as little time as possible. One such technique is Just-In-Time (JIT) compilation (available since Java 1.1). This technique is also used in .NET.



Just In Time Compilation (JIT):

JIT compilation is neither a traditional (ahead-of-time) compiler nor a pure interpreter. It is a compiler, but it works like an interpreter: a hybrid beast! See below:


1. The JIT compiler works not before the execution of the program but alongside it (running alongside the program is what makes it look like an interpreter rather than a compiler, but it is still not interpreting). DO NOT THINK that the bytecode has already been translated into native machine code before you run the program! That is WRONG! The bytecode is processed by the JVM (more precisely, by the JIT compiler) only once you start to run the program (after all, it is part of the runtime environment).


2. Even once the program starts running, the JIT compiler does not compile all the bytecode; it contains sophisticated logic to decide when to compile which part of the bytecode. This is why this approach is named Just-In-Time compilation.
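
One way to see that compilation happens while the program runs is to ask the JVM how much time its JIT compiler has spent so far. The sketch below uses the standard `java.lang.management.CompilationMXBean`; the class `JitObserver`, the method `sumOfSquares`, and the loop counts are arbitrary choices for this illustration, and the actual numbers depend entirely on the JVM in use.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitObserver {
    // A small method that becomes "hot" when it is called many times.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Returns the JIT compiler's accumulated compilation time in milliseconds,
    // or -1 if this JVM has no JIT compiler or does not report the figure.
    static long compilationMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return jit.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        long before = compilationMillis();
        long checksum = 0;
        for (int round = 0; round < 20_000; round++) {
            checksum += sumOfSquares(1_000);
        }
        long after = compilationMillis();
        System.out.println("checksum: " + checksum);
        System.out.println("JIT time before/after (ms): " + before + " / " + after);
    }
}
```

On HotSpot, running any program with the `-XX:+PrintCompilation` option additionally logs each method as it gets compiled, which shows that only some methods are compiled and that this happens during execution, not before it.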



http://java.dzone.com/articles/just-time-compiler-jit-hotspot states:
According to most researches, 80% of execution time is spent in executing 20% of code. That would be great if there was a way to determine those 20% of code and to optimize them. That's exactly what JIT does - during runtime it gathers statistics, finds the "hot" code compiles it from JVM interpreted bytecode (that is stored in .class files) to a native code that is executed directly by Operating System and heavily optimizes it. Smallest compilation unit is single method. Compilation and statistics gathering is done in parallel to program execution by special threads. During statistics gathering the compiler makes hypotheses about code function and as the time passes tries to prove or to disprove them. If the hypothesis is dis-proven the code is deoptimized and recompiled again.

The name "Hotspot" of Sun (Oracle) JVM is chosen because of the ability of this Virtual Machine to find "hot" spots in code.

What optimizations does JIT?
Let's look closely at more optimizations done by JIT.
  • Inline methods - instead of calling method on an instance of the object it copies the method to caller code. The hot methods should be located as close to the caller as possible to prevent any overhead.
  • Eliminate locks if monitor is not reachable from other threads
  • Replace interface with direct method calls for method implemented only once to eliminate calling of virtual functions overhead
  • Join adjacent synchronized blocks on the same object
  • Eliminate dead code
  • Drop memory write for non-volatile variables
  • Remove prechecking NullPointerException and IndexOutOfBoundsException
  • Et cetera

 

Below is another description of JIT, from Wikipedia:



http://en.wikipedia.org/wiki/Just-in-time_compilation states:
In a bytecode-compiled system, source code is translated to an intermediate representation known as bytecode. Bytecode is not the machine code for any particular computer, and may be portable among computer architectures. The bytecode may then be interpreted by, or run on, a virtual machine. The JIT compiler reads the bytecodes in many sections (or in full, rarely) and compiles them dynamically into machine language so the program can run faster. Java performs runtime checks on various sections of the code and this is the reason the entire code is not compiled at once.[1] This can be done per-file, per-function or even on any arbitrary code fragment; the code can be compiled when it is about to be executed (hence the name "just-in-time"), and then cached and reused later without needing to be recompiled.

 


Just-in-time (JIT) compilers promise to improve the performance of Java applications. Rather than letting the JVM run byte code, a JIT compiler translates code into the host machine's native language. Thus, applications gain the performance enhancement of compiled code while maintaining Java's portability. 


Although the JIT compiler greatly improves a program's execution speed (compared with the initial purely interpreted process), it involves the overhead of converting the byte code to native code at runtime. It is for this reason that, despite the JIT, Java programs are still slower than equivalent C/C++ programs.


A Java Applet is a special Java program that is only allowed to run inside a browser window. When you embed a Java Applet in your web page, the browser sees the Applet tag and downloads the byte code (the .class file) for the applet from the specified location. Once the byte code is downloaded, the browser uses the JVM (included in the browser itself) to run the Applet, ensuring that the Applet does not execute any insecure APIs - mainly the APIs that access the client machine hardware.


Given the concept of the JVM, it is obvious that any programming language that compiles into Java byte code can use the JVM to run its programs. We are all aware of how Java code (.java) is converted into byte code (.class), which is then run by the JVM on the host machine. What if we wrote a C++ compiler that converts a C++ source file (.c or .cpp) into a Java byte code file (.class) rather than into an .obj file? Theoretically it is possible; whether it is practical is a different issue altogether. In fact, many languages have compilers that produce Java byte code that can then be run by the JVM, for example Groovy. This undermines Microsoft's claim that the CLR is the only platform that supports language agnosticism. The JVM can also be (and in fact already is) used by different languages.


See the figure below, which illustrates in brief how the JVM works:


 

 

TO BE CONTINUED (the CLR part is not covered in this article; see my other article, "c# stuff which you must know", which discusses the CLR specifically)!

 

 

 

Conclusion:

 

 

 

 

 

Notice:

This article is adapted from:

The Common Language Runtime (CLR) and Java Runtime Environment (JRE)

written by Kashif Manzoor.



 
