RednaxelaFX

An LL(1) Java Grammar Extracted from Sun's javac Source Code

    Blog category:
  • Java
This grammar was extracted from the parser of javac in OpenJDK 6 build 17, j2se/src/share/classes/com/sun/tools/javac/parser/Parser.java
That code is open source under the GPLv2 license.

Note the comment on the Parser class:
/** The parser maps a token sequence into an abstract syntax
 *  tree. It operates by recursive descent, with code derived
 *  systematically from an LL(1) grammar. For efficiency reasons, an
 *  operator precedence scheme is used for parsing binary operation
 *  expressions.
 *
 *  <p><b>This is NOT part of any API supported by Sun Microsystems.  If
 *  you write code that depends on this, you do so at your own risk.
 *  This code and its internal interfaces are subject to change or
 *  deletion without notice.</b>
 */

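The javac in Sun JDK 6 parses with a mix of recursive descent and operator precedence. It is mainly recursive descent; binary operation expressions are parsed by operator precedence to make parsing more efficient.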
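
To make the mixed approach concrete, here is a minimal sketch in Java. It is not javac's code: the class MiniExprParser, its tiny precedence table, and the string output format are all invented for illustration, and it uses precedence climbing, a common form of operator-precedence parsing, rather than whatever data structures javac itself uses. Primaries are parsed by plain recursive descent, while a single loop handles every binary operator.

```java
import java.util.List;
import java.util.Map;

/** Toy parser: recursive descent for primaries, operator precedence for binary operators. */
class MiniExprParser {
    // Hypothetical precedence table covering only a handful of binary operators.
    private static final Map<String, Integer> PREC = Map.of(
        "||", 1, "&&", 2, "==", 3, "+", 4, "-", 4, "*", 5, "/", 5);

    private final List<String> tokens;
    private int pos = 0;

    MiniExprParser(List<String> tokens) { this.tokens = tokens; }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }
    private String next() { return tokens.get(pos++); }

    /** Recursive-descent part: Primary = "(" Expression ")" | operand */
    private String primary() {
        if (peek().equals("(")) {
            next();
            String inner = expression(1);
            if (!peek().equals(")")) {
                throw new IllegalStateException("expected ) but found " + peek());
            }
            next();
            return inner;
        }
        return next(); // any other token is treated as an operand
    }

    /** Operator-precedence part: one loop for all left-associative binary operators. */
    String expression(int minPrec) {
        String left = primary();
        while (PREC.containsKey(peek()) && PREC.get(peek()) >= minPrec) {
            String op = next();
            // Operators that bind tighter than op are absorbed into the right operand.
            String right = expression(PREC.get(op) + 1);
            left = "(" + left + " " + op + " " + right + ")";
        }
        return left;
    }

    /** Parses a pre-tokenized expression and returns it fully parenthesized. */
    static String parse(String... tokens) {
        MiniExprParser p = new MiniExprParser(List.of(tokens));
        String result = p.expression(1);
        if (p.pos != tokens.length) {
            throw new IllegalStateException("trailing input: " + p.peek());
        }
        return result;
    }
}
```

For example, MiniExprParser.parse("a", "+", "b", "*", "c") returns "(a + (b * c))". With pure recursive descent, each precedence level would need its own method and every operand would pass through all of them; the table-driven loop avoids that chain of calls.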

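Below is the LL(1) grammar extracted from the comments that precede the individual parsing methods. The order has been adjusted.
The grammar uses EBNF notation: [] means optional (0 or 1), {} means any number of repetitions (0 or more), () means grouping, text inside double quotes is a literal, and names not enclosed in double quotes are grammar rule names.
A few rules, such as SuperSuffix, may have more than one definition; they were extracted from the comments of different methods, and their contents should not contradict each other. I have not yet looked closely enough to see what exactly is going on here.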
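
The EBNF operators map quite directly onto recursive-descent code: an optional [] part becomes an if, a repeated {} part becomes a while, and each decision looks only at the next token. The sketch below shows this for the ImportDeclaration rule. The class ImportParser and its helper methods are hypothetical names made up for this example, tokens are plain strings, and keywords are not distinguished from identifiers, so it is an illustration rather than javac's implementation.

```java
import java.util.List;

/** Sketch of how EBNF operators map onto recursive-descent code. */
class ImportParser {
    private final List<String> tokens;
    private int pos = 0;

    ImportParser(List<String> tokens) { this.tokens = tokens; }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    /** Consumes the next token if it equals the expected literal, otherwise fails. */
    private void accept(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + peek());
        }
        pos++;
    }

    /** Ident = IDENTIFIER (simplified: any token that starts like a Java identifier) */
    private String ident() {
        String t = peek();
        if (!Character.isJavaIdentifierStart(t.charAt(0))) {
            throw new IllegalStateException("expected identifier but found " + t);
        }
        pos++;
        return t;
    }

    /** ImportDeclaration = IMPORT [ STATIC ] Ident { "." Ident } [ "." "*" ] ";" */
    String importDeclaration() {
        accept("import");
        boolean isStatic = false;
        if (peek().equals("static")) {      // [ STATIC ] becomes an if
            pos++;
            isStatic = true;
        }
        StringBuilder name = new StringBuilder(ident());
        while (peek().equals(".")) {        // { "." Ident } becomes a while
            pos++;
            if (peek().equals("*")) {       // [ "." "*" ] is decided after the dot
                pos++;
                name.append(".*");
                break;
            }
            name.append('.').append(ident());
        }
        accept(";");
        return (isStatic ? "static " : "") + name;
    }
}
```

Calling new ImportParser(List.of("import", "static", "java", ".", "util", ".", "*", ";")).importDeclaration() returns "static java.util.*". Note that the "." is consumed before choosing between Ident and "*", which keeps the decision to a single token of lookahead.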
CompilationUnit = [ { "@" Annotation } PACKAGE Qualident ";"] {ImportDeclaration} {TypeDeclaration}

AnnotationsOpt = { '@' Annotation }

ImportDeclaration = IMPORT [ STATIC ] Ident { "." Ident } [ "." "*" ] ";"

TypeDeclaration = ClassOrInterfaceOrEnumDeclaration
                | ";"

ClassOrInterfaceOrEnumDeclaration = ModifiersOpt
         (ClassDeclaration | InterfaceDeclaration | EnumDeclaration)

ModifiersOpt = { Modifier }
Modifier = PUBLIC | PROTECTED | PRIVATE | STATIC | ABSTRACT | FINAL
         | NATIVE | SYNCHRONIZED | TRANSIENT | VOLATILE | "@"
         | "@" Annotation

Annotation              = "@" Qualident [ "(" AnnotationFieldValues ")" ]

AnnotationFieldValues   = "(" [ AnnotationFieldValue { "," AnnotationFieldValue } ] ")"

AnnotationFieldValue    = AnnotationValue
                        | Identifier "=" AnnotationValue

AnnotationValue          = ConditionalExpression
                        | Annotation
                        | "{" [ AnnotationValue { "," AnnotationValue } ] [","] "}"

ClassDeclaration = CLASS Ident TypeParametersOpt [EXTENDS Type]
                   [IMPLEMENTS TypeList] ClassBody

InterfaceDeclaration = INTERFACE Ident TypeParametersOpt
                       [EXTENDS TypeList] InterfaceBody

EnumDeclaration = ENUM Ident [IMPLEMENTS TypeList] EnumBody

EnumBody = "{" { EnumeratorDeclarationList } [","]
                [ ";" {ClassBodyDeclaration} ] "}"

EnumeratorDeclaration = AnnotationsOpt [TypeArguments] IDENTIFIER [ Arguments ] [ "{" ClassBody "}" ]

TypeList = Type {"," Type}

ClassBody     = "{" {ClassBodyDeclaration} "}"
InterfaceBody = "{" {InterfaceBodyDeclaration} "}"

ClassBodyDeclaration =
    ";"
  | [STATIC] Block
  | ModifiersOpt
    ( Type Ident
      ( VariableDeclaratorsRest ";" | MethodDeclaratorRest )
    | VOID Ident MethodDeclaratorRest
    | TypeParameters (Type | VOID) Ident MethodDeclaratorRest
    | Ident ConstructorDeclaratorRest
    | TypeParameters Ident ConstructorDeclaratorRest
    | ClassOrInterfaceOrEnumDeclaration
    )
InterfaceBodyDeclaration =
    ";"
  | ModifiersOpt Type Ident
    ( ConstantDeclaratorsRest | InterfaceMethodDeclaratorRest ";" )

MethodDeclaratorRest =
    FormalParameters BracketsOpt [Throws TypeList] ( MethodBody | [DEFAULT AnnotationValue] ";")
VoidMethodDeclaratorRest =
    FormalParameters [Throws TypeList] ( MethodBody | ";")
InterfaceMethodDeclaratorRest =
    FormalParameters BracketsOpt [THROWS TypeList] ";"
VoidInterfaceMethodDeclaratorRest =
    FormalParameters [THROWS TypeList] ";"
ConstructorDeclaratorRest =
    "(" FormalParameterListOpt ")" [THROWS TypeList] MethodBody

QualidentList = Qualident {"," Qualident}

Qualident = Ident { DOT Ident }

TypeParametersOpt = ["<" TypeParameter {"," TypeParameter} ">"]

TypeParameter = TypeVariable [TypeParameterBound]
TypeParameterBound = EXTENDS Type {"&" Type}
TypeVariable = Ident

FormalParameters = "(" [ FormalParameterList ] ")"
FormalParameterList = [ FormalParameterListNovarargs , ] LastFormalParameter
FormalParameterListNovarargs = [ FormalParameterListNovarargs , ] FormalParameter

FormalParameter = { FINAL | '@' Annotation } Type VariableDeclaratorId
LastFormalParameter = { FINAL | '@' Annotation } Type '...' Ident | FormalParameter

MethodBody = Block

Statement =
     Block
   | IF ParExpression Statement [ELSE Statement]
   | FOR "(" ForInitOpt ";" [Expression] ";" ForUpdateOpt ")" Statement
   | FOR "(" FormalParameter : Expression ")" Statement
   | WHILE ParExpression Statement
   | DO Statement WHILE ParExpression ";"
   | TRY Block ( Catches | [Catches] FinallyPart )
   | SWITCH ParExpression "{" SwitchBlockStatementGroups "}"
   | SYNCHRONIZED ParExpression Block
   | RETURN [Expression] ";"
   | THROW Expression ";"
   | BREAK [Ident] ";"
   | CONTINUE [Ident] ";"
   | ASSERT Expression [ ":" Expression ] ";"
   | ";"
   | ExpressionStatement
   | Ident ":" Statement

Block = "{" BlockStatements "}"

BlockStatements = { BlockStatement }
BlockStatement  = LocalVariableDeclarationStatement
                | ClassOrInterfaceOrEnumDeclaration
                | [Ident ":"] Statement
LocalVariableDeclarationStatement
                = { FINAL | '@' Annotation } Type VariableDeclarators ";"

ParExpression = "(" Expression ")"

ForInit = StatementExpression MoreStatementExpressions
        |  { FINAL | '@' Annotation } Type VariableDeclarators

ForUpdate = StatementExpression MoreStatementExpressions

VariableDeclarators = VariableDeclarator { "," VariableDeclarator }

VariableDeclaratorsRest = VariableDeclaratorRest { "," VariableDeclarator }
ConstantDeclaratorsRest = ConstantDeclaratorRest { "," ConstantDeclarator }

VariableDeclarator = Ident VariableDeclaratorRest
ConstantDeclarator = Ident ConstantDeclaratorRest

VariableDeclaratorRest = BracketsOpt ["=" VariableInitializer]
ConstantDeclaratorRest = BracketsOpt "=" VariableInitializer

VariableDeclaratorId = Ident BracketsOpt

CatchClause     = CATCH "(" FormalParameter ")" Block

SwitchBlockStatementGroups = { SwitchBlockStatementGroup }
SwitchBlockStatementGroup = SwitchLabel BlockStatements
SwitchLabel = CASE ConstantExpression ":" | DEFAULT ":"

MoreStatementExpressions = { COMMA StatementExpression }

Expression = Expression1 [ExpressionRest]
ExpressionRest = [AssignmentOperator Expression1]
AssignmentOperator = "=" | "+=" | "-=" | "*=" | "/=" |
                     "&=" | "|=" | "^=" |
                     "%=" | "<<=" | ">>=" | ">>>="
Type = Type1
TypeNoParams = TypeNoParams1
StatementExpression = Expression
ConstantExpression = Expression

Expression1   = Expression2 [Expression1Rest]
Type1         = Type2
TypeNoParams1 = TypeNoParams2

Expression1Rest = ["?" Expression ":" Expression1]

Expression2   = Expression3 [Expression2Rest]
Type2         = Type3
TypeNoParams2 = TypeNoParams3

Expression2Rest = {infixop Expression3}
                | Expression3 INSTANCEOF Type
infixop         = "||"
                | "&&"
                | "|"
                | "^"
                | "&"
                | "==" | "!="
                | "<" | ">" | "<=" | ">="
                | "<<" | ">>" | ">>>"
                | "+" | "-"
                | "*" | "/" | "%"

Expression3    = PrefixOp Expression3
               | "(" Expr | TypeNoParams ")" Expression3
               | Primary {Selector} {PostfixOp}
Primary        = "(" Expression ")"
               | Literal
               | [TypeArguments] THIS [Arguments]
               | [TypeArguments] SUPER SuperSuffix
               | NEW [TypeArguments] Creator
               | Ident { "." Ident }
                 [ "[" ( "]" BracketsOpt "." CLASS | Expression "]" )
                 | Arguments
                 | "." ( CLASS | THIS | [TypeArguments] SUPER Arguments | NEW [TypeArguments] InnerCreator )
                 ]
               | BasicType BracketsOpt "." CLASS
PrefixOp       = "++" | "--" | "!" | "~" | "+" | "-"
PostfixOp      = "++" | "--"
Type3          = Ident { "." Ident } [TypeArguments] {TypeSelector} BracketsOpt
               | BasicType
TypeNoParams3  = Ident { "." Ident } BracketsOpt
Selector       = "." [TypeArguments] Ident [Arguments]
               | "." THIS
               | "." [TypeArguments] SUPER SuperSuffix
               | "." NEW [TypeArguments] InnerCreator
               | "[" Expression "]"
TypeSelector   = "." Ident [TypeArguments]
SuperSuffix    = Arguments | "." Ident [Arguments]

SuperSuffix = Arguments | "." [TypeArguments] Ident [Arguments]

BasicType = BYTE | SHORT | CHAR | INT | LONG | FLOAT | DOUBLE | BOOLEAN

ArgumentsOpt = [ Arguments ]

Arguments = "(" [Expression { COMMA Expression }] ")"

TypeArgumentsOpt = [ TypeArguments ]

TypeArguments  = "<" TypeArgument {"," TypeArgument} ">"

TypeArgument = Type
             | "?"
             | "?" EXTENDS Type {"&" Type}
             | "?" SUPER Type

BracketsOpt = {"[" "]"}

BracketsSuffixExpr = "." CLASS
BracketsSuffixType =

Creator = Qualident [TypeArguments] ( ArrayCreatorRest | ClassCreatorRest )

InnerCreator = Ident [TypeArguments] ClassCreatorRest

ArrayCreatorRest = "[" ( "]" BracketsOpt ArrayInitializer
                       | Expression "]" {"[" Expression "]"} BracketsOpt )

ClassCreatorRest = Arguments [ClassBody]

ArrayInitializer = "{" [VariableInitializer {"," VariableInitializer}] [","] "}"

VariableInitializer = ArrayInitializer | Expression

Ident = IDENTIFIER

Literal =
    INTLITERAL
  | LONGLITERAL
  | FLOATLITERAL
  | DOUBLELITERAL
  | CHARLITERAL
  | STRINGLITERAL
  | TRUE
  | FALSE
  | NULL
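
Some rules above, such as ArrayInitializer, put an optional trailing "," right after a comma-separated repetition, so a parser cannot tell from the comma alone whether another element follows. One way a hand-written parser can cope while still looking at only one token at a time is to consume the comma and then check for the closing brace. The sketch below illustrates that idea; the class ArrayInitParser is a hypothetical name, and an Expression is simplified to a single token.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: parsing a comma-separated list that allows an optional trailing comma. */
class ArrayInitParser {
    private final List<String> tokens;
    private int pos = 0;

    ArrayInitParser(List<String> tokens) { this.tokens = tokens; }

    private String peek() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private void accept(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + " but found " + peek());
        }
        pos++;
    }

    /** ArrayInitializer = "{" [VariableInitializer {"," VariableInitializer}] [","] "}" */
    List<Object> arrayInitializer() {
        accept("{");
        List<Object> elems = new ArrayList<>();
        if (peek().equals(",")) {
            pos++;                              // "{ , }": only the optional comma
        } else if (!peek().equals("}")) {
            elems.add(variableInitializer());
            while (peek().equals(",")) {
                pos++;
                if (peek().equals("}")) break;  // that comma was the trailing one
                elems.add(variableInitializer());
            }
        }
        accept("}");
        return elems;
    }

    /** VariableInitializer = ArrayInitializer | Expression (here: a single token) */
    Object variableInitializer() {
        if (peek().equals("{")) return arrayInitializer();
        return tokens.get(pos++);
    }

    /** Parses a pre-tokenized array initializer into nested lists of tokens. */
    static List<Object> parse(String... tokens) {
        return new ArrayInitParser(List.of(tokens)).arrayInitializer();
    }
}
```

For instance, ArrayInitParser.parse("{", "1", ",", "2", ",", "}") yields a list that prints as [1, 2], and nested initializers come back as nested lists.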


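For now the comments have only been extracted verbatim from the source code; I have not yet checked for extraction errors or omissions. In any case, I am noting it down first and will go through it gradually.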

P.S. The OpenJDK project has a Compiler Grammar project, which includes a grammar file written in ANTLR, Java.g, that is worth consulting.
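Comments
#4 lurker0 2010-02-18
LCC is LL(1).
#3 RednaxelaFX 2010-02-09
lwwin wrote:
LL(1): as I recall, LCC, the only one I have looked at before, is about the same, right? ^^

All I remember now is that LCC's parser is hand-written recursive descent, but I am not sure whether its grammar is LL(1); I do not remember how many tokens of lookahead it uses. How long has it been since I last read the LCC source... a year? I will look it up later.
#2 lwwin 2010-02-08
Is there any way to subscribe to live updates of this?

Looking forward to the perfected version ^^
#1 lwwin 2010-02-08
LL(1): as I recall, LCC, the only one I have looked at before, is about the same, right? ^^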
