Processing Text with ANTLR

powertech


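Reference

Processing Text with Antlr
https://www.ibm.com/developerworks/cn/java/j-lo-antlrtext/index.html
That article is very well written, but it dates from 2011 and differs considerably from current ANTLR versions: its code no longer compiles, and even after getting it to compile, the tests do not produce the correct results. What follows is a rewrite using ANTLR 4.2.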

The new project is built with Maven and Ant and needs the following files:

pom.xml
build.xml
SqlExtrator.g4 (grammar file)
SqlExtrator.clj (test input file)
Test.java (test code)

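How to test:

First run the Ant compile target, which generates the lexer and parser sources from the grammar and then compiles them.

After that, either run the Ant test target, which feeds SqlExtrator.clj to ANTLR's TestRig, or use Test.java to call the parser programmatically.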

pom.xml


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
	<modelVersion>4.0.0</modelVersion>

	<groupId>com.xxx.lang</groupId>
	<artifactId>fieldTypeUpdate</artifactId>
	<version>0.0.1-SNAPSHOT</version>
	<packaging>jar</packaging>

	<name>fieldTypeUpdate</name>
	<url>http://maven.apache.org</url>


	<properties>
		<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
	</properties>

	<dependencies>
		<dependency>
			<groupId>junit</groupId>
			<artifactId>junit</artifactId>
			<version>3.8.1</version>
			<scope>test</scope>
		</dependency>
		<dependency>
			<groupId>org.antlr</groupId>
			<artifactId>antlr4</artifactId>
			<version>4.2</version>
		</dependency>
	</dependencies>

	<build>
		<plugins>
			<plugin>
				<groupId>org.codehaus.mojo</groupId>
				<artifactId>build-helper-maven-plugin</artifactId>
				<version>1.8</version>
				<executions>
					<execution>
						<id>add-source</id>
						<phase>generate-sources</phase>
						<goals>
							<goal>add-source</goal>
						</goals>
						<configuration>
							<sources>
								<source>src/generated/java</source>
							</sources>
						</configuration>
					</execution>
				</executions>
			</plugin>
		</plugins>
	</build>

</project>

build.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <project basedir="." default="test" name="mylang">
    <property environment="env"/>
    <property name="debuglevel" value="source,lines,vars"/>
    <property name="target" value="1.8"/>
    <property name="source" value="1.8"/>
    <property name="language" value="sqlExtrator"/>
    <path id="mylang.classpath">
    	<pathelement location="lib/antlr-2.7.7.jar"/>
		<pathelement location="lib/antlr-runtime-3.5.jar"/>
		<pathelement location="lib/antlr4-4.2.jar"/>
		<pathelement location="lib/antlr4-annotations-4.2.jar"/>
		<pathelement location="lib/antlr4-runtime-4.2.jar"/>
		<pathelement location="lib/junit-3.8.1.jar"/>
		<pathelement location="lib/org.abego.treelayout.core-1.0.1.jar"/>
		<pathelement location="lib/ST4-4.0.7.jar"/>
		<pathelement location="lib/stringtemplate-3.2.1.jar"/>
    </path>
    
    <path id="antlr.classpath">
        <pathelement location="antlr-4.7.1-complete.jar"/>
    </path>
    
    <path id="compile.path">
    	 <pathelement location="target/classes"/>
    </path>
    

    
    <target name="clean">
        <delete dir="target"></delete>
    	<delete dir="src/main/java/com/xxx/lang/mylang/${language}"></delete>
    </target>
    
    <target depends="clean" name="gen">
        <echo message="generate java from g4 file"/>
       <java classname="org.antlr.v4.Tool" fork="yes" failonerror="true">
    				<classpath refid="mylang.classpath"/>
    				<arg value="src/main/resources/SqlExtrator.g4"/>
    				<arg line="-package "/>
    				<arg value="com.xxx.lang.mylang.${language}"/>
    				<arg line="-o "/>
    				<arg value="src/main/java/com/xxx/lang/mylang/${language}/"/>
       				<arg value="-visitor"/>
       				<arg value="-no-listener"/>
       				<arg value="-encoding"/>
       				<arg value="UTF-8"/>
    			</java>
    </target>
    
    <target depends="gen" name="compile">
        <echo message="compile generate java file"/>
         <mkdir dir="target/classes"/>
        <javac debug="true" debuglevel="${debuglevel}" destdir="target/classes" includeantruntime="false" source="${source}" target="${target}">
            <src path="src/main/java"/>
        	<compilerarg line="-encoding UTF-8 "/>
            <classpath refid="mylang.classpath"/>
        </javac>
    </target>

    	
    
    <target name="test"  description="Run the main class" >
    			<java classname="org.antlr.v4.gui.TestRig" fork="yes" failonerror="true">
    				<classpath refid="antlr.classpath"/>
    				<classpath refid="compile.path"/>
    				<sysproperty key="file.encoding" value="UTF-8"/>
    				<arg value="com.xxx.lang.mylang.${language}.SqlExtrator"></arg>
    				<arg value="sql"></arg>
    				<arg value="-gui"></arg>
    				<arg value="src/test/java/SqlExtrator.clj"></arg>
    			</java>
    </target>
    	

</project>

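SqlExtrator.g4 grammar file

This first version of the grammar recognizes only the characters covered by its lexer rules; any other character in the input is reported as an error.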

	

grammar SqlExtrator; 


WS : (' ' |'\t' |'\r' |'\n' )+  ; 
 
INT: '0'..'9' + ;   
  
ID : ('a'..'z' |'A'..'Z' |'_' ) ('a'..'z' |'A'..'Z' |'_' |'0'..'9' )*;

EOL: ('\n' | '\r' | '\r\n')*;

SUCCESS:'DB20000I  The SQL command completed successfully.'EOL  ; 

SqlFrg :'INSERT INTO SYSA.' ID '(' ID ',' ID ')' WS 'VALUES' '(\'' ID '\',\'' INT '\')'EOL ;
 

txt:mysql=SqlFrg {System.out.println($mysql.text);} SUCCESS;

sql:(txt)+;

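The key addition in the second version of the grammar is a single line:
FILTER: .? -> skip;
The helper rules (WS, ID, INT, EOL) also become fragments, so the lexer produces only three kinds of tokens: SqlFrg, SUCCESS and FILTER. FILTER is the last rule and matches at most one character, so SqlFrg and SUCCESS win whenever they match, because the lexer prefers the longest match. Every other character is consumed by FILTER and skipped, which is why arbitrary text between the SQL statements no longer causes errors. Strictly speaking, .? is the wildcard followed by the optional operator. ANTLR writes a non-greedy subrule as .*?, which corresponds to non-greedy matching in regular expressions, summarized in the notes that follow.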

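Reference

Wildcard Operator and Nongreedy Subrules

Greedy and non-greedy modes in regular expressions

1. What greedy and non-greedy matching mean

  For example: String str = "abcaxc";

    Greedy pattern: "ab.*c"    Non-greedy pattern: "ab.*?c"

  Greedy matching: a regular expression normally tends toward the longest possible match. Matching str against "ab.*c" yields abcaxc.

  Non-greedy matching: the match stops as soon as a result is found, consuming as few characters as possible. Matching str against "ab.*?c" yields abc.

2. How to tell the two modes apart in code

  Greedy mode is the default; putting a question mark ? directly after a quantifier makes it non-greedy.

  Quantifiers: {m,n}: m to n times

     *: zero or more

     +: one or more

     ?: zero or one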

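The difference between the two modes can be checked directly with java.util.regex. The sketch below uses the string and patterns from the example above; the class name GreedyDemo and the helper firstMatch are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {

    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "abcaxc";
        // Greedy: .* grabs as much as it can, then backs off to the last 'c'.
        System.out.println(firstMatch("ab.*c", str));   // prints abcaxc
        // Non-greedy: .*? grows one character at a time and stops at the first 'c'.
        System.out.println(firstMatch("ab.*?c", str));  // prints abc
    }
}
```

The only change between the two calls is the ? after the * quantifier.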

	
grammar SqlExtrator; 



SqlFrg :'INSERT INTO SYSA.' ID '(' ID ',' ID ')' WS 'VALUES' '(\'' ID '\',\'' INT '\')' ;
 


fragment WS : (' ' |'\t' |'\r' |'\n' )+  ; 
 
fragment ID: ('a'..'z' |'A'..'Z' |'_' ) ('a'..'z' |'A'..'Z' |'_' |'0'..'9' )*; 

fragment INT: '0'..'9' + ;   

fragment EOL: '\n' | '\r' | '\r\n';

 SUCCESS:'DB20000I  The SQL command completed successfully.' ;



all: (SqlFrg    SUCCESS  {System.out.println($SqlFrg.text);})+ ; 

FILTER: .? -> skip;
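
To see why a catch-all rule that skips one character at a time is enough, the lexer's behavior can be imitated in plain Java without ANTLR. This is only a rough sketch under stated assumptions: the two regular expressions are hand-translated from the SqlFrg and SUCCESS rules, the class name SkipFilterDemo and the method extract are hypothetical, and the parser rule (which additionally requires each SqlFrg to be followed by SUCCESS) is not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipFilterDemo {

    // Regex counterpart of the ID fragment.
    private static final String ID = "[A-Za-z_][A-Za-z_0-9]*";

    // Hand-translated counterpart of the SqlFrg token rule.
    static final Pattern SQL_FRG = Pattern.compile(
            "INSERT INTO SYSA\\." + ID + "\\(" + ID + "," + ID + "\\)"
            + "[ \\t\\r\\n]+VALUES\\('" + ID + "','[0-9]+'\\)");

    // Counterpart of the SUCCESS token rule (a literal string).
    static final Pattern SUCCESS = Pattern.compile(
            Pattern.quote("DB20000I  The SQL command completed successfully."));

    // Scans the input left to right and returns the text of every SqlFrg found.
    static List<String> extract(String input) {
        List<String> found = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            Matcher sql = SQL_FRG.matcher(input).region(pos, input.length());
            Matcher ok = SUCCESS.matcher(input).region(pos, input.length());
            if (sql.lookingAt()) {          // a SqlFrg token starts here
                found.add(sql.group());
                pos = sql.end();
            } else if (ok.lookingAt()) {    // a SUCCESS token starts here
                pos = ok.end();
            } else {
                pos++;                      // FILTER: drop one character and move on
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String input =
                "INSERT INTO SYSA.IF_EMPUSRRLA(USRNUM,EMPNUM) VALUES('U037508','275159') \n"
                + "DB20000I  The SQL command completed successfully. \n\n"
                + "document.write(v+' test is '+result+'</br>');\n\n"
                + "INSERT INTO SYSA.IF_USRSTNRLA(USRNUM,STNNUM) VALUES('U037710','00026') \n"
                + "DB20000I  The SQL command completed successfully.\n";
        for (String sql : extract(input)) {
            System.out.println(sql);
        }
    }
}
```

Run on the sample input, the sketch prints the two INSERT statements and silently drops everything else, including the document.write line.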

SqlExtrator.clj test file


INSERT INTO SYSA.IF_EMPUSRRLA(USRNUM,EMPNUM) VALUES('U037508','275159') 
DB20000I  The SQL command completed successfully. 

document.write(v+' test is '+result+'</br>');//this line causes an error with the first version of the grammar

INSERT INTO SYSA.IF_USRSTNRLA(USRNUM,STNNUM) VALUES('U037710','00026') 
DB20000I  The SQL command completed successfully.

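Test.java test code

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;

// Generated by the Ant "gen" target into the package passed with -package in build.xml.
import com.xxx.lang.mylang.sqlExtrator.SqlExtratorLexer;
import com.xxx.lang.mylang.sqlExtrator.SqlExtratorParser;

public class Test {

	public static void main(String[] args) {
		// Path to the sample input; adjust it to the local checkout.
		String filename = "D:\\workplace\\fieldTypeUpdate\\src\\test\\java\\SqlExtrator.clj";
		try (InputStream in = new FileInputStream(filename)) {
			// Character stream -> lexer -> token stream -> parser
			ANTLRInputStream input = new ANTLRInputStream(in);
			SqlExtratorLexer lexer = new SqlExtratorLexer(input);
			CommonTokenStream tokens = new CommonTokenStream(lexer);
			SqlExtratorParser parser = new SqlExtratorParser(tokens);

			// "sql" is the start rule of the first grammar; the second grammar names it "all".
			parser.sql();
			System.out.println("done!");
		} catch (FileNotFoundException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		}
	}

}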

Console output of the test run:

INSERT INTO SYSA.IF_EMPUSRRLA(USRNUM,EMPNUM) VALUES('U037508','275159')

INSERT INTO SYSA.IF_USRSTNRLA(USRNUM,STNNUM) VALUES('U037710','00026')

done!

2019-08-28 17:32