jojo_java
Clear Special Characters

    Blog category:
  • JAVA
 

 

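	// Imports needed at the top of the enclosing class file
	// (the org.wltea packages are the IK Analyzer 2012 names).
	import java.io.IOException;
	import java.io.StringReader;
	import java.util.ArrayList;
	import java.util.List;
	import org.wltea.analyzer.core.IKSegmenter;
	import org.wltea.analyzer.core.Lexeme;

	// Segments the input text into words with IK Analyzer.
	public static String[] analyzer(String string) {
		List<String> list = new ArrayList<String>();
		try {
			StringReader reader = new StringReader(string);
			// Second argument true selects smart (coarse-grained) segmentation.
			IKSegmenter ik = new IKSegmenter(reader, true);
			Lexeme lexeme = null;
			// next() returns null once the input is exhausted.
			while ((lexeme = ik.next()) != null) {
				list.add(lexeme.getLexemeText());
			}
		} catch (IOException e) {
			e.printStackTrace();
		}
		return list.toArray(new String[list.size()]);
	}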

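	// Splits a free-form tag string into individual non-empty tags.
	public static String[] generate(String string) {
		List<String> list = new ArrayList<String>();
		// Punctuation and symbols become spaces first, so commas are already gone here.
		string = clear_special_character(string);
		String[] tags = string.split("[,\\s]");
		for (String tag : tags) {
			tag = tag.trim();
			// Skip the empty pieces produced by leading or repeated separators.
			if (tag.length() > 0) {
				list.add(tag);
			}
		}
		return list.toArray(new String[list.size()]);
	}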

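	// Replaces punctuation and symbols with spaces, then collapses whitespace runs.
	public static String clear_special_character(String string) {
		// \pP matches any Unicode punctuation character, \pS any Unicode symbol.
		string = string.replaceAll("\\pP|\\pS", " ");
		// The result is not trimmed, so a leading or trailing space can remain.
		string = string.replaceAll("\\s+", " ");
		return string;
	}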
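
To see what the cleaning and tag-splitting steps above do to real input, here is a minimal self-contained sketch that runs without IK Analyzer. The class name `TagCleanerDemo` and the camel-case method names are hypothetical and not from the original code; the character class `[\p{P}\p{S}]` is assumed to be an equivalent way of writing `\pP|\pS`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; only the regex logic mirrors the listing above.
public class TagCleanerDemo {

    // Replace every Unicode punctuation or symbol character with a space,
    // then collapse runs of whitespace into a single space (no trimming).
    public static String clearSpecialCharacters(String input) {
        String cleaned = input.replaceAll("[\\p{P}\\p{S}]", " ");
        return cleaned.replaceAll("\\s+", " ");
    }

    // Clean the input, split it on whitespace, and keep the non-empty pieces.
    public static String[] generate(String input) {
        List<String> tags = new ArrayList<>();
        for (String tag : clearSpecialCharacters(input).split("\\s+")) {
            if (!tag.isEmpty()) {
                tags.add(tag);
            }
        }
        return tags.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Brackets make the remaining trailing space visible.
        // prints [Hello world java regex ]
        System.out.println("[" + clearSpecialCharacters("Hello, world! (java+regex)") + "]");
        // prints [java, regex, tag, cloud]
        System.out.println(Arrays.toString(generate("java, regex; tag-cloud")));
    }
}
```

Note that a hyphen counts as punctuation, so `tag-cloud` is split into two tags; callers who want to keep hyphenated tags would need to exclude that character from the pattern.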
	