Reposted from: http://lbs.iteye.com/blog/208056
// These imports belong at the top of the enclosing source file.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String Html2Text(String inputString) {
    String htmlStr = inputString; // string that contains HTML tags
    String textStr = "";
    Pattern p_script;
    Matcher m_script;
    Pattern p_style;
    Matcher m_style;
    Pattern p_html;
    Matcher m_html;
    try {
        // Regex for script blocks (or, more simply: <script[^>]*?>[\\s\\S]*?<\\/script>)
        String regEx_script = "<[\\s]*?script[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?script[\\s]*?>";
        // Regex for style blocks (or, more simply: <style[^>]*?>[\\s\\S]*?<\\/style>)
        String regEx_style = "<[\\s]*?style[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?style[\\s]*?>";
        // Regex for any remaining HTML tag
        String regEx_html = "<[^>]+>";
        p_script = Pattern.compile(regEx_script, Pattern.CASE_INSENSITIVE);
        m_script = p_script.matcher(htmlStr);
        htmlStr = m_script.replaceAll(""); // remove script blocks, including their contents
        p_style = Pattern.compile(regEx_style, Pattern.CASE_INSENSITIVE);
        m_style = p_style.matcher(htmlStr);
        htmlStr = m_style.replaceAll(""); // remove style blocks, including their contents
        p_html = Pattern.compile(regEx_html, Pattern.CASE_INSENSITIVE);
        m_html = p_html.matcher(htmlStr);
        htmlStr = m_html.replaceAll(""); // remove all other HTML tags
        textStr = htmlStr;
    } catch (Exception e) {
        // On failure (for example, a null input), log the error and fall through to return "".
        System.err.println("Html2Text: " + e.getMessage());
    }
    return textStr; // return the plain-text string
}
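For reference, here is a minimal, self-contained sketch of the same three-step stripping (script blocks, then style blocks, then any remaining tags) with a small usage example. The class name HtmlStripper, the method name html2Text, the precompiled static patterns, and the explicit null check are assumptions made for illustration; they are not part of the original post.

```java
import java.util.regex.Pattern;

public class HtmlStripper {
    // Same expressions as in the method above, compiled once and reused (an assumption of this sketch).
    private static final Pattern SCRIPT = Pattern.compile(
            "<[\\s]*?script[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?script[\\s]*?>",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern STYLE = Pattern.compile(
            "<[\\s]*?style[^>]*?>[\\s\\S]*?<[\\s]*?\\/[\\s]*?style[\\s]*?>",
            Pattern.CASE_INSENSITIVE);
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    // Hypothetical helper: returns the input with script blocks, style blocks, and tags removed.
    public static String html2Text(String input) {
        if (input == null) {
            return ""; // explicit guard instead of relying on a caught exception
        }
        String text = SCRIPT.matcher(input).replaceAll(""); // drop script blocks and their contents
        text = STYLE.matcher(text).replaceAll("");          // drop style blocks and their contents
        return TAG.matcher(text).replaceAll("");            // drop every remaining tag
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p { color: red; }</style>"
                + "<script type=\"text/javascript\">alert('hi');</script></head>"
                + "<body><p>Hello, <b>world</b>!</p></body></html>";
        System.out.println(html2Text(html)); // prints "Hello, world!"
    }
}
```

Note that this approach only removes markup: character entities such as &nbsp; or &amp; are left in the output as-is, and a literal ">" inside an attribute value would end the tag match early. For untrusted or irregular HTML, a real HTML parser is likely a safer choice than regular expressions.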