Converting Chinese Text to Unicode Escapes

lz12366

Original article:
http://www.cnitblog.com/neatstudio/archive/2006/07/28/14315.html
JavaScript:

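var mode = "zhuan"; // "zhuan": the next click converts; "huan": the next click restores

function encode(obj, btn) {
   if (mode == "zhuan") {
       // Replace every character outside Latin-1 with a &#xXXXX; reference.
       obj.value = obj.value.replace(/[^\u0000-\u00FF]/g, function ($0) {
           return "&#x" + ("000" + $0.charCodeAt(0).toString(16)).slice(-4).toUpperCase() + ";";
       });
       btn.value = "Restore";
       mode = "huan";
   } else {
       // Turn each &#xXXXX; reference back into its character.
       // Semicolons that are not part of a reference are left untouched.
       obj.value = obj.value.replace(/&#x([0-9a-f]{4});/gi, function ($0, hex) {
           return String.fromCharCode(parseInt(hex, 16));
       });
       btn.value = "Convert";
       mode = "zhuan";
   }
}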

Java:

// Convert to Unicode escapes

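// Requires: java.io.DataOutputStream, java.io.IOException, java.nio.charset.StandardCharsets
public static void writeUnicode(final DataOutputStream out, final String value) throws IOException {
  final String unicode = gbEncoding( value );
  // gbEncoding returns ASCII only, so a fixed charset keeps the byte count platform-independent
  final byte[] data = unicode.getBytes( StandardCharsets.US_ASCII );
  final int dataLength = data.length;

  System.out.println( "Data Length is: " + dataLength );
  System.out.println( "Data is: " + value );
  out.writeInt( dataLength ); // write the byte length first
  out.write( data, 0, dataLength ); // then write the escaped string itself
}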
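
writeUnicode writes a 4-byte length followed by the escaped bytes, so a reader has to consume the data in the same order. The following is a minimal sketch of that pairing, assuming an in-memory stream for the demonstration. The class LengthPrefixedDemo and the methods writeRecord and readRecord are hypothetical names, not part of the original post.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are not from the original post.
class LengthPrefixedDemo {

    // Writes a 4-byte big-endian length, then the ASCII bytes of the text.
    static void writeRecord(DataOutputStream out, String asciiText) throws IOException {
        byte[] data = asciiText.getBytes(StandardCharsets.US_ASCII);
        out.writeInt(data.length);
        out.write(data, 0, data.length);
    }

    // Reads the length first, then exactly that many bytes.
    static String readRecord(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] data = new byte[length];
        in.readFully(data); // blocks until every byte has arrived, or throws EOFException
        return new String(data, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The literal below is already-escaped text: two groups of six ASCII characters.
        writeRecord(new DataOutputStream(sink), "\\u4e2d\\u6587");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(sink.toByteArray()));
        System.out.println(readRecord(in));
    }
}
```

readFully is used rather than read because a single read call may return fewer bytes than requested on a socket or pipe.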

 
public static String gbEncoding( final String gbString ) {
  final char[] chars = gbString.toCharArray();
  final StringBuilder unicodeBytes = new StringBuilder( chars.length * 6 );
  for( int index = 0; index < chars.length; index++ ) {
    // always emit four hex digits so the decoder can rely on a fixed width
    unicodeBytes.append( "\\u" ).append( String.format( "%04x", (int) chars[ index ] ) );
  }
  System.out.println( "unicodeBytes is: " + unicodeBytes );
  return unicodeBytes.toString();
}


/**
 * Purpose: converts a string of backslash-u escapes back into text.
 * Input: the escaped source string, as produced by gbEncoding.
 * Output: the decoded string.
 */
private String decodeUnicode( final String dataStr ) {
  final StringBuilder buffer = new StringBuilder();
  int start = dataStr.indexOf( "\\u" );
  while( start > -1 ) {
    // the hex digits run from just after this prefix to the next prefix (or the end)
    final int end = dataStr.indexOf( "\\u", start + 2 );
    final String charStr = ( end == -1 )
        ? dataStr.substring( start + 2 )
        : dataStr.substring( start + 2, end );
    buffer.append( (char) Integer.parseInt( charStr, 16 ) ); // parse the hex digits as one UTF-16 code unit
    start = end;
  }
  return buffer.toString();
}
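
The encoder and decoder above only agree with each other if every escape has the same width. The following is a minimal round-trip sketch under that assumption (exactly four hex digits per UTF-16 code unit). EscapeRoundTrip, encode, and decode are hypothetical names, and the decoder is a stricter, fixed-width variant rather than the post's own method.

```java
// Hypothetical demo class; the names are illustrative, not from the original post.
class EscapeRoundTrip {

    // Encodes every UTF-16 code unit as a backslash, the letter u, and four hex digits.
    static String encode(String text) {
        StringBuilder out = new StringBuilder(text.length() * 6);
        for (int i = 0; i < text.length(); i++) {
            out.append(String.format("\\u%04x", (int) text.charAt(i)));
        }
        return out.toString();
    }

    // Decodes a string that consists only of six-character escape groups.
    static String decode(String escaped) {
        if (escaped.length() % 6 != 0) {
            throw new IllegalArgumentException("input is not a sequence of 6-character groups");
        }
        StringBuilder out = new StringBuilder(escaped.length() / 6);
        for (int i = 0; i < escaped.length(); i += 6) {
            if (escaped.charAt(i) != '\\' || escaped.charAt(i + 1) != 'u') {
                throw new IllegalArgumentException("missing escape prefix at index " + i);
            }
            out.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Two Chinese characters written as compiler-level escapes, so the file
        // compiles the same way regardless of the source file encoding.
        String original = "\u4e2d\u6587 abc";
        String escaped = encode(original);
        System.out.println(escaped);
        System.out.println(decode(escaped).equals(original));
    }
}
```

Because the unit is a UTF-16 code unit, a character outside the Basic Multilingual Plane becomes two escape groups (a surrogate pair) and still round-trips.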

JSP:

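/** ToUnicode.java */
package com.edgewww.util;

import java.io.UnsupportedEncodingException;

/**
 * Converts a string into hexadecimal numeric character references.
 * @author 栾金奎 jsp@shanghai.com
 * @date 2001-03-05
 */
public class ToUnicode {

    /**
     * Converts a string into hexadecimal numeric character references.
     * @param strText the string to convert; it is assumed to hold bytes of the
     *                given encoding that were decoded as ISO-8859-1
     * @param code the encoding of the string before conversion, e.g. "GBK"
     * @return the string with every non-ASCII character written as &#xXXXX;
     */
    public String toUnicode(String strText, String code) throws UnsupportedEncodingException {
        StringBuilder strRet = new StringBuilder();
        // Undo the ISO-8859-1 decoding to recover the real characters.
        strText = new String(strText.getBytes("ISO-8859-1"), code);
        for (int i = 0; i < strText.length(); i++) {
            char c = strText.charAt(i);
            if (c > 127) {
                // non-ASCII: emit a hexadecimal character reference
                strRet.append("&#x").append(Integer.toHexString(c)).append(';');
            } else {
                strRet.append(c);
            }
        }
        return strRet.toString();
    }

}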
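
The ToUnicode class only converts in one direction. The following is a minimal sketch of the reverse step, assuming the input uses hexadecimal references of the form produced above. FromUnicode and fromUnicode are hypothetical names and are not part of the original class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical companion to ToUnicode; not part of the original post.
class FromUnicode {

    // Matches hexadecimal numeric character references such as &#x4e2d;
    private static final Pattern HEX_REFERENCE = Pattern.compile("&#x([0-9a-fA-F]{1,6});");

    static String fromUnicode(String text) {
        Matcher matcher = HEX_REFERENCE.matcher(text);
        StringBuilder out = new StringBuilder(text.length());
        int last = 0;
        while (matcher.find()) {
            out.append(text, last, matcher.start()); // copy the text before the reference
            int codePoint = Integer.parseInt(matcher.group(1), 16);
            if (Character.isValidCodePoint(codePoint)) {
                out.appendCodePoint(codePoint);
            } else {
                out.append(matcher.group()); // leave out-of-range references untouched
            }
            last = matcher.end();
        }
        out.append(text, last, text.length()); // copy whatever follows the last reference
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromUnicode("&#x4e2d;&#x6587; and plain text; with a semicolon"));
    }
}
```

Matching whole references, rather than stripping every semicolon, keeps ordinary punctuation in the surrounding text intact.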

<%-- Usage example --%>
<%-- gbk2Unicode.jsp --%>
<meta http-equiv="Content-Type" content="text/html; charset=big5"> 
<jsp:useBean id="g2u" scope="session" class="com.edgewww.util.ToUnicode"/> 
<% String lang = "这是简体中文"; %> 
<br> 
<%=lang %> 
<br> 
<%=g2u.toUnicode(lang,"GBK") %>
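
The first step of toUnicode re-encodes the string as ISO-8859-1 and decodes the bytes again with the real charset. That step only helps when the string was wrongly decoded as ISO-8859-1 in the first place. The following is a small sketch of both cases. RedecodeDemo and redecode are hypothetical names, and UTF-8 stands in for GBK here on the assumption that not every Java runtime ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only illustrates the re-decoding step.
class RedecodeDemo {

    // Recovers text whose bytes were wrongly decoded as ISO-8859-1.
    static String redecode(String misdecoded, Charset actual) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), actual);
    }

    public static void main(String[] args) {
        Charset actual = StandardCharsets.UTF_8; // stand-in for "GBK"
        String original = "\u4e2d\u6587";         // two Chinese characters

        // Simulate a container that decoded the raw bytes as ISO-8859-1.
        byte[] wire = original.getBytes(actual);
        String misdecoded = new String(wire, StandardCharsets.ISO_8859_1);

        // Case 1: the string really was mis-decoded, so the text is recovered.
        System.out.println(redecode(misdecoded, actual).equals(original)); // prints true

        // Case 2: the string was already correct; ISO-8859-1 cannot represent
        // the characters, so each one is replaced by '?' and the text is lost.
        System.out.println(redecode(original, actual).equals(original));   // prints false
    }
}
```

Whether the literal in the JSP page above falls into the first or second case depends on how the container compiled and decoded the page, which the original post does not state.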

Posted: 2010-05-28 16:37