Fixing garbled text when Java reads a UTF-8 file with a BOM

messon619

Blog category:

Java Basics

Java F#

Java's default I/O readers do not recognize all BOM markers. This is said to be fixed in JDK 6, but I haven't tested it yet. You can use the UnicodeReader class to work around the problem and auto-recognize BOM markers. It wraps the underlying input stream transparently.
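The UnicodeReader source is not shown in this post, and the class is not part of the JDK. To illustrate the general idea of BOM detection, here is a minimal sketch under stated assumptions: the class name BomSniffer and its methods open and readAll are hypothetical, it only handles the UTF-8 and UTF-16 markers (UTF-32 is left out), and it is not the actual UnicodeReader implementation. It peeks at the first three bytes through a PushbackInputStream, picks a charset if a BOM is found, and pushes back any bytes that were not part of a marker.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; NOT the real UnicodeReader source.
class BomSniffer {

    /** Returns a Reader over the stream, skipping a leading UTF-8/UTF-16 BOM if present. */
    static Reader open(InputStream in, Charset fallback) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = readUpTo(pb, head);

        Charset cs = fallback;   // used when no marker is found
        int skip = 0;            // number of BOM bytes to drop
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            cs = StandardCharsets.UTF_8;
            skip = 3;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            cs = StandardCharsets.UTF_16BE;
            skip = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            cs = StandardCharsets.UTF_16LE;
            skip = 2;
        }
        // Give back every byte that was read but is not part of the marker.
        if (n > skip) {
            pb.unread(head, skip, n - skip);
        }
        return new InputStreamReader(pb, cs);
    }

    /** Reads until the buffer is full or the stream ends; returns the byte count. */
    private static int readUpTo(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int r = in.read(buf, total, buf.length - total);
            if (r < 0) break;
            total += r;
        }
        return total;
    }

    /** Drains a Reader into a String (small helper for the demo below). */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i' };
        // A plain InputStreamReader keeps the BOM as a leading U+FEFF character.
        String plain = readAll(new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8));
        String sniffed = readAll(open(
                new ByteArrayInputStream(data), Charset.defaultCharset()));
        System.out.println(plain.length() + " vs " + sniffed.length());
    }
}
```

The pushback buffer is sized to the longest marker checked here, so the stream is never read further ahead than necessary.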

Example code using the UnicodeReader class
Here is an example method that reads a text file. It recognizes the BOM marker and skips it while reading.

   import java.io.BufferedReader;
   import java.io.CharArrayWriter;
   import java.io.FileInputStream;
   import java.io.IOException;

   public static char[] loadFile(String file) throws IOException {
      // read text file, auto recognize bom marker or use
      // system default if markers not found (null default encoding).
      char[] buffer = new char[16 * 1024];   // 16k buffer
      int read;
      // try-with-resources closes the whole reader chain, even when a read
      // fails, and never calls close() on a null reference
      try (BufferedReader reader = new BufferedReader(
               new UnicodeReader(new FileInputStream(file), null));
           CharArrayWriter writer = new CharArrayWriter()) {
         while ((read = reader.read(buffer)) != -1) {
            writer.write(buffer, 0, read);
         }
         return writer.toCharArray();
      }
   }

Example code to write UTF-8 with a BOM marker
Write the BOM marker bytes at the start of an empty file, and any decent text editor will pick the correct charset when reading it. Java's OutputStreamWriter does not write the UTF-8 BOM marker bytes itself.

   import java.io.BufferedWriter;
   import java.io.File;
   import java.io.FileOutputStream;
   import java.io.IOException;
   import java.io.OutputStreamWriter;
   import java.nio.charset.StandardCharsets;

   public static void saveFile(String file, String data, boolean append) throws IOException {
      File f = new File(file);
      // try-with-resources flushes and closes the writer chain; a failed
      // flush is reported to the caller instead of being swallowed
      try (FileOutputStream fos = new FileOutputStream(f, append);
           BufferedWriter bw = new BufferedWriter(
               new OutputStreamWriter(fos, StandardCharsets.UTF_8))) {
         // write UTF8 BOM mark if file is empty
         if (f.length() < 1) {
            final byte[] bom = new byte[] { (byte)0xEF, (byte)0xBB, (byte)0xBF };
            fos.write(bom);
         }
         if (data != null) bw.write(data);
      }
   }
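To check the claim about OutputStreamWriter without touching the file system, the following sketch encodes a string in memory with and without the marker bytes and inspects the result. The class name BomWriteCheck and its methods encode and startsWithBom are hypothetical names chosen for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class BomWriteCheck {

    static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };

    /** Encodes text as UTF-8, optionally prefixed with the three BOM bytes. */
    static byte[] encode(String text, boolean withBom) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (withBom) {
            // Raw marker bytes go to the stream before any characters are written.
            out.write(UTF8_BOM);
        }
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(text);
        }
        return out.toByteArray();
    }

    /** True if the data begins with EF BB BF. */
    static boolean startsWithBom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("writer alone: " + startsWithBom(encode("hi", false)));
        System.out.println("with marker:  " + startsWithBom(encode("hi", true)));
    }
}
```

The marker is written to the underlying stream rather than through the writer, the same ordering saveFile uses, so the bytes land in the output before any buffered characters are flushed.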


2011-03-02 11:12