Reading LZO-compressed files in Java

adofu

Blog category: Big Data

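Today I made a silly mistake: I tried to read data that had been compressed with LzoCodec using LzopCodec, and it cost me a painful while. The two codecs produce different file formats, so the wrong one cannot decode the data.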

A quick introduction to LzoCodec and LzopCodec:

1. Files compressed with LzoCodec end in .lzo_deflate; the corresponding class is com.hadoop.compression.lzo.LzoCodec.
2. Files compressed with LzopCodec end in .lzo; the corresponding class is com.hadoop.compression.lzo.LzopCodec.
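
To avoid mixing the two up, the mapping above can be written as a small helper that picks the codec class name from the file extension. This is only a minimal sketch: the class LzoCodecChooser and the method codecClassFor are hypothetical names made up for illustration and are not part of hadoop-lzo. Only the two class names and the two extensions come from the list above.

```java
// Hypothetical helper (not part of hadoop-lzo): maps a file name to the
// codec class name that should be used to read it.
final class LzoCodecChooser {
    static final String LZO_CODEC = "com.hadoop.compression.lzo.LzoCodec";
    static final String LZOP_CODEC = "com.hadoop.compression.lzo.LzopCodec";

    private LzoCodecChooser() {
    }

    // Returns the codec class name for the given file name,
    // or null when the extension is neither .lzo_deflate nor .lzo.
    static String codecClassFor(String fileName) {
        if (fileName.endsWith(".lzo_deflate")) {
            return LZO_CODEC;
        }
        if (fileName.endsWith(".lzo")) {
            return LZOP_CODEC;
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(codecClassFor("part-00000.lzo_deflate"));
        System.out.println(codecClassFor("part-00000.lzo"));
    }
}
```

The returned name could then be passed to Class.forName in place of a hard-coded string, so each file is read with the codec that matches its extension.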

Reading a file compressed with LzoCodec

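import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.LinkedList;
import java.util.List;

import org.apache.commons.io.IOUtils; // readLines comes from commons-io, not Hadoop's IOUtils
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class LzoFileReader {
    private static final Configuration conf = new Configuration(true);
    private static final CompressionCodec codec;

    static {
        String path = "/usr/local/webserver/hadoop/etc/hadoop/";
        conf.addResource(new Path(path + "core-site.xml"));
        conf.addResource(new Path(path + "hdfs-site.xml"));
        try {
            // Load the class that decompresses .lzo_deflate files; there is a matching LzopCodec class for .lzo files
            Class<?> codecClass = Class.forName("com.hadoop.compression.lzo.LzoCodec");
            codec = (CompressionCodec) ReflectionUtils.newInstance(codecClass, conf);
        } catch (ClassNotFoundException e) {
            // Class.forName throws a checked exception, which a static block must handle
            throw new ExceptionInInitializerError(e);
        }
    }

    public List<String> readFile(String dir) {
        List<String> list = new LinkedList<String>();
        Path path = new Path(dir);
        // The FileSystem is closed when done, as in the original version;
        // note that FileSystem.get may return a cached instance shared with other code
        try (FileSystem hdfs = FileSystem.get(URI.create(dir), conf)) {
            // List the files under the HDFS directory
            FileStatus[] fileStatus = hdfs.listStatus(path);
            // Walk through the files and read each one in turn
            for (FileStatus status : fileStatus) {
                // Open the raw stream, wrap it in the decompression stream, and close both after each file
                try (InputStream raw = hdfs.open(status.getPath());
                     InputStream input = codec.createInputStream(raw)) {
                    list.addAll(IOUtils.readLines(input, "UTF-8"));
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return list;
    }
}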
