`

hadoop学习--HDFS

阅读更多
hadoop fs  -ls /
hdfs dfs -ls / #操作命令

1、架构

下图表示/test/a.log这个文件保存3个副本,该文件有blk_1,blk_2两个块,
第一个块保存在h0,h1,h3这3个服务器中,
第二个块保存在h0,h2,h4这3个服务器中。。


2、HDFS基础数据
NameNode是整个文件系统的管理节点;它维护着整个文件系统的文件目录树,文件/目录的元信息和每个文件对应的数据块列表;接收用户的操作请求。
以下这些文件是HDFS的关键元素,,,,并且保存在linux的文件系统中:
fsimage:元数据镜像文件,存储某一时段NameNode内存元数据信息
edits:操作日志文件
fstime:保存最近一次checkpoint的时间

a、Namenode始终在内存中保存metedata,用于处理“读请求”

b、当有“写请求”到来时,namenode会首先写editlog到磁盘,即向edits文件中写日志,成功返回后,才会修改内存,并且向客户端返回

c、Hadoop会维护一个fsimage文件,也就是namenode中metedata的镜像,但是fsimage不会随时与namenode内存中的metedata保持一致,而是每隔一段时间通过合并edits文件来更新内容

d、Secondary namenode就是用来合并fsimage和edits文件来更新NameNode的metedata的。




HDFS--CRUD
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.junit.Before;
import org.junit.Test;


public class DfsTest {
    
    private FileSystem fs = null;
    private Path f = new Path("/words");
    
    @Before
    public void before(){
        
        try {
            fs = FileSystem.get(new URI("hdfs://jzk:9000"), new Configuration(), "jzk");
        } catch (IOException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } catch (URISyntaxException e) {
            e.printStackTrace();
        }
    }
    
    @Test
    public void upload(){
        
        try {
            FSDataOutputStream out = fs.create(f, (short)1);
            FileInputStream in = new FileInputStream(new File("/home/jzk/tmp/words"));
            IOUtils.copyBytes(in, out, 4096, true);
        } catch (IllegalArgumentException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    
    @Test
    public void download(){
        
        try {
            FSDataInputStream in = fs.open(f);
            FileOutputStream out = new FileOutputStream(new File("/home/jzk/tmp/words1"));
            IOUtils.copyBytes(in, out, 4096, true);
        } catch (IllegalArgumentException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    
    @Test
    public void append(){
        
        try {
            FSDataOutputStream out = fs.exists(f) ? fs.append(f, 4096) : fs.create(f, (short)1);
            FileInputStream in = new FileInputStream(new File("/home/jzk/tmp/words"));
            IOUtils.copyBytes(in, out, 4096, true);
        } catch (IllegalArgumentException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    
    @Test
    public void del(){
        
        try {
            fs.delete(f, true);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}


  • 大小: 124.4 KB
  • 大小: 159.2 KB
  • 大小: 195 KB
  • 大小: 195 KB
分享到:
评论

相关推荐

    hadoop-hdfs-client-2.9.1-API文档-中文版.zip

    赠送jar包:hadoop-hdfs-client-2.9.1.jar 赠送原API文档:hadoop-hdfs-client-2.9.1-javadoc.jar 赠送源代码:hadoop-hdfs-client-2.9.1-sources.jar 包含翻译后的API文档:hadoop-hdfs-client-2.9.1-javadoc-...

    hadoop-hdfs-client-2.9.1-API文档-中英对照版.zip

    赠送jar包:hadoop-hdfs-client-2.9.1.jar; 赠送原API文档:hadoop-hdfs-client-2.9.1-javadoc.jar; 赠送源代码:hadoop-hdfs-client-2.9.1-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-client-2.9.1.pom;...

    hadoop-hdfs-2.7.3-API文档-中英对照版.zip

    赠送jar包:hadoop-hdfs-2.7.3.jar; 赠送原API文档:hadoop-hdfs-2.7.3-javadoc.jar; 赠送源代码:hadoop-hdfs-2.7.3-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.7.3.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.5.1-API文档-中文版.zip

    赠送jar包:hadoop-hdfs-2.5.1.jar; 赠送原API文档:hadoop-hdfs-2.5.1-javadoc.jar; 赠送源代码:hadoop-hdfs-2.5.1-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.5.1.pom; 包含翻译后的API文档:hadoop...

    hadoop最新版本3.1.1全量jar包

    hadoop-auth-3.1.1.jar hadoop-hdfs-3.1.1.jar hadoop-mapreduce-client-hs-3.1.1.jar hadoop-yarn-client-3.1.1.jar hadoop-client-api-3.1.1.jar hadoop-hdfs-client-3.1.1.jar hadoop-mapreduce-client-jobclient...

    hadoop-hdfs-2.6.5-API文档-中文版.zip

    赠送jar包:hadoop-hdfs-2.6.5.jar; 赠送原API文档:hadoop-hdfs-2.6.5-javadoc.jar; 赠送源代码:hadoop-hdfs-2.6.5-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.6.5.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.7.3-API文档-中文版.zip

    赠送jar包:hadoop-hdfs-2.7.3.jar; 赠送原API文档:hadoop-hdfs-2.7.3-javadoc.jar; 赠送源代码:hadoop-hdfs-2.7.3-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.7.3.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.9.1-API文档-中文版.zip

    赠送jar包:hadoop-hdfs-2.9.1.jar 赠送原API文档:hadoop-hdfs-2.9.1-javadoc.jar 赠送源代码:hadoop-hdfs-2.9.1-sources.jar 包含翻译后的API文档:hadoop-hdfs-2.9.1-javadoc-API文档-中文(简体)版.zip 对应...

    hadoop-hdfs-2.6.5-API文档-中英对照版.zip

    赠送jar包:hadoop-hdfs-2.6.5.jar; 赠送原API文档:hadoop-hdfs-2.6.5-javadoc.jar; 赠送源代码:hadoop-hdfs-2.6.5-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.6.5.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.5.1-API文档-中英对照版.zip

    赠送jar包:hadoop-hdfs-2.5.1.jar; 赠送原API文档:hadoop-hdfs-2.5.1-javadoc.jar; 赠送源代码:hadoop-hdfs-2.5.1-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.5.1.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.9.1-API文档-中英对照版.zip

    赠送jar包:hadoop-hdfs-2.9.1.jar; 赠送原API文档:hadoop-hdfs-2.9.1-javadoc.jar; 赠送源代码:hadoop-hdfs-2.9.1-sources.jar; 赠送Maven依赖信息文件:hadoop-hdfs-2.9.1.pom; 包含翻译后的API文档:hadoop...

    hadoop-hdfs-2.4.1.jar

    hadoop-hdfs-2.4.1.jar

    hadoop-hdfs-2.7.3

    hadoop-hdfs-2.7.3搭建flume1.7需要用到的包,还有几个包也有提供

    hadoop-hdfs-2.7.7.jar

    flume 想要将数据输出到hdfs,必须要有hadoop相关jar包。本资源是hadoop 2.7.7版本

    hadoop-hdfs-2.2.0.jar

    hadoop-hdfs-2.2.0.jar 点击下载资源即表示您确认该资源不违反资源分享的使用条款

    I001-hadoophdfs-mkdirs.7z

    标题"I001-hadoophdfs-mkdirs.7z"指向的是一个关于Hadoop HDFS(Hadoop Distributed File System)操作的压缩包文件,特别是关于创建目录(mkdirs)的教程或参考资料。Hadoop是Apache软件基金会开发的一个开源框架,...

    hadoop-hdfs-2.7.2.jar

    有时候需要查看修改或者回复hdfs的默认配置,在这个jar包里面,可以把hadoop-default.xml拿出来

    hadoop-client-2.6.5.jar

    hadoop-client-2.6.5.jar

    hadoop-hdfs-test-0.21.0.jar

    hadoop-hdfs-test-0.21.0.jar

    hadoop插件apache-hadoop-3.1.0-winutils-master.zip

    4. **配置文件**:修改`conf/core-site.xml`和`conf/hdfs-site.xml`配置文件,定义HDFS的相关参数,如默认的文件系统、NameNode地址等。 5. **启动Hadoop服务**:通过`sbin`目录下的脚本启动Hadoop的各个服务,如`...

Global site tag (gtag.js) - Google Analytics