Chapter 7: Xiao Zhu's Notes on Hadoop Source Code Analysis, HDFS Analysis. Section 4: NameNode Analysis, the Format Process

huashuizhuhui

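Chapter 7: Xiao Zhu's Notes on Hadoop Source Code Analysis, HDFS Analysis

Section 4: NameNode Analysis

4.2 The NameNode format process

Formatting the namenode is a required step before the Hadoop distributed file system is used for the first time. If this step is skipped, the distributed file system cannot start correctly.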

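(1) Starting the format

As in the earlier analysis of the start-dfs.sh script, the -format argument given on the command line is passed through to the main class. For the namenode, that class is org.apache.hadoop.hdfs.server.namenode.NameNode.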
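
To make the hand-off concrete, here is a minimal sketch of how a main class might branch on the -format flag. It is not the NameNode source: the class StartupDispatchSketch, the enum Option and the returned strings are hypothetical names used only for illustration.

```java
/** Hypothetical sketch: dispatching on a "-format" command-line flag. */
class StartupDispatchSketch {
    /** Illustrative startup modes; not the real Hadoop enum. */
    enum Option { FORMAT, REGULAR }

    /** Returns FORMAT when the first argument is "-format" (case-insensitive), otherwise REGULAR. */
    static Option parse(String[] args) {
        if (args.length > 0 && "-format".equalsIgnoreCase(args[0])) {
            return Option.FORMAT;
        }
        return Option.REGULAR;
    }

    /** Describes the action a main class would take for the given arguments. */
    static String run(String[] args) {
        switch (parse(args)) {
            case FORMAT:
                return "format the storage directories, then exit";
            default:
                return "start the namenode service";
        }
    }

    public static void main(String[] args) {
        System.out.println(run(args));
    }
}
```

The point of the sketch is only that formatting is a one-shot action selected by a startup argument, separate from the normal service start.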

(2) NameNode.format: formatting HDFS

This is the main part of the whole format flow. It involves FSNamesystem and FSImage, two classes closely tied to the HDFS file system.

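    // Read from the configuration the directories that hold the fsimage file (fsimage is covered separately later in this article)
    Collection<File> dirsToFormat = FSNamesystem.getNamespaceDirs(conf);
    // Read from the configuration the directories that hold the edits file (edits is covered separately later in this article)
    Collection<File> editDirsToFormat = FSNamesystem.getNamespaceEditsDirs(conf);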

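The fsimage directories are then checked. If a directory already exists, the user must confirm whether it should be reformatted; if the user declines, the format exits.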
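
The check described above can be sketched as follows. This is an illustration under assumptions rather than the NameNode code: FormatConfirmSketch and isAborted are hypothetical names, the prompt wording is made up, and the answer is read through a Scanner so the logic can be exercised without a console.

```java
import java.io.File;
import java.util.Collection;
import java.util.Scanner;

/** Hypothetical sketch of the "confirm before reformatting" check. */
class FormatConfirmSketch {
    /**
     * Returns true when the format should be aborted, i.e. when an existing
     * directory needs confirmation and the answer is anything other than "Y".
     */
    static boolean isAborted(Collection<File> dirs, boolean confirmationNeeded, Scanner answers) {
        for (File dir : dirs) {
            // Directories that do not exist yet can be formatted without asking.
            if (!dir.exists() || !confirmationNeeded) {
                continue;
            }
            System.err.print("Re-format filesystem in " + dir + " ? (Y or N) ");
            String answer = answers.hasNextLine() ? answers.nextLine().trim() : "";
            if (!answer.equals("Y")) {
                System.err.println("Format aborted in " + dir);
                return true;
            }
        }
        return false;
    }
}
```

Treating only an exact "Y" as consent is a deliberately conservative choice in this sketch, since formatting destroys the existing metadata.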

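(3) FSImage.format

FSImage.format initializes the basic information of the file system, including the layout version and the namespace ID. For each storage directory sd, it calls clearDirectory to delete the existing current directory tree and create an empty one, then calls saveCurrent to write the contents of the current directory: the fsimage file, the edits file and the related files.
Note the algorithm used to generate the namespaceID.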

    /** Create new dfs name directory.  Caution: this destroys all files 
      * in this filesystem. */  
     void format(StorageDirectory sd) throws IOException {  
       sd.clearDirectory(); // delete the existing current directory tree, if any, then create an empty current directory
       sd.lock(); // lock the storage directory; the lock file is in_use.lock
       try {  
         saveCurrent(sd);  
       } finally {  
         sd.unlock();  
       }  
       LOG.info("Storage directory " + sd.getRoot()  
                + " has been successfully formatted.");  
     }  
      
     public void format() throws IOException {  
       this.layoutVersion = FSConstants.LAYOUT_VERSION;  
       this.namespaceID = newNamespaceID();  
       this.cTime = 0L;  
       this.checkpointTime = FSNamesystem.now();  
       for (Iterator<StorageDirectory> it =   
                              dirIterator(); it.hasNext();) {  
         StorageDirectory sd = it.next();  
         format(sd);  
       }  
     }  
      
     /** 
      * Generate new namespaceID. 
      *  
      * namespaceID is a persistent attribute of the namespace. 
      * It is generated when the namenode is formatted and remains the same 
      * during the life cycle of the namenode. 
      * When a datanodes register they receive it as the registrationID, 
      * which is checked every time the datanode is communicating with the  
      * namenode. Datanodes that do not 'know' the namespaceID are rejected. 
      *  
      * @return new namespaceID 
      */  
     private int newNamespaceID() {  
       Random r = new Random();  
       r.setSeed(FSNamesystem.now());  
       int newID = 0;  
       while(newID == 0)  
         newID = r.nextInt(0x7FFFFFFF);  // use 31 bits only  
       return newID;  
     }
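
The ID generation above can be tried in isolation. The sketch below mirrors the loop in newNamespaceID but takes the seed as a parameter so the behaviour is reproducible; NamespaceIdSketch and newNamespaceId(long) are hypothetical names, not part of Hadoop.

```java
import java.util.Random;

/** Hypothetical standalone version of the namespaceID generation loop. */
class NamespaceIdSketch {
    /** Draws a non-zero, positive 31-bit ID from a generator seeded with the given value. */
    static int newNamespaceId(long seed) {
        Random r = new Random(seed);
        int id = 0;
        while (id == 0) {
            // nextInt(bound) returns a value in [0, bound), so 0 must be rejected explicitly.
            id = r.nextInt(0x7FFFFFFF);
        }
        return id;
    }

    public static void main(String[] args) {
        // Seeding with the current time, as the original method does.
        System.out.println(newNamespaceId(System.currentTimeMillis()));
    }
}
```

Because the seed is the current time in milliseconds, the same seed always yields the same ID. Two namespaces formatted in the same millisecond would therefore receive identical IDs, which is worth keeping in mind when reading the datanode registration check described in the Javadoc.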

(4) The saveCurrent(sd) process

    protected void saveCurrent(StorageDirectory sd) throws IOException {  
      File curDir = sd.getCurrentDir();  
      NameNodeDirType dirType = (NameNodeDirType)sd.getStorageDirType();  
      // save new image or new edits  
      if (!curDir.exists() && !curDir.mkdir())  
        throw new IOException("Cannot create directory " + curDir);  
      if (dirType.isOfType(NameNodeDirType.IMAGE)) // this is an fsimage directory
        saveFSImage(getImageFile(sd, NameNodeFile.IMAGE)); // save the fsimage
      if (dirType.isOfType(NameNodeDirType.EDITS)) // this is an edits log directory
        editLog.createEditLogFile(getImageFile(sd, NameNodeFile.EDITS)); // create a new edits file
      sd.write(); // write the VERSION file
      // sd.write() writes the image directory content kept for older versions,
      // the version information in the VERSION file, and the current time in the fstime file
      
    }

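After FSImage has initialized its member fields, it iterates over all metadata storage directories and calls format on each one, passing the storage directory as the argument. The format method is overloaded, so the number and type of the arguments decide which version runs; here it is format(StorageDirectory sd). That method first calls sd.clearDirectory() to delete everything under the current storage directory, that is, under [configured fsimage path (or edits path)]/current. It then calls saveCurrent(sd), which checks the type of the directory: if it stores the fsimage file, saveFSImage is called to save the image; if it stores the edits log, editLog.createEditLogFile is called to create the edits file in that directory.

Finally, sd.write() creates the fstime and VERSION files in the storage directory. VERSION is normally written last when a storage directory is updated, so its presence indicates that the other files in the directory were written successfully and that the directory is valid and needs no recovery. The VERSION file contains layoutVersion, storageType, namespaceID and cTime.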
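
As an illustration of what a VERSION file with those four fields looks like on disk, the sketch below writes and reads one with java.util.Properties. It is an assumption-laden sketch, not Hadoop's Storage code: VersionFileSketch and its methods are hypothetical, and the value NAME_NODE for storageType is assumed here for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical sketch: writing and reading a VERSION-style properties file. */
class VersionFileSketch {
    /** Creates <storageRoot>/current/VERSION with the four fields and returns its path. */
    static Path writeVersion(Path storageRoot, int layoutVersion, int namespaceId, long cTime) {
        try {
            Path current = Files.createDirectories(storageRoot.resolve("current"));
            Properties props = new Properties();
            props.setProperty("layoutVersion", String.valueOf(layoutVersion));
            props.setProperty("storageType", "NAME_NODE"); // assumed value, for illustration
            props.setProperty("namespaceID", String.valueOf(namespaceId));
            props.setProperty("cTime", String.valueOf(cTime));
            Path version = current.resolve("VERSION");
            try (OutputStream out = Files.newOutputStream(version)) {
                props.store(out, null); // key=value lines preceded by a timestamp comment
            }
            return version;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads the key=value pairs back from a VERSION-style file. */
    static Properties readVersion(Path version) {
        try (InputStream in = Files.newInputStream(version)) {
            Properties props = new Properties();
            props.load(in);
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Writing this file after the other files matches the ordering argument above: a reader that finds a complete VERSION file can treat the rest of the directory as fully written.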


2013-05-22 00:09