Hadoop NameNode Memory Usage Estimates

Note: this document is reposted from the Hadoop Jira and kept here as a backup.
I've done some estimates on how much space our data structures take on the name-node per block, file and directory.
Brief overview of the data structures:
Directory tree (FSDirectory) is built of inodes. Each INode points either to an array of blocks
if it corresponds to a file or to a TreeMap<String, INode> of children INodes if it is a directory.
[Note: these estimates were made before Dhruba replaced the children TreeMap with an ArrayList.]
Each block also participates in at least two more data structures.
BlocksMap contains a HashMap<Block, BlockInfo> of all blocks mapping a Block into a BlockInfo.
DatanodeDescriptor contains a TreeMap<Block, Block> of all blocks belonging to this data-node.
A block may also be contained in other data structures, such as

UnderReplicatedBlocks
PendingReplicationBlocks
recentInvalidateSets
excessReplicateMap
Presence of a block in any of these structures is temporary and therefore I do not count them in my estimates.
The estimates can be viewed as lower bounds.

These are the classes we are looking at here:

class INode {
   String name;
   INode parent;
   TreeMap<String, INode> children;
   Block blocks[];
   short blockReplication;
   long modificationTime;
}

class Block {
   long blkid;
   long len;
}

class BlockInfo {
   FSDirectory.INode inode;
   DatanodeDescriptor[] nodes;
   Block block;
}
The calculations assume a 64-bit Java VM, with the following sizes:
Reference size = 8 bytes
Object header size = 16 bytes
Array header size = 24 bytes

Commonly used objects:
TreeMap.Entry = 64 bytes. It has 5 reference fields
HashMap.Entry = 48 bytes. It has 3 reference fields
String header = 64 bytes.
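
The figures above can be reproduced with simple arithmetic. The following is a minimal sketch, not part of the original estimate: it assumes objects are padded to a multiple of 8 bytes, and the field breakdowns beyond the reference counts given above (a boolean in TreeMap.Entry, an int hash in HashMap.Entry, three ints in String plus an empty char[]) are assumptions about the JDK of that period. The class name ObjectSizeSketch is hypothetical.

```java
/** Sketch of the size arithmetic, assuming 8-byte alignment on a 64-bit JVM. */
final class ObjectSizeSketch {
    static final int REFERENCE = 8;      // reference size, from the text
    static final int OBJECT_HEADER = 16; // object header size, from the text
    static final int ARRAY_HEADER = 24;  // array header size, from the text

    /** Rounds a raw byte count up to the next multiple of 8 (assumed alignment). */
    static int align(int bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Shallow size of an object with the given reference count and primitive bytes. */
    static int shallowSize(int references, int primitiveBytes) {
        return align(OBJECT_HEADER + references * REFERENCE + primitiveBytes);
    }

    public static void main(String[] args) {
        // TreeMap.Entry: 5 references plus an assumed 1-byte boolean colour flag.
        System.out.println("TreeMap.Entry = " + shallowSize(5, 1));
        // HashMap.Entry: 3 references plus an assumed 4-byte int hash.
        System.out.println("HashMap.Entry = " + shallowSize(3, 4));
        // String: 1 reference and three ints (assumed), plus an empty char[] header.
        System.out.println("String header = " + (shallowSize(1, 12) + ARRAY_HEADER));
    }
}
```

Under these assumptions the three results agree with the 64, 48, and 64 bytes quoted above.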

The size of a file includes:

Size of an empty file INode: INode.children = null, INode.blocks is a 0-length array, and file name is empty. (152 bytes)
A directory entry of the parent INode that points to this file, which is a TreeMap.Entry. (64 bytes)
file name length times 2, because String represents each name character by 2 bytes.
Reference to the outer FSDirectory class (8 bytes)
The total: 224 + 2 * fileName.length

The size of a directory includes:

Size of an empty directory INode: INode.children is an empty TreeMap, INode.blocks = null, and file name is empty. (192 bytes)
A directory entry of the parent INode that points to this directory, which is a TreeMap.Entry. (64 bytes)
file name length times 2
Reference to the outer FSDirectory class (8 bytes)
The total: 264 + 2 * fileName.length

The size of a block includes:

Size of Block. (32 bytes)
Size of BlockInfo. (64 + 8*replication bytes)
Reference to the block from INode.blocks (8 bytes)
HashMap.Entry referencing the block from BlocksMap. (48 bytes)
References to the block from all DatanodeDescriptors it belongs to.
This is a TreeMap.Entry size times block replication. (64 * replication)
The total: 152 + 72 * replication

Typical object sizes:
Given that a typical file name is 10-15 characters long and our default replication is 3, typical sizes are
File size: 250 bytes
Directory size: 290 bytes
Block size: 368 bytes

Object      Size estimate (bytes)        Typical size (bytes)
File        224 + 2 * fileName.length    250
Directory   264 + 2 * fileName.length    290
Block       152 + 72 * replication       368
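
The three formulas can be written out as small functions that sum the components listed above. This is an illustrative sketch only; the class name PerObjectEstimate is hypothetical, and the name length of 13 used in main is an assumed midpoint of the 10-15 range.

```java
/** Per-object name-node memory estimates built from the component sizes in the text. */
final class PerObjectEstimate {
    /** Empty file INode + parent TreeMap.Entry + outer reference + 2 bytes per name char. */
    static long fileBytes(int nameLength) {
        return 152 + 64 + 8 + 2L * nameLength;
    }

    /** Empty directory INode + parent TreeMap.Entry + outer reference + 2 bytes per name char. */
    static long directoryBytes(int nameLength) {
        return 192 + 64 + 8 + 2L * nameLength;
    }

    /** Block + BlockInfo + INode reference + HashMap.Entry + one TreeMap.Entry per replica. */
    static long blockBytes(int replication) {
        long block = 32;
        long blockInfo = 64 + 8L * replication;
        long inodeReference = 8;
        long blocksMapEntry = 48;
        long datanodeEntries = 64L * replication;
        return block + blockInfo + inodeReference + blocksMapEntry + datanodeEntries;
    }

    public static void main(String[] args) {
        int nameLength = 13;  // assumed midpoint of the 10-15 character range
        int replication = 3;  // default replication from the text
        System.out.println("File:      " + fileBytes(nameLength));
        System.out.println("Directory: " + directoryBytes(nameLength));
        System.out.println("Block:     " + blockBytes(replication));
    }
}
```

With a 13-character name and replication 3, the functions return 250, 290, and 368 bytes, matching the typical sizes in the table.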

One of our clusters has

Files: 10,600,000
Dirs: 310,000
Blocks: 13,300,000
Total Size (estimate): 7.63 GB
Memory used on the name-node (actual reported by jconsole after gc): 9 GB

This means that other data structures like NetworkTopology, heartbeats, datanodeMap, Host2NodesMap,
leases, sortedLeases, and multiple block replication maps occupy ~1.4 GB, which seems pretty high
and needs to be investigated as well.
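
The cluster total and the unexplained remainder follow directly from the object counts and typical sizes. A minimal sketch of that arithmetic is below; the class name ClusterEstimate is hypothetical, and GB is taken to mean 10^9 bytes, which is the reading under which the 7.63 GB figure comes out.

```java
/** Applies the typical per-object sizes to the cluster's object counts. */
final class ClusterEstimate {
    static final long FILE_BYTES = 250;       // typical file size from the text
    static final long DIRECTORY_BYTES = 290;  // typical directory size from the text
    static final long BLOCK_BYTES = 368;      // typical block size from the text

    static long estimatedBytes(long files, long dirs, long blocks) {
        return files * FILE_BYTES + dirs * DIRECTORY_BYTES + blocks * BLOCK_BYTES;
    }

    public static void main(String[] args) {
        long files = 10_600_000L;
        long dirs = 310_000L;
        long blocks = 13_300_000L;
        long reportedHeap = 9_000_000_000L; // 9 GB, assuming decimal gigabytes

        long estimate = estimatedBytes(files, dirs, blocks);
        long unexplained = reportedHeap - estimate;
        double blockShare = 100.0 * blocks * BLOCK_BYTES / estimate;

        System.out.printf("Estimate:    %.2f GB%n", estimate / 1e9);
        System.out.printf("Unexplained: %.2f GB%n", unexplained / 1e9);
        System.out.printf("Block share: %.0f%% of the estimate%n", blockShare);
    }
}
```

The estimate is 7,634,300,000 bytes, leaving 1,365,700,000 bytes unaccounted for; blocks make up roughly two thirds of the estimated total.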

Based on the above estimates, blocks should be the main focus of the name-node memory reduction effort.
A block uses about 50% more space than a file (368 vs. 250 bytes), and there are more blocks than files or directories.

Some ideas on how we can reduce the name-node size without substantially changing the data structures.

INode.children should be an ArrayList instead of a TreeMap. Already done in HADOOP-1565. (-48 bytes)
Factor the INode class out of FSDirectory into a separate class, which removes the reference to the outer class. (-8 bytes)
Create base INode class and derive file inode and directory inode classes from the base.
Directory inodes do not need to contain blocks and replication fields (-16 bytes)
File inodes do not need to contain children field (-8 bytes)
String name should be replaced by a mere byte[]. (-(40 + fileName.length) ~ -50 bytes)
Eliminate the Block object.
We should move the Block fields into BlockInfo and completely get rid of the Block object. (-16 bytes)
The Block object is referenced at least 5 times in our structures for each physical block.
The number of references should be reduced to just 2. (-24 bytes)
Remove the name field from INode. The file or directory name is stored in the corresponding directory
entry and does not need to be duplicated in the INode. (-8 bytes)
Eliminate INode.parent field. INodes are accessed through the directory tree, and the parent can
be remembered in a local variable while browsing the tree. There is no need to persistently store
the parent reference for each object. (-8 bytes)
The data-node-to-block map needs to be optimized. Currently each DatanodeDescriptor holds a TreeMap of
blocks contained in the node, and we have an overhead of one TreeMap.Entry per block replica.
I expect we can reorganize datanodeMap in a way that it stores only 1 or 2 references per replica
instead of an entire TreeMap.Entry. (-48 * replication)
Note: In general, TreeMaps turned out to be very expensive. We should avoid using them if possible,
or implement a custom map structure that avoids allocating an object for each map entry.
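
Several of the ideas above (a standalone INode class, separate file and directory subclasses, a byte[] name, and ArrayList children) can be combined in one layout. The sketch below is purely illustrative and is not the actual Hadoop code: every class and method name is hypothetical, and keeping the children list sorted so that lookups can use binary search is an added assumption, not something stated above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; static nested classes hold no reference to the outer class. */
final class INodeSketch {
    /** Base inode: only the fields shared by files and directories. */
    static abstract class INode {
        final byte[] name;       // UTF-8 bytes instead of a String
        long modificationTime;

        INode(String name) {
            this.name = name.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Block with the two fields from the text. */
    static final class Block {
        long blkid;
        long len;
    }

    /** File inode: blocks and replication, but no children field. */
    static final class INodeFile extends INode {
        Block[] blocks = new Block[0];
        short blockReplication;

        INodeFile(String name, short blockReplication) {
            super(name);
            this.blockReplication = blockReplication;
        }
    }

    /** Directory inode: children in an ArrayList, but no block fields. */
    static final class INodeDirectory extends INode {
        // Assumption: children stay sorted by name so lookup can binary-search.
        final List<INode> children = new ArrayList<>();

        INodeDirectory(String name) {
            super(name);
        }

        void addChild(INode child) {
            int pos = search(child.name);
            if (pos >= 0) throw new IllegalArgumentException("duplicate child name");
            children.add(-pos - 1, child);
        }

        INode getChild(String childName) {
            int pos = search(childName.getBytes(StandardCharsets.UTF_8));
            return pos >= 0 ? children.get(pos) : null;
        }

        /** Binary search; returns the index, or -(insertionPoint) - 1 if absent. */
        private int search(byte[] key) {
            int lo = 0, hi = children.size() - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                int cmp = compareBytes(children.get(mid).name, key);
                if (cmp < 0) lo = mid + 1;
                else if (cmp > 0) hi = mid - 1;
                else return mid;
            }
            return -(lo + 1);
        }
    }

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        INodeDirectory root = new INodeDirectory("");
        root.addChild(new INodeFile("part-00001", (short) 3));
        root.addChild(new INodeDirectory("logs"));
        System.out.println(root.getChild("logs") instanceof INodeDirectory);
        System.out.println(root.getChild("missing") == null);
    }
}
```

In this layout a directory carries no blocks or replication fields and a file carries no children field, which is where the per-inode savings listed above would come from.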

This is what we will have after all the optimizations

Object      Size estimate (bytes)      Typical size (bytes)   Current typical size (bytes)
File        112 + fileName.length      125                    250
Directory   144 + fileName.length      155                    290
Block       112 + 24 * replication     184                    368
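
As a rough projection, the optimized typical sizes can be applied to the same cluster counts quoted earlier (10,600,000 files, 310,000 directories, 13,300,000 blocks). This projection is not part of the original estimate; it is a sketch that assumes the object counts stay the same, and the class name OptimizedEstimate is hypothetical.

```java
/** Compares current and projected totals for the cluster described in the text. */
final class OptimizedEstimate {
    static long totalBytes(long files, long dirs, long blocks,
                           long fileBytes, long dirBytes, long blockBytes) {
        return files * fileBytes + dirs * dirBytes + blocks * blockBytes;
    }

    public static void main(String[] args) {
        long files = 10_600_000L;
        long dirs = 310_000L;
        long blocks = 13_300_000L;

        // Typical sizes from the table: current vs. after all optimizations.
        long current = totalBytes(files, dirs, blocks, 250, 290, 368);
        long optimized = totalBytes(files, dirs, blocks, 125, 155, 184);

        System.out.printf("Current:   %.2f GB%n", current / 1e9);
        System.out.printf("Optimized: %.2f GB%n", optimized / 1e9);
        System.out.printf("Reduction: %.0f%%%n", 100.0 * (current - optimized) / current);
    }
}
```

Under these assumptions the estimated footprint drops from 7,634,300,000 to 3,820,250,000 bytes, about half.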