
Hadoop study notes - Datanode block scanner

 
Every datanode runs a block scanner, which periodically verifies all the blocks stored
on the datanode. This allows bad blocks to be detected and fixed before they are read
by clients. The DataBlockScanner maintains a list of blocks to verify and scans them one
by one for checksum errors. The scanner employs a throttling mechanism to preserve
disk bandwidth on the datanode.
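
To make the mechanism concrete, here is a minimal Java sketch of a throttled scan loop. It is illustrative only, not the actual DataBlockScanner code: it reads each block file in chunks, computes a CRC32 checksum, and sleeps whenever the read rate exceeds a configured limit. The class and method names are hypothetical; the real scanner verifies blocks against their stored .meta checksum files.

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.zip.CRC32;

// Illustrative only: reads each block file in chunks, computes a CRC32,
// and throttles reads to stay under a bytes-per-second limit.
public class ThrottledBlockScanner {
    private static final int CHUNK_SIZE = 64 * 1024;   // 64 KB read chunks
    private final long rateLimitBytesPerSec;           // e.g. 1024 KB/s, as in the report below

    public ThrottledBlockScanner(long rateLimitBytesPerSec) {
        this.rateLimitBytesPerSec = rateLimitBytesPerSec;
    }

    // Scans the blocks one by one; a real scanner would report
    // checksum failures to the namenode instead of just printing them.
    public void scanAll(List<Path> blockFiles) {
        for (Path block : blockFiles) {
            try {
                System.out.printf("%s : ok, crc32=%d%n", block, scanOne(block));
            } catch (IOException e) {
                System.out.printf("%s : scan error: %s%n", block, e.getMessage());
            }
        }
    }

    private long scanOne(Path block) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[CHUNK_SIZE];
        long start = System.currentTimeMillis();
        long bytesRead = 0;
        try (InputStream in = Files.newInputStream(block)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
                bytesRead += n;
                // Throttle: sleep if we are ahead of the allowed rate.
                long elapsed = System.currentTimeMillis() - start;
                long expected = bytesRead * 1000 / rateLimitBytesPerSec;
                if (expected > elapsed) {
                    try {
                        Thread.sleep(expected - elapsed);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IOException("interrupted", ie);
                    }
                }
            }
        }
        return crc.getValue();
    }
}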

Blocks are periodically verified every three weeks to guard against disk errors over time
(this is controlled by the dfs.datanode.scan.period.hours property, which defaults to
504 hours). Corrupt blocks are reported to the namenode to be fixed.
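
For example, to have each block verified weekly instead of every three weeks, you could override the property in hdfs-site.xml. The property name and its 504-hour default come from the text above; the 168-hour value is just an example:

<!-- hdfs-site.xml: verify each block weekly instead of every three weeks -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <value>168</value>
</property>
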
You can get a block verification report for a datanode by visiting the datanode’s web
interface at http://datanode:50075/blockScannerReport. Here’s an example of a report,
which should be self-explanatory:
Total Blocks                 :    194
Verified in last hour        :      0
Verified in last day         :     67
Verified in last week        :     94
Verified in last four weeks  :    187
Verified in SCAN_PERIOD      :    187
Not yet verified             :      7
Verified since restart       :     70
Scans since restart          :      1
Scan errors since restart    :      0
Transient scan errors        :      0
Current scan rate limit KBps :   1024
Progress this period         :      0%
Time left in cur period      :  99.47%
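
Because the report is plain text, it is easy to pull into a monitoring script. Here is a small Java 11+ sketch; the datanode hostname is a placeholder, and 50075 is the default datanode web port used above:

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Fetches the block scanner report from a datanode's web interface.
public class FetchBlockScannerReport {
    public static void main(String[] args) throws IOException, InterruptedException {
        String host = args.length > 0 ? args[0] : "datanode";   // placeholder hostname
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://" + host + ":50075/blockScannerReport")).build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}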

By specifying the listblocks parameter, as in http://datanode:50075/blockScannerReport?listblocks, the report is preceded by a list of all the blocks on the datanode along with their latest verification status. Here is a snippet of the block list, followed by a small parsing sketch:
blk_2880642477235589345_1712 : status : ok     type : local  scan time : 1328684200718   2012-02-08 01:56:40,718
blk_6862384560101574248_3203 : status : ok     type : none   scan time : 0               not yet verified
blk_-6204923618707049613_3146 : status : ok     type : none   scan time : 0               not yet verified
blk_8096385507793977436_1470 : status : ok     type : local  scan time : 1328726026095   2012-02-08 13:33:46,095
blk_-8383560827245098225_1470 : status : ok     type : local  scan time : 1328736508026   2012-02-08 16:28:28,026
blk_-1634356630613489001_3191 : status : ok     type : none   scan time : 0               not yet verified
blk_-6752468218406655007_3201 : status : ok     type : none   scan time : 0               not yet verified
blk_-1692843323764239407_1772 : status : ok     type : local  scan time : 1328742671906   2012-02-08 18:11:11,906
blk_3849616369028352463_3200 : status : ok     type : none   scan time : 0               not yet verified
blk_-34525423848470829_1226 : status : ok     type : local  scan time : 1328747018814   2012-02-08 19:23:38,814
blk_6305423182925037634_1226 : status : ok     type : local  scan time : 1328753126642   2012-02-08 21:05:26,642
blk_543251317099843969_3202 : status : ok     type : none   scan time : 0               not yet verified
blk_6417698981647840069_1833 : status : ok     type : local  scan time : 1328779874402   2012-02-09 04:31:14,402
blk_-7222269942471718886_3199 : status : ok     type : none   scan time : 0               not yet verified
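
Each line of the listing follows a fixed pattern (block ID, status, type, scan time in epoch milliseconds), so it can be parsed with a regular expression. The following Java sketch is best-effort, based only on the format shown above, not on any stable API:

import java.time.Instant;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Parses lines like:
// blk_2880642477235589345_1712 : status : ok  type : local  scan time : 1328684200718 ...
public class BlockLineParser {
    private static final Pattern LINE = Pattern.compile(
            "(blk_-?\\d+_\\d+) : status : (\\w+)\\s+type : (\\w+)\\s+scan time : (\\d+)");

    record BlockScanStatus(String blockId, String status, String type, long scanTimeMs) {
        // A scan time of 0 means the block has not been verified yet.
        Optional<Instant> lastScanned() {
            return scanTimeMs == 0 ? Optional.empty()
                                   : Optional.of(Instant.ofEpochMilli(scanTimeMs));
        }
    }

    static Optional<BlockScanStatus> parse(String line) {
        Matcher m = LINE.matcher(line);
        return m.find()
                ? Optional.of(new BlockScanStatus(m.group(1), m.group(2),
                        m.group(3), Long.parseLong(m.group(4))))
                : Optional.empty();
    }

    public static void main(String[] args) {
        parse("blk_2880642477235589345_1712 : status : ok     type : local  "
                + "scan time : 1328684200718   2012-02-08 01:56:40,718")
                .ifPresent(System.out::println);
    }
}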

See DataBlockScanner.java in the Hadoop source for the implementation details.