
HBase GC-related error: wal.FSHLog "Error while AsyncSyncer sync, request close of hlog" followed by YouAreDeadException


A very common error log:

 

2015-03-05 03:10:35,461 FATAL [regionserver60020-WAL.AsyncSyncer0] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
org.apache.hadoop.ipc.RemoteException(java.io.IOException): BP-1540478979-192.168.5.117-1409220943611:blk_1098635649_24898817 does not exist or is not under Constructionblk_1098635649_24900382
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5956)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6023)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:645)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:874)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$
2015-03-05 03:10:35,460 FATAL [regionserver60020-WAL.AsyncSyncer2] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
org.apache.hadoop.ipc.RemoteException(java.io.IOException): BP-1540478979-192.168.5.117-1409220943611:blk_1098635649_24898817 does not exist or is not under Constructionblk_1098635649_24900382
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5956)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6023)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:645)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:874)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server
2015-03-05 03:10:35,459 FATAL [regionserver60020-WAL.AsyncSyncer3] wal.FSHLog: Error while AsyncSyncer sync, request close of hlog
org.apache.hadoop.ipc.RemoteException(java.io.IOException): BP-1540478979-192.168.5.117-1409220943611:blk_1098635649_24898817 does not exist or is not under Constructionblk_1098635649_24900382
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:5956)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6023)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:645)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:874)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1986)
2015-03-05 03:10:35,463 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server hostxxx,60020,1424960304895: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hostxxx,60020,1424960304895 as dead server
        at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:341)
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1357)
        at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
        at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:73)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hostxxx,60020,1424960304895 as dead server
        at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:341)
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1357)
        at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.YouAreDeadException): org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing hostxxx,60020,1424960304895 as dead server
        at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:341)
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:254)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:1357)
        at org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:5087)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)

 

        

        

        

Looking a bit further back in the log, you will find GC warnings:

2015-03-05 03:08:25,092 DEBUG [LruStats #0] hfile.LruBlockCache: Total=12.67 GB, free=125.43 MB, max=12.79 GB, blocks=203552, accesses=65532282, hits=33745890, hitRatio=51.50%, , cachingAccesses=34739624, cachingHits=28976512, cachingHitsRatio=83.41%, evictions=536, evicted=5540234, evictedPerRun=10336.2578125
2015-03-05 03:10:35,390 WARN  [regionserver60020.periodicFlusher] util.Sleeper: We slept 124408ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2015-03-05 03:10:35,390 INFO  [regionserver60020-SendThread(host141:42181)] zookeeper.ClientCnxn: Client session timed out, have not heard from server in 127580ms for sessionid 0x34add7868858eeb, closing socket connection and attempting reconnect
2015-03-05 03:10:35,390 WARN  [regionserver60020.compactionChecker] util.Sleeper: We slept 124408ms instead of 10000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
2015-03-05 03:10:35,390 WARN  [RpcServer.handler=14,port=60020] ipc.RpcServer: RpcServer.respondercallId: 3046 service: ClientService methodName: Scan size: 23 connection: 192.168.5.186:3366: output error
2015-03-05 03:10:35,390 WARN  [regionserver60020] util.Sleeper: We slept 121547ms instead of 3000ms, this is likely due to a long garbage collecting pause and it's usually bad, see http://hbase.apache.org/book.html#trouble.rs.runtime.zkexpired
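The Sleeper warnings above are the clearest fingerprint of a stop-the-world pause: the thread overslept by roughly the length of the pause. A minimal sketch for pulling pause estimates out of a RegionServer log (the helper `pause_estimates` and the sample data are my own illustration, not an HBase tool):

```python
import re

# Matches HBase util.Sleeper warnings of the form:
#   "We slept 124408ms instead of 10000ms, this is likely due to ..."
SLEPT_RE = re.compile(r"We slept (\d+)ms instead of (\d+)ms")

def pause_estimates(lines):
    """Yield (actual_ms, expected_ms, overshoot_ms) per Sleeper warning."""
    for line in lines:
        m = SLEPT_RE.search(line)
        if m:
            actual, expected = int(m.group(1)), int(m.group(2))
            yield actual, expected, actual - expected

# Sample line taken from the log above.
sample = [
    "2015-03-05 03:10:35,390 WARN  [regionserver60020.periodicFlusher] "
    "util.Sleeper: We slept 124408ms instead of 10000ms, this is likely "
    "due to a long garbage collecting pause and it's usually bad",
]
for actual, expected, over in pause_estimates(sample):
    print(f"pause overshoot ~{over}ms (slept {actual}ms, expected {expected}ms)")
```

Here the overshoot is about 114 seconds, far longer than any plausible ZooKeeper session timeout, which is why the session expired.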

 

 

Root cause summary:

The GC pause was too long (either a GC problem itself, or CPU contention that starved the GC threads).

As a result, ZooKeeper considered the RegionServer dead.

ZooKeeper reported the dead RegionServer to the Master, and the Master reassigned its regions to other RegionServers.

Those RegionServers replayed the WAL to recover the regions, deleting each WAL file once it was processed.

When the "dead" RegionServer finished its GC and resumed, its WAL files were gone, producing the errors above.

Finally, the RegionServer learned from ZooKeeper that it had been declared dead, and shut itself down.
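Since the chain starts with the ZooKeeper session expiring before the pause ends, a common complementary mitigation (not used in this post) is to make the session timeout more tolerant than the worst expected pause. A config sketch, assuming hbase-site.xml; the value is illustrative and must fall within the ZooKeeper server's minSessionTimeout/maxSessionTimeout range:

```xml
<!-- hbase-site.xml: allow longer pauses before the RS is declared dead -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>120000</value>
</property>
```

This only trades faster failure detection for pause tolerance; the GC tuning below attacks the pause itself.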

 

 

 

My fix:

Set -XX:SurvivorRatio=2 to enlarge the survivor spaces, which slows promotion into the old generation and reduces how often CMS is triggered.

Set -XX:CMSInitiatingOccupancyFraction=60 so CMS starts collecting the old generation earlier, shortening each CMS cycle.

Together these keep the old generation from filling up mid-collection and triggering a full GC (avoiding "promotion failed" and "concurrent mode failure").
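The effect of -XX:SurvivorRatio can be sanity-checked with HotSpot's sizing rule: eden is SurvivorRatio times the size of each survivor space within the young generation. A worked sketch (the helper `young_gen_split` is my own; 2048 MB matches the -Xmn2g used below):

```python
def young_gen_split(young_mb, survivor_ratio):
    """Split a young generation into eden and two survivor spaces,
    using HotSpot's rule eden / survivor == SurvivorRatio."""
    survivor = young_mb / (survivor_ratio + 2)  # eden + two survivors
    eden = survivor * survivor_ratio
    return eden, survivor

# With -Xmn2g (2048 MB): lowering SurvivorRatio enlarges each survivor space,
# giving objects more room to die young instead of being promoted.
for ratio in (8, 2, 1):
    eden, surv = young_gen_split(2048, ratio)
    print(f"SurvivorRatio={ratio}: eden={eden:.0f}MB, each survivor={surv:.0f}MB")
```

At the default SurvivorRatio=8 each survivor is only about 205 MB; at 2 it grows to 512 MB, and at 1 (the value in the final flags) eden and each survivor are all about 683 MB.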

 

The final HBase RegionServer JVM options:

        export HBASE_REGIONSERVER_OPTS="-Xmx33g -Xms33g -Xmn2g -XX:SurvivorRatio=1 -XX:PermSize=128M -XX:MaxPermSize=128M -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv6Addresses=false -XX:MaxTenuringThreshold=15 -XX:+CMSParallelRemarkEnabled -XX:+UseFastAccessorMethods -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=60 -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCMSCompactAtFullCollection -XX:CMSFullGCsBeforeCompaction=0 -XX:+HeapDumpOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -Xloggc:$HBASE_HOME/hbase-0.98.1-cdh5.1.0/logs/gc-hbase.log"
        

 

 

 
