Analysis: an HBase Region Showing Up on Two RegionServers

punishzhou

Blog category: HBase server side

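HBase operations such as split, compact and move generally work at region granularity, so every region must be unique within the cluster. If a region shows up on two RegionServers at the same time, all sorts of bugs follow.

The analysis below looks at the circumstances under which a region can end up assigned to two RSs.

Take Region A on RS1 and move it to RS2 by calling HBaseAdmin's move operation:

1. Create a RegionPlan that targets RS2.

2. Unassign Region A, that is, close Region A on RS1. This mainly closes the read streams of every storefile in the region; the memstore contents are flushed to disk once during the close.

3. Open Region A on RS2. This mainly initializes Region A and writes its new RS address into the META table.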
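The three steps can be sketched as a toy model. This is only an illustration under stated assumptions: `MoveSketch`, `RegionPlan`, `open`, `close` and the two maps are hypothetical names invented here, not HBase classes or APIs. The point it shows is the window between step 2 and step 3, during which the region is open nowhere while the toy META still points at the old server.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of a region move; none of these types are HBase classes.
class MoveSketch {
    // Illustrative stand-in for a plan: which region goes from where to where.
    static final class RegionPlan {
        final String region, source, destination;
        RegionPlan(String region, String source, String destination) {
            this.region = region;
            this.source = source;
            this.destination = destination;
        }
    }

    // Regions each server currently has open.
    final Map<String, Set<String>> online = new HashMap<>();
    // Toy stand-in for the META table: region -> server recorded as hosting it.
    final Map<String, String> meta = new HashMap<>();

    void addServer(String server) {
        online.put(server, new HashSet<>());
    }

    void open(String region, String server) {
        online.get(server).add(region);
        meta.put(region, server); // META is rewritten only when the open completes
    }

    void close(String region, String server) {
        // A real close would also flush the memstore and close storefile readers.
        online.get(server).remove(region);
    }

    void move(RegionPlan plan) {                // step 1: the plan was created by the caller
        close(plan.region, plan.source);        // step 2: unassign from the source
        // Window: the region is open nowhere, META still names plan.source.
        open(plan.region, plan.destination);    // step 3: open on the destination
    }
}
```

Calling `close` and `open` separately instead of `move` makes the intermediate state observable, which is the state the rest of the analysis depends on.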

Suppose RS1 goes down in the middle of the move. The master then invokes ServerShutdownHandler to handle the event. The main processing steps are as follows:

public void process() throws IOException { 
    final String serverName = this.hsi.getServerName();

Split the logs: the RegionServer's HLog is split by region, and each region's share is written under that region's directory.

    LOG.info("Splitting logs for " + serverName);
    this.services.getMasterFileSystem().splitLog(serverName);
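To make the idea of splitting a log by region concrete, here is a minimal sketch. It assumes a log is just an ordered list of (region, edit) pairs; `LogSplitSketch`, `Entry` and `splitByRegion` are hypothetical names and do not reflect how HBase's log splitting is implemented or stored on disk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of splitting one server-wide log by region.
class LogSplitSketch {
    // Toy log entry: the region it belongs to plus an opaque edit.
    static final class Entry {
        final String region;
        final String edit;
        Entry(String region, String edit) {
            this.region = region;
            this.edit = edit;
        }
    }

    // Groups entries by region, keeping the original order of edits within each region.
    static Map<String, List<String>> splitByRegion(List<Entry> log) {
        Map<String, List<String>> perRegion = new LinkedHashMap<>();
        for (Entry e : log) {
            List<String> edits = perRegion.get(e.region);
            if (edits == null) {
                edits = new ArrayList<>();
                perRegion.put(e.region, edits);
            }
            edits.add(e.edit);
        }
        return perRegion;
    }
}
```

Each per-region list corresponds to what would be written under that region's directory, so whichever server opens the region next can replay only its own edits.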

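Clear the RS and its regions from the master's in-memory state, and return those of the RS's online regions that are in the master's RIT (regions in transition) queue.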

    // Clean out anything in regions in transition.  Being conservative and
    // doing after log splitting.  Could do some states before -- OPENING?
    // OFFLINE? -- and then others after like CLOSING that depend on log
    // splitting.
    List<RegionState> regionsInTransition =
      this.services.getAssignmentManager().processServerShutdown(this.hsi);

If the RS was carrying ROOT or META, those must be assigned first.

    // Assign root and meta if we were carrying them.
    if (isCarryingRoot()) { // -ROOT-
      try {
        this.services.getAssignmentManager().assignRoot();
      } catch (KeeperException e) {
        this.server.abort("In server shutdown processing, assigning root", e);
        throw new IOException("Aborting", e);
      }
    }

    // Carrying meta?
    if (isCarryingMeta()) this.services.getAssignmentManager().assignMeta();

Read from the META table the regions that were on RS1.

    // Wait on meta to come online; we need it to progress.
    // TODO: Best way to hold strictly here?  We should build this retry logic
    //       into the MetaReader operations themselves.
    NavigableMap<HRegionInfo, Result> hris = null;
    while (!this.server.isStopped()) {
      try {
        this.server.getCatalogTracker().waitForMeta();
        hris = MetaReader.getServerUserRegions(this.server.getCatalogTracker(),
            this.hsi);
        break;
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted", e);
      } catch (IOException ioe) {
        LOG.info("Received exception accessing META during server shutdown of " +
            serverName + ", retrying META read");
      }
    }

Drop from that list every region that is in RIT, except those whose state is CLOSING or PENDING_CLOSE. What remains is the set of regions to reassign.

    // Skip regions that were in transition unless CLOSING or PENDING_CLOSE
    for (RegionState rit : regionsInTransition) {
      if (!rit.isClosing() && !rit.isPendingClose()) {
        LOG.debug("Removed " + rit.getRegion().getRegionNameAsString() +
          " from list of regions to assign because in RIT");
        hris.remove(rit.getRegion());
      }
    }
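The filtering rule in the loop above can be modeled in isolation. The sketch below is an assumption-laden simplification: regions are plain strings, `State` is a toy enum rather than HBase's `RegionState`, and `regionsToReassign` is a hypothetical helper. It only shows which regions survive the filter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of "skip regions in transition unless CLOSING or PENDING_CLOSE".
class RitFilterSketch {
    // Toy state enum for illustration only.
    enum State { OFFLINE, PENDING_OPEN, OPENING, PENDING_CLOSE, CLOSING }

    // fromMeta: regions META lists under the dead server.
    // ritOfDeadServer: RIT entries reported for the dead server, region -> state.
    static Set<String> regionsToReassign(Set<String> fromMeta,
                                         Map<String, State> ritOfDeadServer) {
        Set<String> result = new TreeSet<>(fromMeta);
        for (Map.Entry<String, State> rit : ritOfDeadServer.entrySet()) {
            State s = rit.getValue();
            // Anything in transition is left alone, except regions that were being closed.
            if (s != State.CLOSING && s != State.PENDING_CLOSE) {
                result.remove(rit.getKey());
            }
        }
        return result;
    }
}
```

Note that a region is only ever removed if it appears in the RIT list passed in. A region that META lists under the dead server but that is missing from that list always stays in the result.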

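As the handler code shows, if the moved region has not yet come online, its state in the master's RIT queue is OFFLINE and the master already regards the region as offline. ServerShutdownHandler nevertheless treats it as a region that must be reassigned, because META is only updated once the open on RS2 completes and so still lists the region under RS1. If the region comes online on RS2 at this moment, the master still goes ahead and assigns it, and the region is assigned twice: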

    LOG.info("Reassigning " + hris.size() + " region(s) that " + serverName +
      " was carrying (skipping " + regionsInTransition.size() +
      " regions(s) that are already in transition)");

    // Iterate regions that were on this server and assign them
    for (Map.Entry<HRegionInfo, Result> e : hris.entrySet()) {
      if (processDeadRegion(e.getKey(), e.getValue(),
          this.services.getAssignmentManager(),
          this.server.getCatalogTracker())) {
        this.services.getAssignmentManager().assign(e.getKey(), true);
      }
    }
    this.deadServers.finish(serverName);
    LOG.info("Finished processing of shutdown of " + serverName);
  }

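In short, after an RS shuts down the master has to reassign that RS's regions to other RSs, but the selection of regions to assign is flawed: regions that the master already considers offline yet still holds in the RIT queue get reassigned. Such regions may already be in the middle of being assigned elsewhere without having come online yet. If one of them comes online just as the master starts assigning it, the master carries on regardless, and the region ends up assigned twice.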
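The whole race can be replayed in a small simulation. This is a sketch under assumptions, not HBase code: `DoubleAssignSketch` and all its fields and methods are hypothetical, the master's state is reduced to three maps, and "RS3" is an arbitrary server picked for the reassignment. The shutdown handling follows the description in this post: collect the dead server's regions from the master's memory, read the dead server's regions from the toy META, and drop only those RIT regions that are neither CLOSING nor PENDING_CLOSE.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical replay of the double assignment; not HBase code.
class DoubleAssignSketch {
    final Map<String, String> masterOnline = new HashMap<>(); // master's in-memory view
    final Map<String, String> rit = new HashMap<>();          // region -> RIT state
    final Map<String, String> meta = new HashMap<>();         // toy META: region -> server
    final Map<String, Set<String>> hosts = new HashMap<>();   // servers that really have it open

    void open(String region, String server) {
        Set<String> s = hosts.get(region);
        if (s == null) {
            s = new TreeSet<>();
            hosts.put(region, s);
        }
        s.add(server);
        masterOnline.put(region, server);
        meta.put(region, server); // META changes only when an open completes
        rit.remove(region);
    }

    // Step 2 of the move: the region is closed and the master marks it OFFLINE in RIT.
    void closeForMove(String region, String from) {
        hosts.get(region).remove(from);
        masterOnline.remove(region);
        rit.put(region, "OFFLINE");
    }

    Set<String> regionsToReassign(String dead) {
        for (Set<String> s : hosts.values()) {
            s.remove(dead);
        }
        // Regions the master still believes are online on the dead server.
        Set<String> deadRegions = new TreeSet<>();
        for (Map.Entry<String, String> e : masterOnline.entrySet()) {
            if (e.getValue().equals(dead)) deadRegions.add(e.getKey());
        }
        masterOnline.keySet().removeAll(deadRegions);
        // Regions META lists under the dead server.
        Set<String> toAssign = new TreeSet<>();
        for (Map.Entry<String, String> e : meta.entrySet()) {
            if (e.getValue().equals(dead)) toAssign.add(e.getKey());
        }
        // Skip RIT regions of the dead server unless CLOSING or PENDING_CLOSE.
        for (String region : deadRegions) {
            String state = rit.get(region);
            if (state != null && !state.equals("CLOSING") && !state.equals("PENDING_CLOSE")) {
                toAssign.remove(region);
            }
        }
        return toAssign;
    }

    // Replays the scenario and returns the servers that end up hosting region A.
    static Set<String> demo() {
        DoubleAssignSketch m = new DoubleAssignSketch();
        m.open("A", "RS1");
        m.closeForMove("A", "RS1");                         // move in progress
        Set<String> toAssign = m.regionsToReassign("RS1");  // RS1 dies; A is selected
        m.open("A", "RS2");                                 // the original move completes
        for (String region : toAssign) {
            m.open(region, "RS3");                          // master reassigns anyway
        }
        return m.hosts.get("A");
    }
}
```

In this model `DoubleAssignSketch.demo()` returns a set containing both RS2 and RS3. Region A slips through because it is no longer in the master's online map when RS1 dies, so it is never matched against the RIT queue, while the toy META still attributes it to RS1.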


2011-11-17 21:23