
Hama: Architecture


http://wiki.apache.org/hama/Architecture

 

Components

 

Apache Hama, based on the Bulk Synchronous Parallel (BSP) model [1], comprises three major components:

  • BSPMaster
  • GroomServer
  • ZooKeeper

The architecture is very similar to Hadoop's; the main differences are the communication and synchronization mechanisms.

In a typical use case, the user submits a "job", which is a definition of how to run a computation. Once submitted, a job has multiple tasks that are launched across the cluster.

 

BSPMaster

 

BSPMaster is responsible for the following:

  • Maintaining its own state.

  • Maintaining groom server status.
  • Maintaining supersteps and other counters in a cluster.
  • Maintaining jobs and tasks.

  • Scheduling jobs and assigning tasks to groom servers.
  • Distributing execution classes and configuration across groom servers.
  • Providing users with the cluster control interface (web and console based).

A BSPMaster and multiple grooms are started by the script. The BSPMaster then starts an RPC server with which groom servers can dynamically register themselves. Each groom server starts with a BSPPeer instance (BSPPeer is to be integrated with GroomServer later) and an RPC proxy for contacting the BSPMaster. Once started, each groom periodically sends a heartbeat message containing its groom server status, including maximum task capacity, unused memory, and so on.

Each time the BSPMaster receives a heartbeat message, it updates its record of that groom server's status, which it uses to assign tasks effectively to idle groom servers. It then returns a heartbeat response containing the assigned tasks and any other actions the groom server has to perform. For now, there is a FIFO job scheduler and very simple task assignment algorithms.
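
The heartbeat exchange and FIFO assignment described above can be sketched in a few lines of Python. This is a toy model, not Hama's actual classes or RPC protocol: the names ToyMaster, GroomStatus, submit_job and heartbeat, and the rule "assign up to the number of free slots", are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class GroomStatus:
    """Status a groom reports in each heartbeat (fields are illustrative)."""
    name: str
    max_tasks: int
    running_tasks: int = 0

    @property
    def free_slots(self) -> int:
        return self.max_tasks - self.running_tasks


@dataclass
class ToyMaster:
    """Hypothetical stand-in for BSPMaster: groom table plus a FIFO task queue."""
    grooms: dict = field(default_factory=dict)
    pending: deque = field(default_factory=deque)

    def submit_job(self, job_id: str, num_tasks: int) -> None:
        # Tasks are queued in job submission order, which gives FIFO scheduling.
        for i in range(num_tasks):
            self.pending.append(f"{job_id}-task{i}")

    def heartbeat(self, status: GroomStatus) -> list:
        # Record the latest status; an unknown groom is registered on first contact.
        self.grooms[status.name] = status
        # Assign pending tasks up to the groom's free capacity (assumed policy).
        assigned = []
        while self.pending and len(assigned) < status.free_slots:
            assigned.append(self.pending.popleft())
        return assigned  # plays the role of the heartbeat response


master = ToyMaster()
master.submit_job("job1", 3)
master.submit_job("job2", 2)

first = master.heartbeat(GroomStatus("groom-a", max_tasks=2))
second = master.heartbeat(GroomStatus("groom-b", max_tasks=4, running_tasks=1))
print(first)   # ['job1-task0', 'job1-task1']
print(second)  # ['job1-task2', 'job2-task0', 'job2-task1']
```

Because job1 was submitted first, all of its tasks are handed out before any task of job2, regardless of which groom asks.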

 

GroomServer

 

A groom server (groom for short) is a process that manages the life cycle of the BSP tasks assigned to it by the BSPMaster. Each groom contacts the BSPMaster and reports task statuses by piggybacking them on its periodic messages to the BSPMaster. Each groom is designed to run with HDFS or another distributed storage system. Ideally, a groom server and a data node run on the same physical node to get the best performance from data locality. Note that in a massively parallel environment, the benefit of data locality is lost when a large number of virtual processes must be multiplexed onto physical processes [2].

 

ZooKeeper

 

ZooKeeper is used to manage efficient barrier synchronization of the BSPPeers. Later, it will also be used for fault tolerance.
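
To illustrate what barrier synchronization guarantees, here is a minimal single-process sketch in Python. It uses threading.Barrier as a stand-in for the ZooKeeper-based barrier, so it shows the semantics only, not how Hama or ZooKeeper implement it. The peer function and the constants are hypothetical.

```python
import threading

NUM_PEERS = 3
SUPERSTEPS = 2

# Stand-in for the distributed barrier; all peers must arrive before any continues.
barrier = threading.Barrier(NUM_PEERS)
log = []
log_lock = threading.Lock()


def peer(peer_id: int) -> None:
    for step in range(SUPERSTEPS):
        # Local computation phase of the superstep (here: just record it).
        with log_lock:
            log.append((step, peer_id))
        # Block until every peer has finished this superstep.
        barrier.wait()


threads = [threading.Thread(target=peer, args=(i,)) for i in range(NUM_PEERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

steps_in_order = [step for step, _ in log]
print(steps_in_order)  # [0, 0, 0, 1, 1, 1]
```

The order of peers within a superstep may vary between runs, but no peer records superstep 1 until all peers have recorded superstep 0.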

 

Communication and Synchronization Process

 

Each BSP task has an Outgoing Message Manager and an Incoming Queue.

The Outgoing Message Manager collects the messages to be sent, serializes them, compresses them, and puts them into bundles. During the barrier synchronization phase, the BSP tasks exchange the bundles; each receiver decompresses and deserializes them and puts the messages into its Incoming Queue.
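
The bundle pipeline can be sketched as follows. This is a toy Python model under stated assumptions: JSON stands in for the serialization format and zlib for the compression codec, and the class and function names (OutgoingMessageManager.send, flush, receive) are hypothetical rather than Hama's API.

```python
import json
import zlib
from collections import defaultdict, deque


class OutgoingMessageManager:
    """Toy outgoing side: buffers messages per destination task."""

    def __init__(self) -> None:
        self._buffers = defaultdict(list)

    def send(self, dest: str, message: dict) -> None:
        self._buffers[dest].append(message)

    def flush(self) -> dict:
        # Serialize each destination's messages, then compress: one bundle per task.
        bundles = {
            dest: zlib.compress(json.dumps(msgs).encode("utf-8"))
            for dest, msgs in self._buffers.items()
        }
        self._buffers.clear()
        return bundles


def receive(bundle: bytes, incoming: deque) -> None:
    # Reverse of the sender's steps: decompress first, then deserialize.
    for message in json.loads(zlib.decompress(bundle).decode("utf-8")):
        incoming.append(message)


out = OutgoingMessageManager()
out.send("task-1", {"vertex": 7, "value": 0.25})
out.send("task-1", {"vertex": 9, "value": 0.5})
out.send("task-2", {"vertex": 3, "value": 1.0})

incoming_task1 = deque()
for dest, bundle in out.flush().items():  # the exchange happens at the barrier
    if dest == "task-1":
        receive(bundle, incoming_task1)
print(list(incoming_task1))
# [{'vertex': 7, 'value': 0.25}, {'vertex': 9, 'value': 0.5}]
```

Bundling per destination means each pair of tasks exchanges a single compressed payload per superstep instead of one transfer per message.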

 

System Diagram

 

[Figure: Hama cluster startup procedure]

  1. BSPMaster starts up.
  2. GroomServer starts up.
  3. ZooKeeper cluster starts up.
  4. GroomServer dynamically registers itself with BSPMaster.
  5. GroomServer forks and manages BSPPeer(s).
  6. BSPPeers communicate and perform barrier synchronization through the ZooKeeper cluster.

 

References

 

[1] Leslie G. Valiant. A bridging model for parallel computation.

[2] David B. Skillicorn, Jonathan M. D. Hill, and W. F. McColl. Questions and Answers about BSP. Scientific Programming, 6(3):249-274, Fall 1997.
