gaojingsong

【Introduction to Apache Accumulo】


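Apache Accumulo is a reliable, scalable, high-performance, sorted and distributed key-value store that offers cell-level access control and customizable server-side processing. It follows the design of Google's BigTable and is built on Apache Hadoop, ZooKeeper, and Thrift.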



 

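LevelDB is a highly efficient key-value database developed by Google. It supports data sets on the order of a billion entries and still performs very well at that scale, largely thanks to its sound design and in particular its LSM (log-structured merge) algorithm. LevelDB is already supported as a storage engine by Riak and Kyoto Tycoon. In China, Taobao's open-source key-value store Tair has also adopted LevelDB as its persistent storage engine and runs it in production.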

 

Apache Accumulo is based on the design of Google's BigTable and is powered by Apache Hadoop, Apache Zookeeper, and Apache Thrift.

 

 

Accumulo has several novel features such as cell-based access control and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.
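
As a rough illustration of cell-based access control, the sketch below attaches a visibility expression to each cell and returns a cell only when the reader's authorizations satisfy it. The simplified expression grammar and the names `visible` and `scan` are assumptions made for this example; they are not Accumulo's implementation or API.

```python
import re

def visible(expression, auths):
    """Return True if the authorization set satisfies the visibility expression.

    Simplified grammar (an assumption for this sketch): labels are words,
    combined with & (and), | (or) and parentheses, evaluated left to right.
    An empty expression means the cell is visible to everyone.
    """
    if not expression:
        return True
    tokens = re.findall(r"[A-Za-z0-9_]+|[&|()]", expression)
    pos = 0

    def parse_term():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_expr()
            pos += 1  # skip the closing ")"
            return value
        return tok in auths

    def parse_expr():
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            op = tokens[pos]
            pos += 1
            rhs = parse_term()
            value = (value and rhs) if op == "&" else (value or rhs)
        return value

    return parse_expr()

def scan(cells, auths):
    """Yield (key, value) pairs for the cells this reader is allowed to see."""
    for key, visibility, value in cells:
        if visible(visibility, auths):
            yield key, value

# Hypothetical cells: (key, visibility expression, value)
cells = [
    ("row1:name", "", "alice"),
    ("row1:salary", "hr&manager", "100"),
    ("row1:phone", "hr|(it&admin)", "555"),
]
print(list(scan(cells, {"hr"})))
```

A reader holding only the `hr` authorization sees the name and phone cells but not the salary cell, which also requires `manager`.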

 

 

Accumulo is a distributed data storage and retrieval system and as such consists of several architectural components, some of which run on many individual servers. Much of the work Accumulo does involves maintaining certain properties of the data, such as organization, availability, and integrity, across many commodity-class machines.

 



 

2. Accumulo Components

An instance of Accumulo includes many TabletServers, one Garbage Collector process, one Master server and many Clients.

2.3.1 Tablet Server

The TabletServer manages some subset of all the tablets (partitions of tables). This includes receiving writes from clients, persisting writes to a write-ahead log, sorting new key-value pairs in memory, periodically flushing sorted key-value pairs to new files in HDFS, and responding to reads from clients, forming a merge-sorted view of all keys and values from all the files it has created and the sorted in-memory store.
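
The write and read path described above can be sketched with a toy model. This is a minimal illustration under stated assumptions, not Accumulo code: the class `ToyTablet`, its `flush_threshold` parameter, and the use of in-memory lists in place of the write-ahead log and HDFS files are all hypothetical.

```python
import heapq

class ToyTablet:
    """Toy model of one tablet: write-ahead log, in-memory store, sorted files."""

    def __init__(self, flush_threshold=3):
        self.wal = []        # stands in for the write-ahead log
        self.memtable = {}   # in-memory store, sorted when read or flushed
        self.files = []      # each "file" is an immutable sorted list (HDFS stand-in)
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.wal.append((key, value))   # log the write first
        self.memtable[key] = value      # then apply it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the sorted in-memory pairs out as a new file."""
        if self.memtable:
            self.files.append(sorted(self.memtable.items()))
            self.memtable = {}
            self.wal = []   # flushed data no longer needs the log

    def scan(self):
        """Merge-sorted view over all files and memory; newest value per key wins."""
        sources = self.files + [sorted(self.memtable.items())]
        # Tag entries with the negated source age so newer sources come first per key.
        tagged = [
            [(key, -age, value) for key, value in source]
            for age, source in enumerate(sources)
        ]
        last_key = object()
        for key, _, value in heapq.merge(*tagged):
            if key != last_key:
                yield key, value
                last_key = key

tablet = ToyTablet(flush_threshold=2)
for k, v in [("b", "1"), ("a", "2"), ("c", "3"), ("a", "4"), ("d", "5")]:
    tablet.write(k, v)
print(list(tablet.scan()))
```

With a threshold of two, the five writes produce two flushed files and one entry still in memory. The scan merges all three sources and returns the later value for key `a`.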

TabletServers also perform recovery of a tablet that was previously on a server that failed, reapplying any writes found in the write-ahead log to the tablet.
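
Recovery by log replay can be illustrated in a few lines. The function `recover_tablet` and its data layout are assumptions made for this sketch; the point is only that logged writes are reapplied in their original order on top of the files that were already flushed.

```python
def recover_tablet(flushed_files, wal_entries):
    """Rebuild the state of a tablet whose previous server failed.

    flushed_files: sorted lists of (key, value) pairs that were already persisted.
    wal_entries:   (key, value) writes that were logged but not yet flushed,
                   in log order.
    """
    memtable = {}
    for key, value in wal_entries:   # reapply in the original order
        memtable[key] = value        # later writes overwrite earlier ones
    return {
        "files": list(flushed_files),
        "memtable": dict(sorted(memtable.items())),
    }

state = recover_tablet(
    flushed_files=[[("a", "1"), ("b", "2")]],
    wal_entries=[("c", "3"), ("a", "9"), ("c", "4")],
)
print(state["memtable"])
```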

 

2.3.2 Garbage Collector

Accumulo processes will share files stored in HDFS. Periodically, the Garbage Collector will identify files that are no longer needed by any process, and delete them. Multiple garbage collectors can be run to provide hot-standby support. They will perform leader election among themselves to choose a single active instance.
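
The core idea of garbage collection, finding files that nothing references any more, can be sketched as a set difference. The function `collect_garbage` and the file paths are hypothetical and chosen only for illustration; the real process works against HDFS and Accumulo's own metadata.

```python
def collect_garbage(all_files, references):
    """Return the files that no process references any longer (delete candidates).

    all_files:  set of file paths present in storage.
    references: mapping of tablet or process name -> set of files it still uses.
    """
    in_use = set().union(*references.values())
    return sorted(all_files - in_use)

# Hypothetical paths; a file may be shared by more than one tablet.
all_files = {"/t1/f1.rf", "/t1/f2.rf", "/t2/f3.rf", "/t2/f4.rf"}
references = {
    "tablet-1": {"/t1/f2.rf"},
    "tablet-2": {"/t2/f3.rf", "/t1/f2.rf"},
}
print(collect_garbage(all_files, references))
```

Because files are shared, a file becomes a delete candidate only when it is absent from every reference set, not when a single tablet stops using it.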

 

2.3.3 Master

The Accumulo Master is responsible for detecting and responding to TabletServer failure. It tries to balance the load across TabletServers by assigning tablets carefully and instructing TabletServers to unload tablets when necessary. The Master ensures all tablets are assigned to one TabletServer each, and handles table creation, alteration, and deletion requests from clients. The Master also coordinates startup, graceful shutdown and recovery of changes in write-ahead logs when TabletServers fail.
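
The assignment rule, each tablet on exactly one live TabletServer with load spread evenly, can be sketched as follows. The least-loaded-server policy and the function `assign_tablets` are assumptions for illustration; the text above does not specify which balancing algorithm the Master uses.

```python
def assign_tablets(tablets, servers, assignment=None):
    """Give every unassigned tablet to the currently least-loaded live server.

    Existing assignments to servers that are no longer in `servers` are dropped,
    which models reacting to a TabletServer failure.
    """
    assignment = {t: s for t, s in (assignment or {}).items() if s in servers}
    load = {s: 0 for s in servers}
    for server in assignment.values():
        load[server] += 1
    for tablet in tablets:
        if tablet not in assignment:
            # Ties are broken by server name to keep the result deterministic.
            target = min(servers, key=lambda s: (load[s], s))
            assignment[tablet] = target
            load[target] += 1
    return assignment

tablets = ["t1", "t2", "t3", "t4"]
first = assign_tablets(tablets, ["ts1", "ts2"])
print(first)

# ts2 fails and ts3 joins: only the orphaned tablets are reassigned.
second = assign_tablets(tablets, ["ts1", "ts3"], first)
print(second)
```

This sketch only reassigns orphaned tablets. It does not model the Master instructing a TabletServer to unload tablets in order to rebalance.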

Multiple Masters may be run. They elect a single active Master among themselves, and the others act as backups that take over if the active Master fails.
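
A hot-standby election of this kind can be sketched with a simple rule: the live candidate with the lowest sequence number is active. This rule, the function `elect_leader`, and the instance names are assumptions for illustration; the text does not describe the mechanism Accumulo actually uses.

```python
def elect_leader(candidates):
    """Pick the active instance: the live candidate with the lowest sequence number.

    candidates: mapping of instance name -> (sequence_number, is_alive).
    Returns None when no candidate is alive.
    """
    alive = [(seq, name) for name, (seq, is_alive) in candidates.items() if is_alive]
    if not alive:
        return None
    return min(alive)[1]

masters = {"master-a": (1, True), "master-b": (2, True), "master-c": (3, True)}
active = elect_leader(masters)
print(active)

# The active instance fails; the next candidate in line takes over.
masters["master-a"] = (1, False)
backup = elect_leader(masters)
print(backup)
```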

 

2.3.4 Tracer

The Accumulo Tracer process supports the distributed timing API provided by Accumulo. One or more of these processes can be run on a cluster, and they will write the timing information to a given Accumulo table for future reference. See the section on Tracing for more information on this support.

 

2.3.5 Monitor

The Accumulo Monitor is a web application that provides a wealth of information about the state of an instance. The Monitor shows graphs and tables which contain information about read/write rates, cache hit/miss rates, and Accumulo table information such as scan rate and active/queued compactions. Additionally, the Monitor should always be the first point of entry when attempting to debug an Accumulo problem as it will show high-level problems in addition to aggregated errors from all nodes in the cluster. See the section on Monitoring for more information.

Multiple Monitors can be run to provide hot-standby support in the face of failure. Due to the forwarding of logs from remote hosts to the Monitor, only one Monitor process should be active at one time. Leader election will be performed internally to choose the active Monitor.

 

2.3.6 Client

Accumulo includes a client library that is linked to every application. The client library contains logic for finding servers managing a particular tablet, and communicating with TabletServers to write and retrieve key-value pairs.
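
Finding the server that manages a particular tablet amounts to a range lookup on the row. The sketch below assumes each tablet covers rows up to and including an end row and uses binary search; the class `TabletLocator`, the server names, and this in-memory table are hypothetical stand-ins for the client library's real lookup logic.

```python
import bisect

class TabletLocator:
    """Toy lookup table: each tablet covers rows up to and including its end row."""

    def __init__(self, tablets):
        # tablets: list of (end_row, server); an end_row of None marks the last tablet.
        bounded = sorted((end, srv) for end, srv in tablets if end is not None)
        self.end_rows = [end for end, _ in bounded]
        self.servers = [srv for _, srv in bounded]
        self.last_server = next(srv for end, srv in tablets if end is None)

    def locate(self, row):
        """Return the server hosting the tablet whose range contains `row`."""
        i = bisect.bisect_left(self.end_rows, row)
        return self.servers[i] if i < len(self.servers) else self.last_server

locator = TabletLocator([("g", "ts1"), ("p", "ts2"), (None, "ts3")])
located = [locator.locate(r) for r in ["apple", "g", "kiwi", "zebra"]]
print(located)
```

Once the server is known, the client would send its reads and writes for that row directly to it.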

 
