What Is Hadoop?
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. Hadoop includes these subprojects:
* Hadoop Common: The common utilities that support the other Hadoop subprojects.
* Chukwa: A data collection system for managing large distributed systems.
* HBase: A scalable, distributed database that supports structured data storage for large tables.
* HDFS: A distributed file system that provides high-throughput access to application data.
* Hive: A data warehouse infrastructure that provides data summarization and ad hoc querying.
* MapReduce: A software framework for distributed processing of large data sets on compute clusters.
* Pig: A high-level data-flow language and execution framework for parallel computation.
* ZooKeeper: A high-performance coordination service for distributed applications.
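To make the MapReduce entry above more concrete, here is a minimal single-process sketch of the programming model in plain Python: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step combines each group. This is not the Hadoop API. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration, and a real cluster would distribute these steps across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, the job a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["Hadoop stores data", "Hadoop processes data"]))
```

Because each call to `map_phase` looks at one line only and each call to `reduce_phase` looks at one key only, the work can in principle be split across machines, which is the property a distributed framework relies on.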