System environment:
Ubuntu 15.10
Hadoop 2.7.1
Java 1.7.0_79
1. Install SSH and generate a public/private key pair
sudo apt-get install ssh
ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
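Before continuing, it is worth confirming that passwordless login actually works, since start-dfs.sh and start-yarn.sh later ssh into the node. A minimal check (the echo message is purely illustrative):
ssh localhost exit && echo "passwordless SSH OK"
The first connection may prompt you to accept the host key, but it must not ask for a password.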
2. Install the rsync synchronization tool:
sudo apt-get install rsync
3. Download JDK 1.7.0_79
Extract it under /usr/lib/java/:
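A sketch of the extraction step, assuming the Oracle tarball is named jdk-7u79-linux-x64.tar.gz and sits in the current directory (adjust the name to your actual download):
sudo mkdir -p /usr/lib/java
sudo tar -zxvf jdk-7u79-linux-x64.tar.gz -C /usr/lib/java/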
4. Download Hadoop 2.7.1
Extract it under /hadoop:
donald_draper@rain:/hadoop$ tar -zxvf hadoop-2.7.1.tar.gz
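If /hadoop does not exist yet, it has to be created and handed over to the working user first; a sketch, reusing the donald_draper account from the prompt above:
sudo mkdir -p /hadoop
sudo chown -R donald_draper:donald_draper /hadoop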
5. Configure environment variables:
vim ~/.bashrc
Append at the end of the file:
export JAVA_HOME=/usr/lib/java/jdk1.7.0_79
export JRE_HOME=${JAVA_HOME}/jre
export CLASS_PATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/hadoop/hadoop-2.7.1
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}
Save and quit with :wq.
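The variables only take effect in new shells; to apply them to the current one and sanity-check the setup:
source ~/.bashrc
java -version      # should report 1.7.0_79
hadoop version     # should report Hadoop 2.7.1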
6. Configure Hadoop
All of Hadoop 2.7.1's configuration files live under /hadoop/hadoop-2.7.1/etc/hadoop:
cd /hadoop/hadoop-2.7.1/etc/hadoop
1) Edit hadoop-env.sh and set the JDK home directory:
export JAVA_HOME=/usr/lib/java/jdk1.7.0_79
2) Edit core-site.xml:
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://rain:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/hadoop/tmp</value>
    </property>
</configuration>
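hadoop.tmp.dir is the base path under which the NameNode and DataNode keep their data, so it should exist and be writable before HDFS is formatted. Assuming /hadoop is already owned by the working user (see step 4), this is enough:
mkdir -p /hadoop/tmp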
3) Edit hdfs-site.xml. The replication factor is set to 1 because this single-node cluster has only one DataNode:
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
4) Edit mapred-site.xml:
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- JobHistoryServer addresses -->
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>rain:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>rain:19888</value>
    </property>
    <!-- These two directories live on HDFS, so start DFS first, then the history server -->
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/history/indone</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/history/done</value>
    </property>
</configuration>
5) Edit yarn-site.xml. The mapreduce_shuffle auxiliary service lets NodeManagers serve map output to reducers; MapReduce jobs will not run without it:
donald_draper@rain:/hadoop/hadoop-2.7.1/etc/hadoop$ cat yarn-site.xml
<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
6) Edit slaves
The slaves file lists the worker nodes. HDFS is started from the NameNode host and YARN from the ResourceManager host, so the slaves file on the NameNode host names the DataNodes, while the one on the ResourceManager host names the NodeManagers. In this pseudo-distributed setup both roles live on the single host rain:
cd /hadoop/hadoop-2.7.1/etc/hadoop/
vim slaves
rain
7. Format HDFS by running the format command:
bin/hdfs namenode -format
Only do this once; reformatting later wipes the NameNode metadata and with it the contents of HDFS.
8. Start HDFS:
cd /hadoop/hadoop-2.7.1/sbin/
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./start-dfs.sh
Starting namenodes on [rain]
rain: starting namenode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-namenode-rain.out
localhost: starting datanode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-datanode-rain.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-secondarynamenode-rain.out
9. Start the history server
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /hadoop/hadoop-2.7.1/logs/mapred-donald_draper-historyserver-rain.out
10. Start YARN
cd /hadoop/hadoop-2.7.1/sbin/
donald_draper@rain:/hadoop/hadoop-2.7.1/sbin$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /hadoop/hadoop-2.7.1/logs/yarn-donald_draper-resourcemanager-rain.out
localhost: starting nodemanager, logging to /hadoop/hadoop-2.7.1/logs/yarn-donald_draper-nodemanager-rain.out
11. Check that HDFS and YARN are up:
donald_draper@rain:/hadoop/hadoop-2.7.1/logs$ jps
7114 DataNode
7743 NodeManager
8921 Jps
7607 ResourceManager
7319 SecondaryNameNode
8779 JobHistoryServer
6984 NameNode
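If any daemon is missing from the jps output, its log usually explains why. The start scripts print the .out paths; the detailed log4j output goes to the matching .log file, e.g. for the NameNode on this host:
tail -n 50 /hadoop/hadoop-2.7.1/logs/hadoop-donald_draper-namenode-rain.log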
12. Run a job
1) hdfs dfs -mkdir /test
2) hdfs dfs -mkdir /test/input
3) hdfs dfs -put etc/hadoop/*.xml /test/input
4) donald_draper@rain:/hadoop/hadoop-2.7.1$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar grep /test/input /test/output 'dfs[a-z.]+'
Console output (the grep example actually submits two MapReduce jobs: a search job, then a sort job that orders the matches, which is why both job_1471230621598_0001 and job_1471230621598_0002 appear):
16/08/15 11:37:50 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/15 11:37:52 INFO input.FileInputFormat: Total input paths to process : 9
16/08/15 11:37:52 INFO mapreduce.JobSubmitter: number of splits:9
16/08/15 11:37:52 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1471230621598_0001
16/08/15 11:37:53 INFO impl.YarnClientImpl: Submitted application application_1471230621598_0001
16/08/15 11:37:53 INFO mapreduce.Job: The url to track the job: http://rain:8088/proxy/application_1471230621598_0001/
16/08/15 11:37:53 INFO mapreduce.Job: Running job: job_1471230621598_0001
16/08/15 11:38:16 INFO mapreduce.Job: Job job_1471230621598_0001 running in uber mode : false
16/08/15 11:38:16 INFO mapreduce.Job:  map 0% reduce 0%
16/08/15 11:45:11 INFO mapreduce.Job:  map 67% reduce 0%
16/08/15 11:48:06 INFO mapreduce.Job:  map 74% reduce 22%
16/08/15 11:48:22 INFO mapreduce.Job:  map 89% reduce 22%
16/08/15 11:48:23 INFO mapreduce.Job:  map 100% reduce 22%
16/08/15 11:48:49 INFO mapreduce.Job:  map 100% reduce 30%
16/08/15 11:48:51 INFO mapreduce.Job:  map 100% reduce 33%
16/08/15 11:48:54 INFO mapreduce.Job:  map 100% reduce 67%
16/08/15 11:49:03 INFO mapreduce.Job:  map 100% reduce 100%
16/08/15 11:49:25 INFO mapreduce.Job: Job job_1471230621598_0001 completed successfully
16/08/15 11:49:45 INFO mapreduce.Job: Counters: 50
	File System Counters
		FILE: Number of bytes read=51
		FILE: Number of bytes written=1156955
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=28205
		HDFS: Number of bytes written=143
		HDFS: Number of read operations=30
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Killed map tasks=2
		Launched map tasks=11
		Launched reduce tasks=1
		Data-local map tasks=11
		Total time spent by all maps in occupied slots (ms)=3308143
		Total time spent by all reduces in occupied slots (ms)=227199
		Total time spent by all map tasks (ms)=3308143
		Total time spent by all reduce tasks (ms)=227199
		Total vcore-seconds taken by all map tasks=3308143
		Total vcore-seconds taken by all reduce tasks=227199
		Total megabyte-seconds taken by all map tasks=3387538432
		Total megabyte-seconds taken by all reduce tasks=232651776
	Map-Reduce Framework
		Map input records=781
		Map output records=2
		Map output bytes=41
		Map output materialized bytes=99
		Input split bytes=969
		Combine input records=2
		Combine output records=2
		Reduce input groups=2
		Reduce shuffle bytes=99
		Reduce input records=2
		Reduce output records=2
		Spilled Records=4
		Shuffled Maps =9
		Failed Shuffles=0
		Merged Map outputs=9
		GC time elapsed (ms)=213752
		CPU time spent (ms)=39770
		Physical memory (bytes) snapshot=1636868096
		Virtual memory (bytes) snapshot=7041122304
		Total committed heap usage (bytes)=1388314624
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=27236
	File Output Format Counters
		Bytes Written=143
16/08/15 11:49:47 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:48 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:49 INFO ipc.Client: Retrying connect to server: rain/192.168.126.136:45795. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
16/08/15 11:49:50 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
16/08/15 11:50:49 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
16/08/15 11:50:51 INFO input.FileInputFormat: Total input paths to process : 1
16/08/15 11:50:51 INFO mapreduce.JobSubmitter: number of splits:1
16/08/15 11:50:53 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1471230621598_0002
16/08/15 11:50:53 INFO impl.YarnClientImpl: Submitted application application_1471230621598_0002
16/08/15 11:50:53 INFO mapreduce.Job: The url to track the job: http://rain:8088/proxy/application_1471230621598_0002/
16/08/15 11:50:53 INFO mapreduce.Job: Running job: job_1471230621598_0002
16/08/15 11:51:29 INFO mapreduce.Job: Job job_1471230621598_0002 running in uber mode : false
16/08/15 11:51:29 INFO mapreduce.Job:  map 0% reduce 0%
16/08/15 11:51:39 INFO mapreduce.Job:  map 100% reduce 0%
16/08/15 11:51:48 INFO mapreduce.Job:  map 100% reduce 100%
16/08/15 11:51:51 INFO mapreduce.Job: Job job_1471230621598_0002 completed successfully
16/08/15 11:51:51 INFO mapreduce.Job: Counters: 49
	File System Counters
		FILE: Number of bytes read=51
		FILE: Number of bytes written=230397
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=276
		HDFS: Number of bytes written=29
		HDFS: Number of read operations=7
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=6533
		Total time spent by all reduces in occupied slots (ms)=8187
		Total time spent by all map tasks (ms)=6533
		Total time spent by all reduce tasks (ms)=8187
		Total vcore-seconds taken by all map tasks=6533
		Total vcore-seconds taken by all reduce tasks=8187
		Total megabyte-seconds taken by all map tasks=6689792
		Total megabyte-seconds taken by all reduce tasks=8383488
	Map-Reduce Framework
		Map input records=2
		Map output records=2
		Map output bytes=41
		Map output materialized bytes=51
		Input split bytes=133
		Combine input records=0
		Combine output records=0
		Reduce input groups=1
		Reduce shuffle bytes=51
		Reduce input records=2
		Reduce output records=2
		Spilled Records=4
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=59
		CPU time spent (ms)=1660
		Physical memory (bytes) snapshot=467501056
		Virtual memory (bytes) snapshot=1429606400
		Total committed heap usage (bytes)=276299776
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=143
	File Output Format Counters
		Bytes Written=29
The "Retrying connect" lines near the end are harmless here: the finished job's ApplicationMaster has already exited, so the client falls back to the JobHistoryServer for the final status, which is why the history server was started in step 9.
View the results:
5)
donald_draper@rain:/hadoop/hadoop-2.7.1$ hdfs dfs -get /test/output output
16/08/15 11:52:19 WARN hdfs.DFSClient: DFSInputStream has been closed already
16/08/15 11:52:19 WARN hdfs.DFSClient: DFSInputStream has been closed already
6)
donald_draper@rain:/hadoop/hadoop-2.7.1$ cat output/*
1       dfsadmin
1       dfs.replication
Note: the results can also be read directly from HDFS:
hdfs dfs -cat /test/output/*
13. Shut down Hadoop
stop-yarn.sh
mr-jobhistory-daemon.sh stop historyserver
stop-dfs.sh
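A quick check that everything stopped cleanly:
jps    # after the stop scripts finish, only the Jps process itself should remain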
Web UIs:
NameNode: http://192.168.126.136:50070
ResourceManager: http://192.168.126.136:8088
JobHistoryServer: http://192.168.126.136:19888
A related error:
2016-08-15 11:28:50,625 FATAL org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer: Error starting JobHistoryServer
org.apache.hadoop.yarn.webapp.WebAppException: Error starting http server
at org.apache.hadoop.yarn.webapp.WebApps$Builder.start(WebApps.java:279)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService.initializeWebApp(HistoryClientService.java:156)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService.serviceStart(HistoryClientService.java:121)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.serviceStart(JobHistoryServer.java:195)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.launchJobHistoryServer(JobHistoryServer.java:222)
at org.apache.hadoop.mapreduce.v2.hs.JobHistoryServer.main(JobHistoryServer.java:231)
Caused by: java.net.SocketException: Unresolved address
Fix:
Check the history server RPC address (mapreduce.jobhistory.address) and web address (mapreduce.jobhistory.webapp.address) configured in mapred-site.xml; an unresolvable hostname in either value produces this "Unresolved address" failure.
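A quick way to confirm that the hostname used in those values actually resolves (rain is the host from this setup):
getent hosts rain || echo "rain does not resolve; map it in /etc/hosts"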