
[spark-src-core] 4.1 Spark on YARN

 

  According to the official documentation, Spark is a computation framework, i.e. you can run it on any platform that provides a cluster manager (e.g. YARN, Mesos).

  So with YARN as the cluster manager, none of Spark's own daemons (the standalone Master and Workers) are needed to run the application. Feel free to stop all of them.

 



 

  

hadoop@xx:~/spark/spark-1.4.1-bin-hadoop2.4$   ./bin/spark-submit  --class org.apache.spark.examples.JavaWordCount --deploy-mode client --master yarn lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
Spark Command: /usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -Xms6g -Xmx6g -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
========================================
-executed cmd retruned by Main.java:/usr/local/jdk/jdk1.6.0_31/bin/java -cp /home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/conf/:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop/hadoop-2.5.2/etc/hadoop/ -Xms6g -Xmx6g -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master yarn --deploy-mode client --class org.apache.spark.examples.JavaWordCount lib/spark-examples-1.4.1-hadoop2.4.0.jar hdfs://host02:/user/hadoop/input.txt
16/09/27 11:56:38 INFO spark.SparkContext: Running Spark version 1.4.1
16/09/27 11:56:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/09/27 11:56:38 INFO spark.SecurityManager: Changing view acls to: hadoop
16/09/27 11:56:38 INFO spark.SecurityManager: Changing modify acls to: hadoop
16/09/27 11:56:38 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
16/09/27 11:56:39 INFO slf4j.Slf4jLogger: Slf4jLogger started
16/09/27 11:56:39 INFO Remoting: Starting remoting
16/09/27 11:56:39 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.100.4:53607]
16/09/27 11:56:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 53607.
16/09/27 11:56:39 INFO spark.SparkEnv: Registering MapOutputTracker
16/09/27 11:56:39 INFO spark.SparkEnv: Registering BlockManagerMaster
16/09/27 11:56:39 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-9f52040d-df65-4fbe-bfa5-cf5f5bf44310/blockmgr-3580d24a-a56c-4c5e-9df6-2961dcf6aba3
16/09/27 11:56:39 INFO storage.MemoryStore: MemoryStore started with capacity 2.6 GB
16/09/27 11:56:40 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-9f52040d-df65-4fbe-bfa5-cf5f5bf44310/httpd-a03804d5-6a99-4154-ad2f-bb7d62026c14
16/09/27 11:56:40 INFO spark.HttpServer: Starting HTTP Server
16/09/27 11:56:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/09/27 11:56:40 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:34454
16/09/27 11:56:40 INFO util.Utils: Successfully started service 'HTTP file server' on port 34454.
16/09/27 11:56:40 INFO spark.SparkEnv: Registering OutputCommitCoordinator
16/09/27 11:56:40 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/09/27 11:56:40 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:7106
16/09/27 11:56:40 INFO util.Utils: Successfully started service 'SparkUI' on port 7106.
16/09/27 11:56:40 INFO ui.SparkUI: Started SparkUI at http://192.168.100.4:7106
16/09/27 11:56:40 INFO spark.SparkContext: Added JAR file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0.jar at http://192.168.100.4:34454/jars/spark-examples-1.4.1-hadoop2.4.0.jar with timestamp 1474948600441
16/09/27 11:56:40 INFO client.RMProxy: Connecting to ResourceManager at host02/192.168.100.4:8032
16/09/27 11:56:40 INFO yarn.Client: Requesting a new application from cluster with 10 NodeManagers
16/09/27 11:56:40 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
16/09/27 11:56:40 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/09/27 11:56:40 INFO yarn.Client: Setting up container launch context for our AM
16/09/27 11:56:40 INFO yarn.Client: Preparing resources for our AM container
16/09/27 11:56:41 INFO yarn.Client: Uploading resource file:/home/hadoop/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-assembly-1.4.1-hadoop2.4.0.jar -> hdfs://mycluster/user/hadoop/.sparkStaging/application_1441038159113_0028/spark-assembly-1.4.1-hadoop2.4.0.jar
16/09/27 11:56:43 INFO yarn.Client: Uploading resource file:/tmp/spark-9f52040d-df65-4fbe-bfa5-cf5f5bf44310/__hadoop_conf__4006207311540644288.zip -> hdfs://mycluster/user/hadoop/.sparkStaging/application_1441038159113_0028/__hadoop_conf__4006207311540644288.zip
16/09/27 11:56:43 INFO yarn.Client: Setting up the launch environment for our AM container
16/09/27 11:56:43 INFO spark.SecurityManager: Changing view acls to: hadoop
16/09/27 11:56:43 INFO spark.SecurityManager: Changing modify acls to: hadoop
16/09/27 11:56:43 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
16/09/27 11:56:43 INFO yarn.Client: Submitting application 28 to ResourceManager
16/09/27 11:56:43 INFO impl.YarnClientImpl: Submitted application application_1441038159113_0028
16/09/27 11:56:44 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:44 INFO yarn.Client: 
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1474948603234
	 final status: UNDEFINED
	 tracking URL: http://host02:7104/proxy/application_1441038159113_0028/
	 user: hadoop
16/09/27 11:56:45 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:46 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:47 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:48 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:49 INFO yarn.Client: Application report for application_1441038159113_0028 (state: ACCEPTED)
16/09/27 11:56:49 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as AkkaRpcEndpointRef(Actor[akka.tcp://sparkYarnAM@192.168.100.13:41286/user/YarnAM#-2120904576])
16/09/27 11:56:49 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> host02, PROXY_URI_BASES -> http://host02:7104/proxy/application_1441038159113_0028), /proxy/application_1441038159113_0028
16/09/27 11:56:49 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
16/09/27 11:56:50 INFO yarn.Client: Application report for application_1441038159113_0028 (state: RUNNING)
16/09/27 11:56:50 INFO yarn.Client: 
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: 192.168.100.13
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1474948603234
	 final status: UNDEFINED
	 tracking URL: http://host02:7104/proxy/application_1441038159113_0028/
	 user: hadoop
16/09/27 11:56:50 INFO cluster.YarnClientSchedulerBackend: Application application_1441038159113_0028 has started running.
16/09/27 11:56:50 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56308.
16/09/27 11:56:50 INFO netty.NettyBlockTransferService: Server created on 56308
16/09/27 11:56:50 INFO storage.BlockManagerMaster: Trying to register BlockManager
16/09/27 11:56:50 INFO storage.BlockManagerMasterEndpoint: Registering block manager 192.168.100.4:56308 with 2.6 GB RAM, BlockManagerId(driver, 192.168.100.4, 56308)
16/09/27 11:56:50 INFO storage.BlockManagerMaster: Registered BlockManager
16/09/27 11:56:50 INFO scheduler.EventLoggingListener: Logging events to hdfs://host02:8020/user/hadoop/spark-eventlog/application_1441038159113_0028
16/09/27 11:57:01 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@gzsw-05:56319/user/Executor#-661801508]) with ID 1
16/09/27 11:57:01 INFO cluster.YarnClientSchedulerBackend: Registered executor: AkkaRpcEndpointRef(Actor[akka.tcp://sparkExecutor@gzsw-04:43642/user/Executor#599344668]) with ID 2
16/09/27 11:57:01 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
16/09/27 11:57:01 INFO storage.BlockManagerMasterEndpoint: Registering block manager gzsw-05:34014 with 906.2 MB RAM, BlockManagerId(1, gzsw-05, 34014)
16/09/27 11:57:01 INFO storage.BlockManagerMasterEndpoint: Registering block manager gzsw-04:45279 with 906.2 MB RAM, BlockManagerId(2, gzsw-04, 45279)
16/09/27 11:57:01 INFO storage.MemoryStore: ensureFreeSpace(228992) called with curMem=0, maxMem=2778306969
16/09/27 11:57:01 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 223.6 KB, free 2.6 GB)
16/09/27 11:57:01 INFO storage.MemoryStore: ensureFreeSpace(18203) called with curMem=228992, maxMem=2778306969
16/09/27 11:57:01 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 17.8 KB, free 2.6 GB)
16/09/27 11:57:01 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.100.4:56308 (size: 17.8 KB, free: 2.6 GB)
16/09/27 11:57:01 INFO spark.SparkContext: Created broadcast 0 from textFile at JavaWordCount.java:45
16/09/27 11:57:01 INFO mapred.FileInputFormat: Total input paths to process : 1
16/09/27 11:57:01 INFO spark.SparkContext: Starting job: collect at JavaWordCount.java:68
16/09/27 11:57:01 INFO scheduler.DAGScheduler: Registering RDD 3 (mapToPair at JavaWordCount.java:54)
16/09/27 11:57:01 INFO scheduler.DAGScheduler: Got job 0 (collect at JavaWordCount.java:68) with 1 output partitions (allowLocal=false)
16/09/27 11:57:01 INFO scheduler.DAGScheduler: Final stage: ResultStage 1(collect at JavaWordCount.java:68)
16/09/27 11:57:01 INFO scheduler.DAGScheduler: Parents of final stage: List(ShuffleMapStage 0)
16/09/27 11:57:01 INFO scheduler.DAGScheduler: Missing parents: List(ShuffleMapStage 0)
16/09/27 11:57:02 INFO scheduler.DAGScheduler: Submitting ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:54), which has no missing parents
16/09/27 11:57:02 INFO storage.MemoryStore: ensureFreeSpace(4760) called with curMem=247195, maxMem=2778306969
16/09/27 11:57:02 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 4.6 KB, free 2.6 GB)
16/09/27 11:57:02 INFO storage.MemoryStore: ensureFreeSpace(2666) called with curMem=251955, maxMem=2778306969
16/09/27 11:57:02 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.6 KB, free 2.6 GB)
16/09/27 11:57:02 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.100.4:56308 (size: 2.6 KB, free: 2.6 GB)
16/09/27 11:57:02 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:874
16/09/27 11:57:02 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[3] at mapToPair at JavaWordCount.java:54)
16/09/27 11:57:02 INFO cluster.YarnScheduler: Adding task set 0.0 with 1 tasks
16/09/27 11:57:02 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, gzsw-04, RACK_LOCAL, 1474 bytes)
16/09/27 11:57:03 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on gzsw-04:45279 (size: 2.6 KB, free: 906.2 MB)
16/09/27 11:57:03 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on gzsw-04:45279 (size: 17.8 KB, free: 906.2 MB)
16/09/27 11:57:04 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2716 ms on gzsw-04 (1/1)
16/09/27 11:57:04 INFO cluster.YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/09/27 11:57:04 INFO scheduler.DAGScheduler: ShuffleMapStage 0 (mapToPair at JavaWordCount.java:54) finished in 2.728 s
16/09/27 11:57:04 INFO scheduler.DAGScheduler: looking for newly runnable stages
16/09/27 11:57:04 INFO scheduler.DAGScheduler: running: Set()
16/09/27 11:57:04 INFO scheduler.DAGScheduler: waiting: Set(ResultStage 1)
16/09/27 11:57:04 INFO scheduler.DAGScheduler: failed: Set()
16/09/27 11:57:04 INFO scheduler.DAGScheduler: Missing parents for ResultStage 1: List()
16/09/27 11:57:04 INFO scheduler.DAGScheduler: Submitting ResultStage 1 (ShuffledRDD[4] at reduceByKey at JavaWordCount.java:61), which is now runnable
16/09/27 11:57:04 INFO storage.MemoryStore: ensureFreeSpace(2408) called with curMem=254621, maxMem=2778306969
16/09/27 11:57:04 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.4 KB, free 2.6 GB)
16/09/27 11:57:04 INFO storage.MemoryStore: ensureFreeSpace(1458) called with curMem=257029, maxMem=2778306969
16/09/27 11:57:04 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1458.0 B, free 2.6 GB)
16/09/27 11:57:04 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.100.4:56308 (size: 1458.0 B, free: 2.6 GB)
16/09/27 11:57:04 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:874
16/09/27 11:57:04 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (ShuffledRDD[4] at reduceByKey at JavaWordCount.java:61)
16/09/27 11:57:04 INFO cluster.YarnScheduler: Adding task set 1.0 with 1 tasks
16/09/27 11:57:04 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, gzsw-05, PROCESS_LOCAL, 1243 bytes)
16/09/27 11:57:06 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gzsw-05:34014 (size: 1458.0 B, free: 906.2 MB)
16/09/27 11:57:06 INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 0 to gzsw-05:56319
16/09/27 11:57:06 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 136 bytes
16/09/27 11:57:06 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 1859 ms on gzsw-05 (1/1)
16/09/27 11:57:06 INFO scheduler.DAGScheduler: ResultStage 1 (collect at JavaWordCount.java:68) finished in 1.860 s
16/09/27 11:57:06 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool 
16/09/27 11:57:06 INFO scheduler.DAGScheduler: Job 0 finished: collect at JavaWordCount.java:68, took 4.736512 s
are: 1
back: 1
is: 3
ERROR: 1
a: 2
on: 1
content: 2
bad: 2
with: 1
some: 1
INFO: 4
to: 1
: 2
This: 3
more: 1
message: 1
More: 1
thing: 1
warning: 1
WARN: 2
normal: 1
Something: 1
happened: 1
other: 1
messages: 2
details: 1
the: 1
Here: 1
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
16/09/27 11:57:06 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
16/09/27 11:57:06 INFO ui.SparkUI: Stopped Spark web UI at http://192.168.100.4:7106
16/09/27 11:57:06 INFO scheduler.DAGScheduler: Stopping DAGScheduler
16/09/27 11:57:06 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
16/09/27 11:57:06 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
16/09/27 11:57:06 INFO cluster.YarnClientSchedulerBackend: Asking each executor to shut down
16/09/27 11:57:06 INFO cluster.YarnClientSchedulerBackend: Stopped
16/09/27 11:57:07 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/09/27 11:57:07 INFO util.Utils: path = /tmp/spark-9f52040d-df65-4fbe-bfa5-cf5f5bf44310/blockmgr-3580d24a-a56c-4c5e-9df6-2961dcf6aba3, already present as root for deletion.
16/09/27 11:57:07 INFO storage.MemoryStore: MemoryStore cleared
16/09/27 11:57:07 INFO storage.BlockManager: BlockManager stopped
16/09/27 11:57:07 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
16/09/27 11:57:07 INFO spark.SparkContext: Successfully stopped SparkContext
16/09/27 11:57:07 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/09/27 11:57:07 INFO util.Utils: Shutdown hook called
16/09/27 11:57:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/09/27 11:57:07 INFO util.Utils: Deleting directory /tmp/spark-9f52040d-df65-4fbe-bfa5-cf5f5bf44310
16/09/27 11:57:07 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
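
One figure in the log worth unpacking is the yarn.Client line "Will allocate AM container, with 896 MB memory including 384 MB overhead". The sketch below reproduces that number. It is only an illustration under assumptions: the 512 MB base is taken to be the default of spark.yarn.am.memory in client mode, and the overhead rule (the larger of 10% of the base and 384 MB) is what Spark 1.4 is believed to apply. The class and method names are hypothetical and are not Spark APIs.

```java
public class AmContainerMemory {
    // Assumed Spark 1.4 defaults; these values do not appear in the log itself.
    static final int DEFAULT_AM_MEMORY_MB = 512;  // assumed default of spark.yarn.am.memory
    static final double OVERHEAD_FACTOR = 0.10;   // assumed overhead fraction
    static final int OVERHEAD_MIN_MB = 384;       // assumed minimum overhead

    // Overhead is the larger of a fraction of the AM memory and a fixed floor.
    static int overheadMb(int amMemoryMb) {
        return Math.max((int) (OVERHEAD_FACTOR * amMemoryMb), OVERHEAD_MIN_MB);
    }

    // Total size requested from YARN for the AM container.
    static int containerMb(int amMemoryMb) {
        return amMemoryMb + overheadMb(amMemoryMb);
    }

    public static void main(String[] args) {
        int overhead = overheadMb(DEFAULT_AM_MEMORY_MB);
        int total = containerMb(DEFAULT_AM_MEMORY_MB);
        // With the assumed defaults: 512 + max(51, 384) = 896, matching the log.
        System.out.println("AM container: " + total + " MB including " + overhead + " MB overhead");
    }
}
```

Under these assumptions the 896 MB request fits well inside the 8192 MB per-container maximum that the same log reports, which is why the check on the preceding line passes.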

 

ref:

[spark-src]-source reading
