
Spark(6)Upgrade to 1.0.2 Version again with YARN

 

Download the prebuilt package for Hadoop 2
>wget http://d3kbcqa49mib13.cloudfront.net/spark-1.0.2-bin-hadoop2.tgz

Set up Hadoop 2 and get it running on my VMs
http://sillycat.iteye.com/blog/2090186

Put the test file on HDFS
>hdfs dfs -mkdir /user/sillycat
>hdfs dfs -put /opt/spark/log.txt /user/sillycat/

Start the Spark shell against the standalone master and read the file.
>MASTER=spark://ubuntu-master1:7077 bin/spark-shell
>val file = sc.textFile("hdfs://ubuntu-master1:9000/user/sillycat/log.txt")
>file.first()

Error Message:
Server IPC version 9 cannot communicate with client version 4

Solution:
Version mismatch: I was using the Spark build for Hadoop 1 to connect to Hadoop 2.4.1. Use the hadoop2 build downloaded above instead.

With that build it works in the shell.
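
To catch this kind of mismatch before starting the shell, the Hadoop label can be read from the name of the downloaded package. This is only a sketch: `spark_hadoop_label` is a hypothetical helper, and it assumes the package keeps the `-bin-<label>.tgz` naming used by the download above.

```shell
#!/bin/sh
# Hypothetical helper: extract the Hadoop build label from a Spark
# prebuilt package name such as spark-1.0.2-bin-hadoop2.tgz.
spark_hadoop_label() {
    name=$(basename "$1")
    name=${name%.tgz}                 # drop the archive extension
    case "$name" in
        *-bin-*) printf '%s\n' "${name##*-bin-}" ;;   # keep what follows "-bin-"
        *)       return 1 ;;                          # not a prebuilt package name
    esac
}

# Print the label of the package downloaded above.
spark_hadoop_label spark-1.0.2-bin-hadoop2.tgz
```

A label of hadoop2 is what a Hadoop 2.4.1 cluster needs. The function only inspects the file name, so it cannot detect a renamed package.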

Next, start HDFS, YARN and the job history server from the Hadoop directory.
>sbin/start-dfs.sh
>sbin/start-yarn.sh
>sbin/mr-jobhistory-daemon.sh start historyserver

YARN is running now. The web consoles are available at these URLs:
http://ubuntu-master1:50070/dfshealth.html#tab-overview
http://ubuntu-master1:8088/cluster/nodes
http://ubuntu-master1:19888/jobhistory
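
A quick way to check the three consoles from a terminal is to generate the URLs and probe each one. The sketch below assumes the ports listed above. `hadoop_ui_urls` is a hypothetical helper, and the probing loop is left as a comment because it needs `curl` and a reachable cluster.

```shell
#!/bin/sh
# Hypothetical helper: list the web console URLs for a given master host,
# using the ports shown above (50070 HDFS, 8088 YARN, 19888 job history).
hadoop_ui_urls() {
    host=$1
    echo "http://$host:50070/dfshealth.html"
    echo "http://$host:8088/cluster/nodes"
    echo "http://$host:19888/jobhistory"
}

# Usage (needs curl and a running cluster): print the HTTP status per URL.
# hadoop_ui_urls ubuntu-master1 | while read -r url; do
#     printf '%s %s\n' "$(curl -s -o /dev/null -w '%{http_code}' "$url")" "$url"
# done
hadoop_ui_urls ubuntu-master1
```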

Running Spark Shell on YARN
Set HADOOP_CONF_DIR in the Spark configuration so that Spark can find the HDFS and YARN settings
HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop

>MASTER=yarn-client bin/spark-shell
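
If HADOOP_CONF_DIR points at the wrong place, the shell cannot find the cluster settings, so a small check before launching can save time. This is a sketch with assumptions: `check_hadoop_conf_dir` is a hypothetical helper, and it assumes `core-site.xml` and `yarn-site.xml` are the files that matter on the client side.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the directory exists and
# contains the client-side files (assumed: core-site.xml, yarn-site.xml).
check_hadoop_conf_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    for f in core-site.xml yarn-site.xml; do
        [ -f "$dir/$f" ] || { echo "missing $dir/$f" >&2; return 1; }
    done
}

# Usage: only launch the shell if the check passes.
# check_hadoop_conf_dir "${HADOOP_CONF_DIR:-/opt/hadoop/etc/hadoop}" \
#     && MASTER=yarn-client bin/spark-shell
```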

Submit the job as follows
>bin/spark-submit --class com.cloudera.sparkwordcount.SparkWordCount --master yarn /Users/carl/work/current/simplesparkapp/target/sparkwordcount-0.0.1-SNAPSHOT.jar hdfs://ubuntu-master1:9000/user/sillycat/log.txt 2

Spark works great on YARN
>bin/spark-submit --class com.sillycat.spark.app.ClusterComplexJob --master yarn /Users/carl/work/sillycat/sillycat-spark/target/scala-2.10/sillycat-spark-assembly-1.0.jar book1

The sample project is in sillycat-spark

The standalone cluster works as well
>bin/spark-submit --class com.sillycat.spark.app.FindWordJob --master spark://ubuntu-master1:7077 /Users/carl/work/sillycat/sillycat-spark/target/scala-2.10/sillycat-spark-assembly-1.0.jar book1

The commands to start the master and a worker (slave)
>sbin/start-master.sh 
>bin/spark-class org.apache.spark.deploy.worker.Worker spark://ubuntu-master1:7077

Configuration on master1
>cat conf/spark-env.sh
#!/usr/bin/env bash

export SPARK_LOCAL_IP=ubuntu-master1

#export SPARK_EXECUTOR_MEMORY=1G

export SPARK_MASTER_IP=ubuntu-master1

export SPARK_WORKER_MEMORY=1024M

Configuration on slave1
>cat conf/spark-env.sh
#!/usr/bin/env bash

export SPARK_LOCAL_IP=ubuntu-slave1

#export SPARK_EXECUTOR_MEMORY=1G

export SPARK_MASTER_IP=ubuntu-master1

export SPARK_WORKER_MEMORY=1024M
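
The two files differ only in SPARK_LOCAL_IP, so they can be produced from one template. The sketch below is one way to do that. `render_spark_env` is a hypothetical helper, and it leaves out the commented SPARK_EXECUTOR_MEMORY line.

```shell
#!/bin/sh
# Hypothetical generator: print a spark-env.sh for one node.
#   $1 = this node's hostname, $2 = master hostname, $3 = worker memory
render_spark_env() {
    cat <<EOF
#!/usr/bin/env bash

export SPARK_LOCAL_IP=$1
export SPARK_MASTER_IP=$2
export SPARK_WORKER_MEMORY=$3
EOF
}

# Print the file for slave1; redirect the output to conf/spark-env.sh on that node.
render_spark_env ubuntu-slave1 ubuntu-master1 1024M
```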

Tips
Running a Spark job in local mode
>bin/spark-submit --class com.cloudera.sparkwordcount.SparkWordCount --master local /Users/carl/work/current/simplesparkapp/target/sparkwordcount-0.0.1-SNAPSHOT.jar /opt/spark/README.md 2

Error Message:
java.lang.OutOfMemoryError: Java heap space

Solution:
Change the memory configuration in bin/spark-class
>vi bin/spark-class
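
As an alternative to editing the script, the heap can be set per job on the command line. The sketch below assumes the `--driver-memory` option of spark-submit applies here, since in local mode the job runs inside the driver JVM. `DRIVER_MEM` and `submit_cmd` are hypothetical names, and 2g is an arbitrary example value, not a tested setting.

```shell
#!/bin/sh
# Sketch: build the spark-submit command line with an explicit driver heap.
# DRIVER_MEM is a hypothetical variable; 2g is an arbitrary example value.
DRIVER_MEM=${DRIVER_MEM:-2g}

# Dry run: prints the command instead of executing it.
submit_cmd() {
    echo bin/spark-submit \
        --class com.cloudera.sparkwordcount.SparkWordCount \
        --master local \
        --driver-memory "$DRIVER_MEM" \
        "$@"
}

submit_cmd /Users/carl/work/current/simplesparkapp/target/sparkwordcount-0.0.1-SNAPSHOT.jar /opt/spark/README.md 2
```

Printing the command first makes it easy to confirm the memory flag is in place before running it from the Spark directory.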


References:
http://spark.apache.org/docs/latest/running-on-yarn.html
https://github.com/snowplow/spark-example-project

http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
http://parambirs.wordpress.com/2014/05/20/running-spark-1-0-0-snapshot-on-hadoopyarn-2-4-0/
http://parambirs.wordpress.com/2014/05/20/install-hadoopyarn-2-4-0-on-ubuntu-virtualbox/
http://parambirs.wordpress.com/2014/05/20/building-and-running-spark-1-0-0-snapshot-on-ubuntu/
