Data Solution 2019(12)Flink Processing Data

sillycat

Master and Slaves Mode
> java -version
java version "1.8.0_221"

Start from here
https://ci.apache.org/projects/flink/flink-docs-release-1.9/getting-started/tutorials/local_setup.html
https://juejin.im/post/5d6610c65188257573636a86

> wget https://www-eu.apache.org/dist/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.12.tgz
> tar zxvf flink-1.9.1-bin-scala_2.12.tgz
> mv flink-1.9.1 ~/tool/
> sudo ln -s /home/carl/tool/flink-1.9.1 /opt/flink-1.9.1
> sudo ln -s /opt/flink-1.9.1 /opt/flink
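
The download and symlink steps above can be collected into a small script. This is only a sketch, assuming the same mirror URL pattern and the ~/tool and /opt layout used here. The function names flink_tarball and flink_url are hypothetical; they only build strings, and the real commands are left in comments.

```shell
#!/bin/sh
# Hypothetical helpers: they only build names, nothing is downloaded here.
FLINK_VERSION=1.9.1
SCALA_VERSION=2.12

# Name of the binary tarball for a Flink/Scala version pair.
flink_tarball() {
  echo "flink-$1-bin-scala_$2.tgz"
}

# Download URL, assuming the same mirror layout as the wget above.
flink_url() {
  echo "https://www-eu.apache.org/dist/flink/flink-$1/$(flink_tarball "$1" "$2")"
}

echo "tarball: $(flink_tarball "$FLINK_VERSION" "$SCALA_VERSION")"
echo "url:     $(flink_url "$FLINK_VERSION" "$SCALA_VERSION")"

# The real steps would then be (assuming $HOME is /home/carl):
#   wget "$(flink_url "$FLINK_VERSION" "$SCALA_VERSION")"
#   tar zxvf "$(flink_tarball "$FLINK_VERSION" "$SCALA_VERSION")"
#   mv "flink-$FLINK_VERSION" ~/tool/
#   sudo ln -s "$HOME/tool/flink-$FLINK_VERSION" "/opt/flink-$FLINK_VERSION"
#   sudo ln -s "/opt/flink-$FLINK_VERSION" /opt/flink
```

Keeping the version in one variable makes it easier to repeat the same steps on the worker machine.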

Start in standalone mode
> bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host rancher-home.
Starting taskexecutor daemon on host rancher-home.

Visit the UI Page
http://rancher-home:8081/#/overview

Add this to PATH
PATH=$PATH:/opt/flink/bin
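
If this line goes into a shell profile, it may run more than once and add duplicate entries. A minimal sketch of an idempotent variant; path_append is a hypothetical helper name, not something Flink ships.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already present, do nothing
    *) PATH="$PATH:$1" ;;
  esac
}

path_append /opt/flink/bin
path_append /opt/flink/bin       # second call is a no-op
export PATH
```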

Submit a job to the single node
> flink run -m rancher-home:8081 ./examples/batch/WordCount.jar --input ./README.txt

Do the same download and PATH setup on the worker machine as well
> wget https://www-eu.apache.org/dist/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.12.tgz
> sudo ln -s /opt/flink-1.9.1 /opt/flink
> cd /opt/flink

Try to join the cluster as a task manager
> bin/jobmanager.sh start rancher-home
Starting standalonesession daemon on host rancher-worker1.

> bin/taskmanager.sh start
Starting taskexecutor daemon on host rancher-worker1.

> jps
13617 StandaloneSessionClusterEntrypoint
14388 Jps
14312 TaskManagerRunner

No, this does not work.
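
One likely reason, judging from the jps output: bin/jobmanager.sh start launched a second JobManager (StandaloneSessionClusterEntrypoint) on the worker instead of pointing the worker at rancher-home. A sketch of the alternative, assuming the worker's conf/flink-conf.yaml contains a jobmanager.rpc.address line, as in the configuration shown later in this post. set_jobmanager is a hypothetical helper, and this was not verified on the 1.9.1 cluster.

```shell
# Hypothetical helper: rewrite jobmanager.rpc.address in a flink-conf.yaml.
# usage: set_jobmanager <conf-file> <master-host>
set_jobmanager() {
  # Write to a temp file first so the edit works with any sed (no -i needed).
  sed "s/^jobmanager\.rpc\.address:.*/jobmanager.rpc.address: $2/" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}

# On the worker, the steps would then be (not run here):
#   set_jobmanager /opt/flink/conf/flink-conf.yaml rancher-home
#   bin/taskmanager.sh start      # start only the TaskManager, no JobManager
```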

Check whether Zeppelin can connect to my cluster
https://zeppelin.apache.org/docs/0.5.5-incubating/interpreter/flink.html

Error
INFO [2019-10-30 23:20:15,968] ({flink-akka.actor.default-dispatcher-3} JobClientActor.java[tryToSubmitJob]:406) - Could not submit job Flink Java Job at Wed Oct 30 23:20:13 CDT 2019 (2c6bcfffb9d3bc0f5c12a72e16797080), because there is no connection to a JobManager.

Solution:
https://stackoverflow.com/questions/52274020/apache-zeppelin-flink-interpretor-can-not-connect-flink-1-5-2
It seems to be a supported-version issue.

Check the libraries here
/opt/zeppelin/interpreter/flink
Maybe the version is just too low
flink-java-1.1.3.jar
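
To see which Flink version the Zeppelin interpreter bundles, the version can be read from the jar file names. A small sketch; jar_version is a hypothetical helper, and the directory is the one listed above.

```shell
# Hypothetical helper: extract the trailing version from a jar file name.
jar_version() {
  echo "$1" | sed -E 's/^.*-([0-9]+(\.[0-9]+)*)\.jar$/\1/'
}

jar_version flink-java-1.1.3.jar

# Against a real install (not run here):
#   for jar in /opt/zeppelin/interpreter/flink/flink-*.jar; do
#     echo "$(basename "$jar") -> $(jar_version "$(basename "$jar")")"
#   done
```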

Some explanation here
https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/setup/deployment/flink_and_spark_cluster.html

Build Zeppelin
> git clone https://github.com/apache/zeppelin.git
Build command
> mvn clean package -DskipTests -Pspark-2.4 -Dflink.version=1.9.1 -Pscala-2.12

How to build
https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/setup/basics/how_to_build.html
> mvn clean package -Pbuild-distr -Pspark-2.3 -Dflink.version=1.9.1 -Phadoop3 -Pscala-2.11

Build on CentOS7
> mvn clean package -Pbuild-distr -Pspark-2.3 -Dflink.version=1.9.1 -Phadoop3 -Pscala-2.11 -rf :zengine-plugins-parent

Here are the commands to use if the mvn build fails with packages not found
> mvn install:install-file -DgroupId=org.jetbrains.pty4j -DartifactId=pty4j -Dversion=0.9.3 -Dpackaging=jar -Dfile=/home/carl/install/pty4j-0.9.3.jar

com.google.code.findbugs:jsr305:1.3.9

> mvn install:install-file -DgroupId=com.google.code.findbugs -DartifactId=jsr305 -Dversion=1.3.9 -Dpackaging=jar -Dfile=/home/carl/install/jsr305-1.3.9.jar

com.google.code.findbugs:jsr305:3.0.0

> mvn install:install-file -DgroupId=com.google.code.findbugs -DartifactId=jsr305 -Dversion=3.0.0 -Dpackaging=jar -Dfile=/home/carl/install/jsr305-3.0.0.jar
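
The three install commands follow one pattern, so they can be generated from a list of Maven coordinates. A sketch only: install_cmd is a hypothetical helper that prints the command instead of running it, and JAR_DIR is assumed to be the directory used above.

```shell
JAR_DIR=/home/carl/install

# Print the mvn install:install-file command for a group:artifact:version triple.
install_cmd() {
  group=${1%%:*}            # text before the first colon
  rest=${1#*:}              # artifact:version
  artifact=${rest%%:*}
  version=${rest#*:}
  echo "mvn install:install-file -DgroupId=$group -DartifactId=$artifact" \
       "-Dversion=$version -Dpackaging=jar -Dfile=$JAR_DIR/$artifact-$version.jar"
}

for coord in org.jetbrains.pty4j:pty4j:0.9.3 \
             com.google.code.findbugs:jsr305:1.3.9 \
             com.google.code.findbugs:jsr305:3.0.0; do
  install_cmd "$coord"      # pipe the output to sh to actually run it
done
```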

After build, the binary should be here
/Users/hluo/install/zeppelin/zeppelin-distribution/target/zeppelin-0.9.0-SNAPSHOT.tar.gz

That is quite unstable, so I list the tags
> git tag
v0.8.1-docker
v0.8.1-rc1
v0.8.2
v0.8.2-docker
v0.8.2-rc1

> git checkout v0.8.2

It does not work well. I will downgrade the Flink version and try again.

Related versions and other software
https://flink.apache.org/ecosystem.html

This old version may work
> wget https://archive.apache.org/dist/flink/flink-1.1.3/flink-1.1.3-bin-hadoop2-scala_2.11.tgz
> sudo ln -s /home/carl/tool/flink-1.1.3 /opt/flink-1.1.3
> sudo ln -s /opt/flink-1.1.3 /opt/flink

Use the 1.9.1 configuration as a reference
On Master
> cat conf/flink-conf.yaml
jobmanager.rpc.address: rancher-home
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 2
parallelism.default: 2

> cat conf/masters
rancher-home:8081

Make sure slaves is empty
> cat conf/slaves

Start master
> bin/start-cluster.sh
Starting cluster.
Starting jobmanager daemon on host rancher-home.

> jps
7536 Jps
7427 JobManager

Visit the console UI
http://rancher-home:8081/#/overview

On the Slave Machine
> wget https://archive.apache.org/dist/flink/flink-1.1.3/flink-1.1.3-bin-hadoop2-scala_2.11.tgz
> sudo ln -s /home/carl/tool/flink-1.1.3 /opt/flink-1.1.3
> sudo ln -s /opt/flink-1.1.3 /opt/flink

Check the config
> cat conf/masters
rancher-home:8081

> cat conf/slaves

> cat conf/flink-conf.yaml
jobmanager.rpc.address: rancher-home
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 1024
taskmanager.numberOfTaskSlots: 2
parallelism.default: 2
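
The worker can only join if its jobmanager.rpc.address and jobmanager.rpc.port match the master's. A small sketch for reading a single key out of flink-conf.yaml so the two machines can be compared; conf_get is a hypothetical helper and assumes the simple "key: value" lines shown above.

```shell
# Hypothetical helper: print the value of a key from a flink-conf.yaml file.
# usage: conf_get <conf-file> <key>
conf_get() {
  awk -F': *' -v key="$2" '$1 == key { print $2 }' "$1"
}

# On each machine (not run here):
#   conf_get /opt/flink/conf/flink-conf.yaml jobmanager.rpc.address
#   conf_get /opt/flink/conf/flink-conf.yaml jobmanager.rpc.port
```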

Start the Service
> bin/taskmanager.sh start
Starting taskmanager daemon on host rancher-worker1.

> jps
6632 TaskManager
6703 Jps
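
The jps check can also be scripted, for example before submitting a job. A minimal sketch; has_daemon is a hypothetical helper that reads a jps listing on stdin, and the sample input is copied from the output above.

```shell
# Hypothetical helper: succeed if the jps listing on stdin contains a process name.
# usage: jps | has_daemon TaskManager
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample listing copied from the jps output above.
printf '6632 TaskManager\n6703 Jps\n' | has_daemon TaskManager &&
  echo "TaskManager is running"
```

On the master, the same check would use JobManager as the process name.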

Refresh the console UI; we can see that the 2 TaskManagers have joined the cluster
http://rancher-home:8081/#/overview

Run a local test; it works well.
> flink run -m rancher-home:6123 ./examples/batch/WordCount.jar --input ./README.txt

Zeppelin configuration is as follows:
Host: rancher-home
Port: 6123

Zeppelin 0.8.2 works well with Flink Cluster 1.1.3

References:
https://flink.apache.org/zh/usecases.html
https://flink.apache.org/
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/cluster_setup.html
http://wuchong.me/blog/2018/11/07/5-minutes-build-first-flink-application/
https://www.infoq.cn/article/zbBAGroBgtytDiBs*Xq9

Installation
https://www.cnblogs.com/frankdeng/p/9400627.html
https://juejin.im/post/5d6610c65188257573636a86

2019-11-02 02:15