Data Solution 2019(12)Flink Processing Data
Master and Slaves Mode
> java -version
java version "1.8.0_221"
Start from here
https://ci.apache.org/projects/flink/flink-docs-release-1.9/getting-started/tutorials/local_setup.html
https://juejin.im/post/5d6610c65188257573636a86
> wget https://www-eu.apache.org/dist/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.12.tgz
> tar zxvf flink-1.9.1-bin-scala_2.12.tgz
> mv flink-1.9.1 ~/tool/
> sudo ln -s /home/carl/tool/flink-1.9.1 /opt/flink-1.9.1
> sudo ln -s /opt/flink-1.9.1 /opt/flink
Start in standalone mode
> bin/start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host rancher-home.
Starting taskexecutor daemon on host rancher-home.
Visit the UI Page
http://rancher-home:8081/#/overview
Add this to PATH
PATH=$PATH:/opt/flink/bin
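To keep this PATH change from piling up duplicates when the profile is sourced more than once, a small guard could be used. This is only a sketch; `add_to_path` is a hypothetical helper name and not part of Flink.

```shell
# Append a directory to PATH only if it is not already present.
# add_to_path is a hypothetical helper name, not something Flink ships.
add_to_path() {
  case ":$PATH:" in
    *":$1:"*) ;;               # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;      # otherwise append it
  esac
  export PATH
}

add_to_path /opt/flink/bin
```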
Submit the task to single node
> flink run -m rancher-home:8081 ./examples/batch/WordCount.jar --input ./README.txt
Download Flink and set up the same paths on the worker machine as well
> wget https://www-eu.apache.org/dist/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.12.tgz
> sudo ln -s /opt/flink-1.9.1 /opt/flink
> cd /opt/flink
Try to join the cluster as a task manager
> bin/jobmanager.sh start rancher-home
Starting standalonesession daemon on host rancher-worker1.
> bin/taskmanager.sh start
Starting taskexecutor daemon on host rancher-worker1.
> jps
13617 StandaloneSessionClusterEntrypoint
14388 Jps
14312 TaskManagerRunner
No, this does not work.
Try to connect Zeppelin to my cluster
https://zeppelin.apache.org/docs/0.5.5-incubating/interpreter/flink.html
Error
INFO [2019-10-30 23:20:15,968] ({flink-akka.actor.default-dispatcher-3} JobClientActor.java[tryToSubmitJob]:406) - Could not submit job Flink Java Job at Wed Oct 30 23:20:13 CDT 2019 (2c6bcfffb9d3bc0f5c12a72e16797080), because there is no connection to a JobManager.
Solution:
https://stackoverflow.com/questions/52274020/apache-zeppelin-flink-interpretor-can-not-connect-flink-1-5-2
It seems to be a version compatibility issue.
Check the libraries here
/opt/zeppelin/interpreter/flink
Maybe the version is just too low
flink-java-1.1.3.jar
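A quick way to see which Flink client version the interpreter bundles is to read it off the jar file name. A minimal sketch, assuming the jar follows the `flink-java-<version>.jar` naming seen above; `jar_version` is a hypothetical helper.

```shell
# jar_version is a hypothetical helper: it strips the directory, the
# flink-java- prefix and the .jar suffix from a path such as
# /opt/zeppelin/interpreter/flink/flink-java-1.1.3.jar.
jar_version() {
  base=${1##*/}          # drop any leading directory
  base=${base%.jar}      # drop the extension
  printf '%s\n' "${base#flink-java-}"
}

# Print the bundled client version for every matching jar, if any exist.
for jar in /opt/zeppelin/interpreter/flink/flink-java-*.jar; do
  if [ -e "$jar" ]; then
    echo "bundled Flink client: $(jar_version "$jar")"
  fi
done
```

If the printed version is far from the cluster version, the connection error above is a likely result.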
Some explanation here
https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/setup/deployment/flink_and_spark_cluster.html
Build Zeppelin
> git clone https://github.com/apache/zeppelin.git
Build command
> mvn clean package -DskipTests -Pspark-2.4 -Dflink.version=1.9.1 -Pscala-2.12
How to build
https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/setup/basics/how_to_build.html
> mvn clean package -Pbuild-distr -Pspark-2.3 -Dflink.version=1.9.1 -Phadoop3 -Pscala-2.11
Build on CentOS7
> mvn clean package -Pbuild-distr -Pspark-2.3 -Dflink.version=1.9.1 -Phadoop3 -Pscala-2.11 -rf :zengine-plugins-parent
Here are the commands to use if the Maven build fails with packages not found
>mvn install:install-file -DgroupId=org.jetbrains.pty4j -DartifactId=pty4j -Dversion=0.9.3 -Dpackaging=jar -Dfile=/home/carl/install/pty4j-0.9.3.jar
com.google.code.findbugs:jsr305:1.3.9
>mvn install:install-file -DgroupId=com.google.code.findbugs -DartifactId=jsr305 -Dversion=1.3.9 -Dpackaging=jar -Dfile=/home/carl/install/jsr305-1.3.9.jar
com.google.code.findbugs:jsr305:3.0.0
>mvn install:install-file -DgroupId=com.google.code.findbugs -DartifactId=jsr305 -Dversion=3.0.0 -Dpackaging=jar -Dfile=/home/carl/install/jsr305-3.0.0.jar
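The three commands above share one pattern, so they can be generated from the Maven coordinates. A sketch only; `install_cmd` is a hypothetical helper, and it assumes the jars were downloaded as `<artifact>-<version>.jar` into one directory, as in the paths above. It prints the commands instead of running them.

```shell
# install_cmd is a hypothetical helper: given group:artifact:version and
# the directory holding the downloaded jar, print the matching
# `mvn install:install-file` command.
install_cmd() {
  coord=$1
  dir=$2
  group=${coord%%:*}       # text before the first colon
  rest=${coord#*:}         # artifact:version
  artifact=${rest%%:*}
  version=${rest#*:}
  printf 'mvn install:install-file -DgroupId=%s -DartifactId=%s -Dversion=%s -Dpackaging=jar -Dfile=%s/%s-%s.jar\n' \
    "$group" "$artifact" "$version" "$dir" "$artifact" "$version"
}

for c in org.jetbrains.pty4j:pty4j:0.9.3 \
         com.google.code.findbugs:jsr305:1.3.9 \
         com.google.code.findbugs:jsr305:3.0.0; do
  install_cmd "$c" /home/carl/install
done
```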
After build, the binary should be here
/Users/hluo/install/zeppelin/zeppelin-distribution/target/zeppelin-0.9.0-SNAPSHOT.tar.gz
That is quite unstable, so I list the tags
> git tag
v0.8.1-docker
v0.8.1-rc1
v0.8.2
v0.8.2-docker
v0.8.2-rc1
> git checkout v0.8.2
It does not work well. I will downgrade the Flink version and try again.
Related versions and other software
https://flink.apache.org/ecosystem.html
This old version may work
> wget https://archive.apache.org/dist/flink/flink-1.1.3/flink-1.1.3-bin-hadoop2-scala_2.11.tgz
> sudo ln -s /home/carl/tool/flink-1.1.3 /opt/flink-1.1.3
> sudo ln -s /opt/flink-1.1.3 /opt/flink
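One thing to watch when switching versions: if /opt/flink still exists as a link to the 1.9.1 directory, a plain `ln -s` creates the new link inside that directory instead of replacing it. A sketch of a safer switch, where `switch_link` is a hypothetical helper name:

```shell
# switch_link is a hypothetical helper: point link $2 at target $1,
# replacing the link if it already exists.
# -f removes the old link; -n stops ln from following it into the
# directory it currently points to.
switch_link() {
  ln -sfn "$1" "$2"
}

# Usage (needs sudo for paths under /opt):
#   switch_link /opt/flink-1.1.3 /opt/flink
```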
Take references from 1.9.1 configuration
On Master
> cat conf/flink-conf.yaml
jobmanager.rpc.address: rancher-home
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.heap.size: 1024m
taskmanager.numberOfTaskSlots: 2
parallelism.default: 2
> cat conf/masters
rancher-home:8081
Make sure slaves is empty
> cat conf/slaves
Start master
> bin/start-cluster.sh
Starting cluster.
Starting jobmanager daemon on host rancher-home.
> jps
7536 Jps
7427 JobManager
Visit the console UI
http://rancher-home:8081/#/overview
On the Slave Machine
> wget https://archive.apache.org/dist/flink/flink-1.1.3/flink-1.1.3-bin-hadoop2-scala_2.11.tgz
> sudo ln -s /home/carl/tool/flink-1.1.3 /opt/flink-1.1.3
> sudo ln -s /opt/flink-1.1.3 /opt/flink
Check the config
> cat conf/masters
rancher-home:8081
> cat conf/slaves
> cat conf/flink-conf.yaml
jobmanager.rpc.address: rancher-home
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 1024
taskmanager.numberOfTaskSlots: 2
parallelism.default: 2
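Before starting the worker, it may help to confirm that its config really points at the master. A minimal sketch that reads one key from the flat `key: value` file; `conf_value` is a hypothetical helper and it ignores commented lines.

```shell
# conf_value is a hypothetical helper: print the value of key $2 from the
# flat "key: value" file $1. Lines starting with '#' do not match because
# their first field is not exactly the key.
conf_value() {
  awk -v key="$2" -F': *' '$1 == key { print $2 }' "$1"
}

# Usage on the worker:
#   conf_value /opt/flink/conf/flink-conf.yaml jobmanager.rpc.address
```

On this setup the call in the usage comment should print the master host name taken from the config shown above.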
Start the Service
> bin/taskmanager.sh start
Starting taskmanager daemon on host rancher-worker1.
> jps
6632 TaskManager
6703 Jps
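The `jps` check can be scripted so that a start script fails early when the daemon is missing. A sketch only; `has_jvm_process` is a hypothetical helper that reads `jps` output on stdin.

```shell
# has_jvm_process is a hypothetical helper: it reads `jps` output on stdin
# and succeeds only if a JVM whose main class name equals $1 is listed.
has_jvm_process() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Usage on the worker (Flink 1.1.3 lists the class as TaskManager):
#   jps | has_jvm_process TaskManager && echo "TaskManager is up"
```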
Refresh the console UI, and we can see that the 2 TaskManagers have joined the cluster
http://rancher-home:8081/#/overview
Run a local test, it works well.
> flink run -m rancher-home:6123 ./examples/batch/WordCount.jar --input ./README.txt
Zeppelin configuration as follows:
Host: rancher-home
Port: 6123
Zeppelin 0.8.2 works well with Flink Cluster 1.1.3
References:
https://flink.apache.org/zh/usecases.html
https://flink.apache.org/
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/cluster_setup.html
http://wuchong.me/blog/2018/11/07/5-minutes-build-first-flink-application/
https://www.infoq.cn/article/zbBAGroBgtytDiBs*Xq9
Installation
https://www.cnblogs.com/frankdeng/p/9400627.html
https://juejin.im/post/5d6610c65188257573636a86