Data Solution 2019(13)Docker Zeppelin Notebook and Memory Configuration
On my Mac, I ran into this error when building my Docker image:
Disk Requirements:
At least 187MB more space needed on the / filesystem.
I checked my disk space, and the Mac itself has plenty free. The likely cause is that I have built too many Docker images on this machine, so here are the commands to clean them up.
Remove all the containers
> docker rm $(docker ps -qa)
Remove all the images
> docker rmi $(docker image ls -qa)
Memory and Cores Settings
Partitions: the large data set is split into partitions
Task: runs in one single executor; all tasks can run in parallel
Executor: a JVM on a worker node; one node can run multiple executors
Cores:
Cluster Manager:
Driver: the SparkContext connects to the cluster manager (Standalone)
Cluster Manager: manages all resources, such as executors
Spark acquires the executors and sends our packages/code to all of them
SparkContext sends all tasks to the executors
Core: number of tasks that run in parallel per executor, e.g. 5
Executors: number of executors = CPU cores / 5
Memory: memory per executor = Memory / Executors
Executor Total Memory = ExecutorMemory + MemoryOverhead
MemoryOverhead = max(384M, 0.07 x spark.executor.memory)
(0.07 is the factor from older Spark releases; the Spark 2.x documentation lists the default as 0.10.)
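The sizing rules above can be sketched as a small Python calculation. This is only an illustration: the function names and the 16-core / 64 GB node in the usage lines are assumptions, not values from this setup. The 0.07 factor and the 384M floor are taken from the notes above; pass a different factor if your Spark version documents another default.

```python
def memory_overhead_mb(executor_memory_mb, factor=0.07, floor_mb=384):
    """MemoryOverhead = max(384M, factor x executor memory), in MB."""
    return max(floor_mb, int(executor_memory_mb * factor))


def plan_executors(node_cores, node_memory_mb, cores_per_executor=5):
    """Apply the rules of thumb: executors = cores / 5, memory = memory / executors.

    Returns (number of executors, total memory budget per executor in MB).
    """
    executors = node_cores // cores_per_executor
    if executors < 1:
        raise ValueError("not enough cores for a single executor")
    return executors, node_memory_mb // executors


def executor_total_mb(executor_memory_mb, factor=0.07):
    """Executor Total Memory = ExecutorMemory + MemoryOverhead."""
    return executor_memory_mb + memory_overhead_mb(executor_memory_mb, factor)


if __name__ == "__main__":
    # Hypothetical node: 16 cores, 64 GB of memory.
    executors, budget_mb = plan_executors(16, 64 * 1024)
    print("executors:", executors, "budget per executor (MB):", budget_mb)
    # A 2g executor is small enough that the 384M floor applies.
    print("total for a 2g executor (MB):", executor_total_mb(2048))
```

For small executors such as the 2g workers used later in this post, the fixed 384M floor is larger than the percentage, so the overhead is a noticeable share of the total.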
Finally, I got it working with Zeppelin Notebook, a Spark master, and Spark slaves. For example:
192.168.56.110 rancher-home Zeppelin Notebook, Spark Master
192.168.56.111 rancher-worker1 Spark Slave
192.168.56.112 rancher-worker2 Spark Slave
Spark Master on rancher-home
Dockerfile including R and Python ENV
#Set up spark master in Docker
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#install java
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#R libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8088 7077
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
Makefile that accepts the memory and hostname as parameters
HOSTNAME=rancher-home
MEMORY=2g
IMAGE=sillycat/public
TAG=sillycat-sparkmaster-1.0
NAME=sillycat-sparkmaster-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d \
	  -e "SPARK_LOCAL_HOSTNAME=$(HOSTNAME)" \
	  -e "SPARK_IDENT_STRING=$(HOSTNAME)" \
	  -e "SPARK_PUBLIC_DNS=$(HOSTNAME)" \
	  -e "SPARK_DAEMON_MEMORY=$(MEMORY)" \
	  --network host \
	  --name $(NAME) $(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}:${TAG}
start.sh to start the Spark master
#!/bin/sh -ex
#prepare ENV
#start the service
cd /tool/spark
sbin/start-master.sh
Settings in conf/spark-env.sh to support the port number
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=8088
SPARK_NO_DAEMONIZE=true
I use this command to start the container
>make run HOSTNAME=rancher-home MEMORY=1g
Zeppelin on the rancher-home machine
The Dockerfile contains all the libraries and software
#Set up Zeppelin Notebook
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#java
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software zeppelin
RUN wget --no-verbose http://www.gtlib.gatech.edu/pub/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz
RUN tar -xvzf zeppelin-0.8.2-bin-all.tgz
RUN ln -s /tool/zeppelin-0.8.2-bin-all /tool/zeppelin
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#R libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8080 4040
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
Makefile to start the container with host network
IMAGE=sillycat/public
TAG=sillycat-zeppelinbook-1.0
NAME=sillycat-zeppelinbook-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d --privileged=true \
	  -v $(shell pwd)/zeppelin/notebook:/tool/zeppelin/notebook \
	  -v $(shell pwd)/zeppelin/conf:/tool/zeppelin/conf \
	  --network host \
	  --name $(NAME) \
	  $(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}:${TAG}
The start.sh script
#!/bin/sh -ex
#start the service
cd /tool/zeppelin
bin/zeppelin.sh
Settings in zeppelin/conf/zeppelin-env.sh
export SPARK_HOME=/tool/spark
export MASTER=spark://rancher-home:7077
One very important thing: how to add dependencies
In the interpreter settings, add the dependency under Dependencies
Artifact: mysql:mysql-connector-java:5.1.47
That only covers the driver and the notebook. To make it work on all the slaves, we also need to add this property:
spark.jars.packages: mysql:mysql-connector-java:5.1.47
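Spark expects the spark.jars.packages value to be a comma-separated list of Maven coordinates in groupId:artifactId:version form. When more than one artifact is needed, a small helper can build and sanity-check that value before it is pasted into the interpreter settings. This is a minimal sketch: the function name jars_packages and the second coordinate in the usage lines are hypothetical, and the check is a simple shape test, not a lookup against a Maven repository.

```python
import re

# Three non-empty parts separated by colons, with no spaces or commas inside.
_COORDINATE = re.compile(r"^[^:\s,]+:[^:\s,]+:[^:\s,]+$")


def jars_packages(coordinates):
    """Join Maven coordinates into a value for spark.jars.packages."""
    for coordinate in coordinates:
        if not _COORDINATE.match(coordinate):
            raise ValueError(
                "expected groupId:artifactId:version, got %r" % (coordinate,)
            )
    return ",".join(coordinates)


if __name__ == "__main__":
    # The MySQL connector is the artifact used in this post.
    print(jars_packages(["mysql:mysql-connector-java:5.1.47"]))
    # com.example:demo-lib:1.0.0 is a made-up placeholder coordinate.
    print(jars_packages([
        "mysql:mysql-connector-java:5.1.47",
        "com.example:demo-lib:1.0.0",
    ]))
```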
The Spark slave is set up much like the master
Dockerfile
#Set up spark slave in Docker
#Prepare the OS
FROM centos:7
MAINTAINER Yiyi Kang <yiyikangrachel@gmail.com>
RUN yum -y update
RUN yum install -y wget
#install jdk
RUN yum -y install java-1.8.0-openjdk.x86_64
RUN echo 'export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk' | tee -a /etc/profile
#prepare python
RUN yum groupinstall -y "Development tools"
RUN yum -y install git freetype-devel openssl-devel libffi-devel
RUN git clone https://github.com/pyenv/pyenv.git ~/.pyenv
ENV HOME /root
ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH
RUN pyenv install 3.7.5
RUN pyenv global 3.7.5
#prepare R
RUN yum install -y epel-release
RUN yum install -y R
RUN mkdir /tool/
WORKDIR /tool/
#add the software spark
RUN wget --no-verbose http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
RUN tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/
#python libraries
RUN pip install --upgrade pip
RUN pip install pandas
RUN pip install -U pandasql
RUN pip install matplotlib
#r libraries
RUN R -e "install.packages('data.table',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('knitr',dependencies=TRUE, repos='http://cran.rstudio.com/')"
RUN R -e "install.packages('googleVis',dependencies=TRUE, repos='http://cran.rstudio.com/')"
#set up the app
EXPOSE 8188 7177
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
The Makefile needs the master machine's hostname so the slave can connect to it
HOSTNAME=rancher-worker1
MASTER=rancher-home
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=2g
IMAGE=sillycat/public
TAG=sillycat-sparkslave-1.0
NAME=sillycat-sparkslave-1.0
docker-context:
build: docker-context
	docker build -t $(IMAGE):$(TAG) .
run:
	docker run -d \
	  -e "SPARK_PUBLIC_DNS=$(HOSTNAME)" \
	  -e "SPARK_LOCAL_HOSTNAME=$(HOSTNAME)" \
	  -e "SPARK_IDENT_STRING=$(HOSTNAME)" \
	  -e "SPARK_MASTER=$(MASTER)" \
	  -e "SPARK_WORKER_CORES=$(SPARK_WORKER_CORES)" \
	  -e "SPARK_WORKER_MEMORY=$(SPARK_WORKER_MEMORY)" \
	  --name $(NAME) \
	  --network host \
	  $(IMAGE):$(TAG)
clean:
	docker stop ${NAME}
	docker rm ${NAME}
logs:
	docker logs ${NAME}
publish:
	docker push ${IMAGE}:${TAG}
The start.sh shell script
#!/bin/sh -ex
#start the service
cd /tool/spark
sbin/start-slave.sh spark://${SPARK_MASTER}:7077
Settings in conf/spark-env.sh
SPARK_WORKER_PORT=7177
SPARK_WORKER_WEBUI_PORT=8188
SPARK_IDENT_STRING=rancher-worker1
SPARK_NO_DAEMONIZE=true
The command to start the slave looks like this
>make run MASTER=rancher-home HOSTNAME=rancher-worker1 SPARK_WORKER_CORES=2 SPARK_WORKER_MEMORY=2g
References:
https://stackoverflow.com/questions/38820979/docker-image-error-downloading-package
Memory
https://www.jianshu.com/p/a8b61f14309f
https://blog.51cto.com/10120275/2364992
https://taoistwar.gitbooks.io/spark-operationand-maintenance-management/content/spark_install/spark_standalone_configuration.html
Zeppelin Login Issue
https://stackoverflow.com/questions/46685400/login-to-zeppelin-issues-with-docker
Zeppelin Dependencies Issue
http://zeppelin.apache.org/docs/0.8.2/interpreter/spark.html#dependencyloading