Data Solution 2019(9)CentOS Installation
Try to Set Up on Host Machine on CentOS 7
Java on this machine is managed through jenv
I need JDK 8, 11, and 12 available
> sudo yum install git
> git clone https://github.com/gcuisinier/jenv.git ~/.jenv
> echo 'export PATH="$HOME/.jenv/bin:$PATH"' >> ~/.bash_profile
> echo 'eval "$(jenv init -)"' >> ~/.bash_profile
> . ~/.bash_profile
Check version
> jenv --version
jenv 0.5.2-12-gdcbfd48
Download JDK 8, 11, and 12 from the official website
Unpack each archive into the working directory and link it into /opt
> tar zxvf jdk-11.0.4_linux-x64_bin.tar.gz
> mv jdk-11.0.4 ~/tool/
> sudo ln -s /home/redis/tool/jdk-11.0.4 /opt/jdk-11.0.4
Add to JENV
> jenv add /opt/jdk-11.0.4
Check the installed versions
> jenv versions
* system (set by /home/redis/.jenv/version)
Try to set global to 11
> jenv global 11.0
> java -version
java version "11.0.4" 2019-07-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.4+10-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.4+10-LTS, mixed mode)
Prepare HADOOP
> wget http://apache-mirror.8birdsvideo.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
> tar zxvf hadoop-3.2.1.tar.gz
> mv hadoop-3.2.1 ~/tool/
> sudo ln -s /home/carl/tool/hadoop-3.2.1 /opt/hadoop-3.2.1
> sudo ln -s /opt/hadoop-3.2.1 /opt/hadoop
Site Configuration
> vi etc/hadoop/core-site.xml
HDFS site configuration
> vi etc/hadoop/hdfs-site.xml
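The contents of these two files did not survive in this post. A minimal sketch for a single-node setup could look like the following. The values hdfs://localhost:9000 and a replication factor of 1 are assumptions taken from the Hadoop single-node guide, not necessarily what I used here, and CONF_DIR is a hypothetical variable that defaults to a scratch directory so nothing real gets overwritten.

```shell
#!/usr/bin/env bash
# Sketch: write minimal single-node HDFS configs.
# Assumption: hdfs://localhost:9000 and replication=1 come from the Hadoop
# single-node guide; adjust to your own host and port.
# CONF_DIR (hypothetical) defaults to a scratch directory; copy the files to
# /opt/hadoop/etc/hadoop only after reviewing them.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

echo "Wrote core-site.xml and hdfs-site.xml to $CONF_DIR"
```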
Shell Command ENV
Check JAVA_HOME before we configure the file
> jenv doctor
[OK] Java binaries in path are jenv shims
[OK] Jenv is correctly loaded
> jenv enable-plugin export
Restart the shell session so the export plugin takes effect
> jenv global 11.0
> java -version
java version "11.0.4" 2019-07-16 LTS
> echo $JAVA_HOME
> vi etc/hadoop/hadoop-env.sh
export JAVA_HOME="/home/carl/.jenv/versions/11.0"
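Before starting anything, it may help to confirm that JAVA_HOME really points at a JDK. A small sketch, where is_java_home is a hypothetical helper name and the only check is that bin/java exists and is executable:

```shell
# Hypothetical helper: succeed only if the given directory looks like a JDK home,
# meaning it contains an executable bin/java.
is_java_home() {
  [ -n "$1" ] && [ -x "$1/bin/java" ]
}

if is_java_home "${JAVA_HOME:-}"; then
  echo "JAVA_HOME looks valid: $JAVA_HOME"
else
  echo "JAVA_HOME is unset or has no bin/java: ${JAVA_HOME:-<unset>}"
fi
```

The same check can be pointed at the path written into hadoop-env.sh, for example is_java_home /home/carl/.jenv/versions/11.0.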
SSH to localhost; it prompts for a password
> ssh localhost
Generate the key pair
> ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa
> cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
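If the password prompt still shows up after adding the key, file permissions are a common cause, since sshd with its default StrictModes setting ignores keys in files that are too open. A sketch of the fix, assuming the usual 700/600 modes; fix_ssh_perms is a hypothetical helper name:

```shell
# Hypothetical helper: tighten permissions on an SSH directory so that sshd
# will accept its authorized_keys file.
# The body runs in a subshell, so the variable stays local to the function.
fix_ssh_perms() (
  dir="${1:-$HOME/.ssh}"
  mkdir -p "$dir"
  touch "$dir/authorized_keys"
  chmod 700 "$dir"
  chmod 600 "$dir/authorized_keys"
)

# Usage (not executed here): with no argument it operates on ~/.ssh
#   fix_ssh_perms
```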
SSH to localhost successful
> ssh localhost
Last login: Thu Oct 24 16:12:36 2019 from localhost
Start HDFS
> cd /opt/hadoop
> bin/hdfs namenode -format
> sbin/start-dfs.sh
Visit the web UI
Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error
Found this in the logs
> grep "Error" ./*
./hadoop-carl-namenode-rancher-worker1.log:2019-10-24 16:15:59,844 WARN org.eclipse.jetty.servlet.ServletHandler: Error for /webhdfs/v1/
./hadoop-carl-namenode-rancher-worker1.log:java.lang.NoClassDefFoundError: javax/activation/DataSource
./hadoop-carl-namenode-rancher-worker1.log: at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
> vi etc/hadoop/hadoop-env.sh
export HADOOP_OPTS="--add-modules java.activation"
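This does not work at all: the java.activation module was removed in JDK 11, so there is nothing for --add-modules to load.
According to this web page, Hadoop currently supports only JDK 8, so I need to switch to JDK 8 instead.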
> jenv global 1.8
> java -version
java version "1.8.0_221"
> echo $JAVA_HOME
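Since the JDK version matters this much, a guard like the one below could run before starting HDFS. It is only a sketch: java_major is a hypothetical helper, and the version string is hard-coded to the one shown above instead of being read from java -version.

```shell
# Hypothetical helper: reduce a Java version string to its major number.
# Legacy "1.x" strings (JDK 8 and older) use the second field instead.
# The body runs in a subshell, so the temporary variable does not leak.
java_major() (
  case "$1" in
    1.*) v="${1#1.}"; echo "${v%%.*}" ;;
    *)   echo "${1%%.*}" ;;
  esac
)

ver="1.8.0_221"   # the version reported in this post after switching
if [ "$(java_major "$ver")" -eq 8 ]; then
  echo "JDK 8 detected, fine for this Hadoop setup"
else
  echo "JDK $(java_major "$ver") detected, expect the webhdfs error"
fi
```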
Prepare Spark
> wget http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
> tar zxvf spark-2.4.4-bin-hadoop2.7.tgz
> mv spark-2.4.4-bin-hadoop2.7 ~/tool/spark-2.4.4
> sudo ln -s /home/carl/tool/spark-2.4.4 /opt/spark-2.4.4
> sudo ln -s /opt/spark-2.4.4 /opt/spark
> cd /opt/spark
> cp conf/spark-env.sh.template conf/spark-env.sh
> vi conf/spark-env.sh
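What I put into spark-env.sh is not shown here. As a sketch only, these are two settings that typically go there: HADOOP_CONF_DIR and PYSPARK_PYTHON are real Spark variables, but the values are assumptions based on the paths used in this post, and SPARK_ENV_FILE is a hypothetical variable that defaults to a scratch file.

```shell
#!/usr/bin/env bash
# Sketch: generate a minimal spark-env.sh.
# Assumption: the values below match the /opt/hadoop layout from this post;
# they are not the author's confirmed settings.
# SPARK_ENV_FILE (hypothetical) defaults to a scratch file; merge the lines
# into /opt/spark/conf/spark-env.sh by hand after reviewing them.
SPARK_ENV_FILE="${SPARK_ENV_FILE:-$(mktemp)}"

cat > "$SPARK_ENV_FILE" <<'EOF'
# Let Spark find the HDFS configuration
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop"
# Python interpreter used by PySpark
export PYSPARK_PYTHON="python3"
EOF

echo "Wrote $SPARK_ENV_FILE"
```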
Prepare PYTHON 3.7 ENV
Since we want to migrate everything to Python 3, install and prepare Python 3
Install pyenv from the latest source on GitHub
> git clone https://github.com/pyenv/pyenv.git ~/.pyenv
Add to the PATH
> vi ~/.bash_profile
eval "$(pyenv init -)"
> . ~/.bash_profile
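Only the eval line is shown above. For a git-clone install the pyenv README of that time also sets PYENV_ROOT and PATH, so the full set of lines probably looked like the sketch below. That is an assumption, as are the helper name append_once and the PROFILE_FILE variable, which defaults to a scratch file.

```shell
# Hypothetical helper: append a line to a file only if that exact line is not
# already present, so re-running the setup does not duplicate entries.
append_once() (
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# PROFILE_FILE (hypothetical) defaults to a scratch file; set it to
# ~/.bash_profile to apply the lines for real.
PROFILE_FILE="${PROFILE_FILE:-$(mktemp)}"

append_once 'export PYENV_ROOT="$HOME/.pyenv"' "$PROFILE_FILE"
append_once 'export PATH="$PYENV_ROOT/bin:$PATH"' "$PROFILE_FILE"
append_once 'eval "$(pyenv init -)"' "$PROFILE_FILE"
```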
Check installation
> pyenv -v
pyenv 1.2.14-8-g0e7cfc3
Check all the available versions; the latest are 3.7.5 and 3.8.0, so install those
The build prints some warnings that point to missing dependencies
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
WARNING: The Python sqlite3 extension was not compiled. Missing the SQLite3 lib?
> sudo yum install bzip2-devel
> sudo yum install sqlite-devel
> sudo yum install readline-devel
> pyenv install 3.8.0
> pyenv install 3.7.5
> pyenv versions
* system (set by /home/carl/.pyenv/version)
> pyenv global 3.8.0
> python -V
Python 3.8.0
More Python Libraries
> pip install --upgrade pip
> pip install pandas
> pip install -U pandasql
Failed with
ModuleNotFoundError: No module named '_ctypes'
> sudo yum install libffi-devel
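That solves the problem. If the error persists, rebuild the interpreter with pyenv install, since _ctypes is compiled when Python is built. Then install pandas and pandasql again.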
> pip install pandas
> pip install -U pandasql
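The build warnings and the pip failure each map to one CentOS -devel package. The lookup below just collects the pairs from this post in one place; devel_pkg_for is a hypothetical helper name.

```shell
# Hypothetical helper: map a missing Python extension to the CentOS package
# that this post installed to fix it. Returns non-zero for unknown names.
devel_pkg_for() {
  case "$1" in
    bz2)      echo bzip2-devel ;;
    readline) echo readline-devel ;;
    sqlite3)  echo sqlite-devel ;;
    _ctypes)  echo libffi-devel ;;
    *)        return 1 ;;
  esac
}

for ext in bz2 readline sqlite3 _ctypes; do
  echo "$ext -> $(devel_pkg_for "$ext")"
done
```

Installing all four packages before the first pyenv install avoids having to rebuild Python later.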
Prepare Zeppelin
> wget http://www.gtlib.gatech.edu/pub/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz
> tar zxvf zeppelin-0.8.2-bin-all.tgz
> mv zeppelin-0.8.2-bin-all ~/tool/zeppelin-0.8.2
> sudo ln -s /home/carl/tool/zeppelin-0.8.2 /opt/zeppelin-0.8.2
> sudo ln -s /opt/zeppelin-0.8.2 /opt/zeppelin
Some Configuration for Zeppelin
> cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml
> cp conf/shiro.ini.template conf/shiro.ini
> cp conf/zeppelin-env.sh.template conf/zeppelin-env.sh
Add user to the auth config
> vi conf/shiro.ini
carl = pass123, admin
kiko = pass123, admin
Site configuration
> vi conf/zeppelin-site.xml
<description>Server binding address</description>
<description>Anonymous user allowed by default</description>
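Only the description lines of the two properties survived above. They most likely belong to zeppelin.server.addr and zeppelin.anonymous.allowed; the sketch below shows what the pair could look like. The values 0.0.0.0 and false are assumptions (listen on all interfaces, force a login), and SITE_FILE is a hypothetical variable that defaults to a scratch file.

```shell
#!/usr/bin/env bash
# Sketch: the two zeppelin-site.xml properties touched in this step.
# Assumption: 0.0.0.0 and false are guesses at the intended values, chosen to
# fit the shiro.ini users configured in this post.
# SITE_FILE (hypothetical) defaults to a scratch file; merge the properties
# into conf/zeppelin-site.xml by hand.
SITE_FILE="${SITE_FILE:-$(mktemp)}"

cat > "$SITE_FILE" <<'EOF'
<property>
  <name>zeppelin.server.addr</name>
  <value>0.0.0.0</value>
  <description>Server binding address</description>
</property>
<property>
  <name>zeppelin.anonymous.allowed</name>
  <value>false</value>
  <description>Anonymous user allowed by default</description>
</property>
EOF

echo "Wrote property fragment to $SITE_FILE"
```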
ENV configuration
> vi conf/zeppelin-env.sh
export SPARK_HOME="/opt/spark"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop/"
Start the Service
> bin/zeppelin.sh
> sudo bin/zeppelin-daemon.sh stop
> sudo bin/zeppelin-daemon.sh start
Visit these UIs
Docker Permission
