Data Solution 2019(9)CentOS Installation
Join two DataFrames and get the count
import org.apache.spark.sql.functions._
// Count how many address rows each user has
val addressCountDF = addressesRawDF.groupBy("user_id").agg(count("user_id").as("times"))
// Left join on the user id so users without addresses are kept (their times stays null)
val userWithCountDF = usersRawDF.join(
  addressCountDF,
  usersRawDF("id") <=> addressCountDF("user_id"),
  "left"
)
userWithCountDF.select("id", "times").filter("times > 0").show(100)
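Since the left join leaves times as null for users that have no addresses, a small variant (a sketch reusing the same usersRawDF and addressCountDF as above, not from the original snippet) fills those nulls with 0 so every user reports a count:
// Sketch: keep all users and default missing counts to 0
val allUserCountsDF = usersRawDF
  .join(addressCountDF, usersRawDF("id") <=> addressCountDF("user_id"), "left")
  .na.fill(0, Seq("times"))
allUserCountsDF.select("id", "times").show(100)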
Adjust to the latest versions on Ubuntu
Prepare the installation packages:
wget http://apache-mirror.8birdsvideo.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz -P install/
wget http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz -P install/
wget http://www.gtlib.gatech.edu/pub/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz -P install/
On the new Ubuntu machine, install make and Docker, then let the current user run docker without sudo:
> sudo apt install make
> sudo snap install docker
> sudo groupadd docker
> sudo gpasswd -a $USER docker
> sudo usermod -aG docker $USER
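The group change only applies to new login sessions. Either log out and back in, or start a shell with the new group, then verify docker runs without sudo:
> newgrp docker
> docker run hello-world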
Here is the Dockerfile
#Run a Hadoop/Spark/Zeppelin server side
#Prepare the OS
FROM ubuntu:16.04
LABEL maintainer="Yiyi Kang <yiyikangrachel@gmail.com>"
ENV DEBIAN_FRONTEND noninteractive
ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8
RUN apt-get -qq update
RUN apt-get -qqy dist-upgrade
#Prepare the dependencies
RUN apt-get install -qy wget unzip vim
RUN apt-get install -qy iputils-ping
#Install OpenJDK 8
RUN apt-get update && \
apt-get install -y --no-install-recommends locales && \
locale-gen en_US.UTF-8 && \
apt-get dist-upgrade -y && \
apt-get install -qy openjdk-8-jdk
#Prepare for hadoop and spark
RUN apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN ssh-keygen -q -t rsa -N '' -f /root/.ssh/id_rsa
RUN cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
#Prepare python env
RUN apt-get install -y git
RUN apt-get install -y build-essential zlib1g-dev libbz2-dev
RUN apt-get install -y libreadline6 libreadline6-dev sqlite3 libsqlite3-dev
RUN apt-get update --fix-missing
RUN apt-get install -y libssl-dev
RUN apt-get install -y software-properties-common vim
RUN add-apt-repository ppa:jonathonf/python-3.6
RUN apt-get update
RUN apt-get install -y build-essential python3.6 python3.6-dev python3-pip python3.6-venv
RUN ln -s /usr/bin/python3 /usr/bin/python
RUN ln -s /usr/bin/pip3 /usr/bin/pip
#pandas
RUN pip install pandas
RUN pip install -U pandasql
RUN mkdir /tool/
WORKDIR /tool/
#add the software hadoop
ADD install/hadoop-3.2.1.tar.gz /tool/
RUN ln -s /tool/hadoop-3.2.1 /tool/hadoop
ADD conf/core-site.xml /tool/hadoop/etc/hadoop/
ADD conf/hdfs-site.xml /tool/hadoop/etc/hadoop/
ADD conf/hadoop-env.sh /tool/hadoop/etc/hadoop/
#add the software spark
ADD install/spark-2.4.4-bin-hadoop2.7.tgz /tool/
RUN ln -s /tool/spark-2.4.4-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/
#add the software zeppelin
ADD install/zeppelin-0.8.2-bin-all.tgz /tool/
RUN ln -s /tool/zeppelin-0.8.2-bin-all /tool/zeppelin
#set up the app
EXPOSE 9000 9870 8080 4040
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD [ "./start.sh" ]
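To use the Dockerfile, build and run the image. The image name spark-env and the detached port mappings below are illustrative choices, not from the original setup:
> docker build -t spark-env .
> docker run -d -p 9000:9000 -p 9870:9870 -p 8080:8080 -p 4040:4040 spark-env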
Try to Set Up on the Host Machine on CentOS 7
JAVA on this machine is managed through jenv.
I need JAVA ENVs for JDK 8, 11, and 12.
> sudo yum install git
> git clone https://github.com/gcuisinier/jenv.git ~/.jenv
> echo 'export PATH="$HOME/.jenv/bin:$PATH"' >> ~/.bash_profile
> echo 'eval "$(jenv init -)"' >> ~/.bash_profile
> . ~/.bash_profile
Check version
> jenv --version
jenv 0.5.2-12-gdcbfd48
Download JDK 8, 11, and 12 from the official website
https://www.oracle.com/technetwork/java/javase/downloads/jdk12-downloads-5295953.html
jdk-11.0.4_linux-x64_bin.tar.gz
Unzip each archive, place it in the working directory, and link it into the /opt directory:
> tar zxvf jdk-11.0.4_linux-x64_bin.tar.gz
> mv jdk-11.0.4 ~/tool/
> sudo ln -s /home/redis/tool/jdk-11.0.4 /opt/jdk-11.0.4
Add to JENV
> jenv add /opt/jdk-11.0.4
Check the installed versions
> jenv versions
* system (set by /home/redis/.jenv/version)
11
11.0
11.0.4
oracle64-11.0.4
Try to set global to 11
> jenv global 11.0
> java -version
java version "11.0.4" 2019-07-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.4+10-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.4+10-LTS, mixed mode)
Prepare HADOOP
> wget http://apache-mirror.8birdsvideo.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz
> tar zxvf hadoop-3.2.1.tar.gz
> mv hadoop-3.2.1 ~/tool/
> sudo ln -s /home/carl/tool/hadoop-3.2.1 /opt/hadoop-3.2.1
> sudo ln -s /opt/hadoop-3.2.1 /opt/hadoop
Core site configuration
> vi etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
</configuration>
HDFS site configuration
> vi etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
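A hedged aside: in Hadoop 3.x this switch is officially named dfs.permissions.enabled (dfs.permissions is the legacy key), so if permission checks still apply, this variant of the property should take effect:
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>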
Shell Command ENV
Check JAVA_HOME before we configure the file
> jenv doctor
[OK] No JAVA_HOME set
[OK] Java binaries in path are jenv shims
[OK] Jenv is correctly loaded
> jenv enable-plugin export
Restart the shell session, then verify
> jenv global 11.0
> java -version
java version "11.0.4" 2019-07-16 LTS
> echo $JAVA_HOME
/home/carl/.jenv/versions/11.0
> vi etc/hadoop/hadoop-env.sh
export JAVA_HOME="/home/carl/.jenv/versions/11.0"
SSH to localhost prompts for a password
> ssh localhost
Generate the key pair
> ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa
> cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Now SSH to localhost succeeds
> ssh localhost
Last login: Thu Oct 24 16:12:36 2019 from localhost
Start HDFS
> cd /opt/hadoop
> bin/hdfs namenode -format
> sbin/start-dfs.sh
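Before opening the UI, two standard checks confirm the daemons are up. jps ships with the JDK and dfsadmin with Hadoop; both should show the NameNode and DataNode:
> jps
> bin/hdfs dfsadmin -report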
Visit the web UI
http://rancher-worker1:9870/dfshealth.html#tab-overview
Exceptions:
Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error
Find this in the logs
> grep "Error" ./*
./hadoop-carl-namenode-rancher-worker1.log:2019-10-24 16:15:59,844 WARN org.eclipse.jetty.servlet.ServletHandler: Error for /webhdfs/v1/
./hadoop-carl-namenode-rancher-worker1.log:java.lang.NoClassDefFoundError: javax/activation/DataSource
./hadoop-carl-namenode-rancher-worker1.log: at com.sun.jersey.spi.inject.Errors.processWithErrors(Errors.java:193)
Attempted solution, from these pages:
https://github.com/highsource/jsonix-schema-compiler/issues/81
https://salmanzg.wordpress.com/2018/02/20/webhdfs-on-hadoop-3-with-java-9/
> vi etc/hadoop/hadoop-env.sh
export HADOOP_OPTS="--add-modules java.activation"
This does not work at all. According to this page
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions
Hadoop currently only supports JDK 8, so I need to switch back to JDK 8 instead.
> jenv global 1.8
> java -version
java version "1.8.0_221"
> echo $JAVA_HOME
/home/carl/.jenv/versions/1.8
Prepare Spark
> wget http://apache.mirrors.ionfish.org/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
> tar zxvf spark-2.4.4-bin-hadoop2.7.tgz
> mv spark-2.4.4-bin-hadoop2.7 ~/tool/spark-2.4.4
> sudo ln -s /home/carl/tool/spark-2.4.4 /opt/spark-2.4.4
> sudo ln -s /opt/spark-2.4.4 /opt/spark
> cd /opt/spark
> cp conf/spark-env.sh.template conf/spark-env.sh
> vi conf/spark-env.sh
HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
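A quick smoke test checks that spark-shell can reach HDFS through that configuration; the HDFS path here is just an example file you would have uploaded yourself:
> bin/spark-shell
scala> spark.read.textFile("hdfs://localhost:9000/tmp/sample.txt").count()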
Prepare the PYTHON 3 ENV
Since we want to migrate everything to Python 3, install and prepare it.
Install pyenv from the latest GitHub source:
> git clone https://github.com/pyenv/pyenv.git ~/.pyenv
Add to the PATH
> vi ~/.bash_profile
PATH=$PATH:$HOME/.pyenv/bin
eval "$(pyenv init -)"
> . ~/.bash_profile
Check installation
> pyenv -v
pyenv 1.2.14-8-g0e7cfc3
Check the available versions; the latest are 3.7.5 and 3.8.0, so install those two:
https://www.python.org/downloads/
Some warnings during pyenv install point to missing dependencies:
WARNING: The Python bz2 extension was not compiled. Missing the bzip2 lib?
WARNING: The Python readline extension was not compiled. Missing the GNU readline lib?
WARNING: The Python sqlite3 extension was not compiled. Missing the SQLite3 lib?
> sudo yum install bzip2-devel
> sudo yum install sqlite-devel
> sudo yum install readline-devel
> pyenv install 3.8.0
> pyenv install 3.7.5
> pyenv versions
* system (set by /home/carl/.pyenv/version)
3.7.5
3.8.0
> pyenv global 3.8.0
> python -V
Python 3.8.0
More Python Libraries
> pip install --upgrade pip
> pip install pandas
> pip install -U pandasql
Failed with
ModuleNotFoundError: No module named '_ctypes'
Solution:
> sudo yum install libffi-devel
That solves the problem. Since _ctypes is compiled when Python itself is built, you may also need to reinstall the pyenv version (pyenv install 3.8.0) after adding libffi-devel; then pandas and pandasql install successfully:
> pip install pandas
> pip install -U pandasql
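A one-line import check (just a sanity sketch) confirms both libraries load under the pyenv interpreter:
> python -c "import pandas, pandasql; print('ok')"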
Prepare Zeppelin
> wget http://www.gtlib.gatech.edu/pub/apache/zeppelin/zeppelin-0.8.2/zeppelin-0.8.2-bin-all.tgz
> tar zxvf zeppelin-0.8.2-bin-all.tgz
> mv zeppelin-0.8.2-bin-all ~/tool/zeppelin-0.8.2
> sudo ln -s /home/carl/tool/zeppelin-0.8.2 /opt/zeppelin-0.8.2
> sudo ln -s /opt/zeppelin-0.8.2 /opt/zeppelin
Copy the default configuration templates for Zeppelin:
> cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml
> cp conf/shiro.ini.template conf/shiro.ini
> cp conf/zeppelin-env.sh.template conf/zeppelin-env.sh
Add users to the auth config
> vi conf/shiro.ini
[users]
carl = pass123, admin
kiko = pass123, admin
Site configuration
> vi conf/zeppelin-site.xml
<property>
<name>zeppelin.server.addr</name>
<value>0.0.0.0</value>
<description>Server binding address</description>
</property>
<property>
<name>zeppelin.anonymous.allowed</name>
<value>false</value>
<description>Anonymous user allowed by default</description>
</property>
ENV configuration
> vi conf/zeppelin-env.sh
export SPARK_HOME="/opt/spark"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop/"
Start the Service
> bin/zeppelin.sh
> sudo bin/zeppelin-daemon.sh stop
> sudo bin/zeppelin-daemon.sh start
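The daemon script also has a status subcommand to confirm the service is running:
> bin/zeppelin-daemon.sh status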
Visit these UIs:
http://rancher-worker1:9870/explorer.html#/
http://rancher-worker1:8080/#/
References:
Docker Permission
https://askubuntu.com/questions/477551/how-can-i-use-docker-without-sudo
JAVA_HOME
https://www.jenv.be/
https://github.com/jenv/jenv
https://stackoverflow.com/questions/28615671/set-java-home-to-reflect-jenv-java-version