Article list
15/12/09 16:47:52 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_ ...
Running Spark on a YARN cluster is straightforward:
1. Set HADOOP_CONF_DIR
You can use the command export HADOOP_CONF_DIR=xx
or add it to spark-env.sh
2.
spark-submit --master yarn --class org.apache.spark.examples.JavaWordCount --verbose --deploy-mode client ~/spark/spark-1.4.1-bin-hadoop2.4/ ...
Yes, you can submit an app to a Spark cluster with the spark-submit command, e.g.
spark-submit --master spark://gzsw-02:7077 --class org.apache.spark.examples.JavaWordCount --verbose --deploy-mode client ~/spark/spark-1.4.1-bin-hadoop2.4/lib/spark-examples-1.4.1-hadoop2.4.0.jar spark/spark-1.4.1-bin-hadoo ...
Per-partition versions of map() and foreach(),
ref: Learning Spark
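To make the per-partition idea concrete, here is a minimal plain-Python sketch. It does not use Spark at all: partitions are modelled as a list of lists, and `SetupCounter`, `map_per_element` and `map_per_partition` are hypothetical names invented for this illustration. In Spark itself the per-partition operations are `mapPartitions()` and `foreachPartition()`. The sketch only shows why they help: expensive setup is paid once per partition instead of once per element.

```python
class SetupCounter:
    """Hypothetical stand-in for expensive setup (e.g. opening a connection)."""

    def __init__(self):
        self.calls = 0

    def open_resource(self):
        # Count every setup so the two strategies can be compared.
        self.calls += 1
        return lambda x: x * 2


def map_per_element(partitions, counter):
    # Per-element style: the resource is created again for every element.
    return [[counter.open_resource()(x) for x in part] for part in partitions]


def map_per_partition(partitions, counter):
    # Per-partition style: the resource is created once and reused
    # for every element of that partition.
    result = []
    for part in partitions:
        resource = counter.open_resource()
        result.append([resource(x) for x in part])
    return result


partitions = [[1, 2, 3], [4, 5], [6]]

per_element_counter = SetupCounter()
per_element_result = map_per_element(partitions, per_element_counter)

per_partition_counter = SetupCounter()
per_partition_result = map_per_partition(partitions, per_partition_counter)

print(per_element_counter.calls, per_partition_counter.calls)
```

Both strategies produce the same values; only the number of setup calls differs (one per element versus one per partition).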
http://www.baidu.com/p/hejuncheng1018?from=wenku
chapter 9,http://www.tuicool.com/articles/3mMz6b
chapter 11
chapter 12
chapter 14
chapter 20 (this answer is not entirely correct :( )
chapter 17
hadoop-compression
- Blog category:
- hadoop
http://blog.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/
(namely: making Hadoop support splittable LZO compression)
Very basic question about Hadoop and compressed input files
Hadoop gzip input file using only one mapper
Why can't hadoop split up a large text file and then compress t ...
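The questions linked above all come down to whether a compression codec is splittable. As a rough sketch, assuming a single input file and a fixed split size, the number of map tasks can be estimated as below. `estimate_map_tasks` is a hypothetical helper, not a Hadoop API, and it deliberately ignores details of the real split calculation (for example the slack Hadoop allows on the last split).

```python
def estimate_map_tasks(file_size_bytes, split_size_bytes, splittable):
    """Rough estimate of map tasks for one input file (illustration only)."""
    if file_size_bytes <= 0 or split_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    if not splittable:
        # A non-splittable codec such as gzip must be read from the start
        # by a single mapper, however large the file is.
        return 1
    # Integer ceiling division: one task per (possibly partial) split.
    return -(-file_size_bytes // split_size_bytes)


MIB = 1024 * 1024
one_gib = 1024 * MIB

gzip_tasks = estimate_map_tasks(one_gib, 128 * MIB, splittable=False)
lzo_indexed_tasks = estimate_map_tasks(one_gib, 128 * MIB, splittable=True)
print(gzip_tasks, lzo_indexed_tasks)
```

Under these assumptions a 1 GiB gzip file gets a single mapper, while the same data in a splittable format (such as indexed LZO) can be spread over eight.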
All figures below are from 'learning-spark',
answers for the books:
outline
required 1
required 2
https://zhidao.baidu.com/question/1671038094187205107.html
optional 2.1
optional 2-3
required 3
required 4
required 5
--
teacher's book
In ZooKeeper, under certain I/O pressure, the client will try to reconnect to the quorum. After that, the quorum peer returns a new session timeout (aka negotiatedSessionTimeout) to the client, and the client then recomputes the actual connTimeout and readTimeout from the response. The negotiatedSessionTime ...
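The recomputation described above can be sketched in a few lines of Python. This is an illustration, not ZooKeeper code: the function names are hypothetical, and the formulas are assumptions based on how the ZooKeeper Java client (`ClientCnxn`) and the server's documented defaults behave as far as I know (the server clamps the requested timeout to between 2 and 20 ticks, and the client derives its read timeout as 2/3 of the negotiated value and its connect timeout as the negotiated value divided by the number of servers). Check them against your ZooKeeper version.

```python
def negotiate_session_timeout(requested_ms, tick_time_ms=2000):
    """Assumed server-side clamping with default min (2 ticks) and max (20 ticks)."""
    min_ms = 2 * tick_time_ms
    max_ms = 20 * tick_time_ms
    return max(min_ms, min(requested_ms, max_ms))


def client_timeouts(negotiated_ms, num_servers):
    """Assumed client-side derivation; integer division as in Java."""
    conn_timeout = negotiated_ms // num_servers
    read_timeout = negotiated_ms * 2 // 3
    return conn_timeout, read_timeout


# A client asks for 90 s, but with tickTime=2000 the cap is 40 s.
negotiated = negotiate_session_timeout(90000)
conn_timeout, read_timeout = client_timeouts(negotiated, num_servers=3)
print(negotiated, conn_timeout, read_timeout)
```

The practical point is that the timeouts the client actually uses can be much shorter than the session timeout it requested.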
After a long wait (mainly spent downloading a huge number of jars), the first example from the book 'Learning Spark' finally runs.
The source code is very simple:
/**
* Illustrates flatMap + countByValue for wordcount.
*/
package com.oreilly.learningsparkexamples.scala
import org.apache. ...
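Since the Scala listing is cut off here, the following plain-Python sketch shows the same flatMap + countByValue logic without Spark. `flat_map` and `count_by_value` are hypothetical helpers written for this illustration; they only mimic what the RDD operations of the same names do on a small in-memory list, and they are not the book's code.

```python
from collections import Counter
from itertools import chain


def flat_map(func, items):
    # Apply func to each item and flatten the resulting sequences into one.
    return chain.from_iterable(func(item) for item in items)


def count_by_value(items):
    # Count how many times each distinct value occurs.
    return dict(Counter(items))


lines = ["hello spark", "hello world"]
words = flat_map(lambda line: line.split(" "), lines)
counts = count_by_value(words)
print(counts)
```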
As you know, HBase's data logs (aka the WAL) roll at certain intervals to speed up recovery of occasionally lost data. Of course, both log rolling and memstore flushes block all writes, though not reads, so rolling the logs less often can improve cluster performance.
1.case
during the h ...
Abstract: Spark can be compiled with:
maven,
sbt,
IntelliJ IDEA
ref: Spark 1.0.0 source compilation and deployment package generation
Also, if you want to load the Spark project into Eclipse, you first need to generate an Eclipse project using one of the approaches below:
1.mvn eclipse:eclipse [optional]
2. ./sbt/sbt clean compile packa ...
The flow of distributing a Scala project using sbt (the Scala Simple Build Tool) and the sbt-assembly plugin, through the steps 'create Scala project', 'download dependent jars', and 'publish Scala project'.
Scala Eclipse sbt (Simple Build Tool) application development
Below you will see that the commands 'ping' and 'telnet' select different IPs on Linux:
server1@myhost18:~$ telnet host1-26 60020
Trying 192.168.1.126...
Connected to host1-26.
Escape character is '^]'.
quit
|??)org.apache.hadoop.ipc.RPC$VersionMismatch>Server IPC version 3 cannot communica ...
These days I have been learning the data warehouse framework Hive, mainly from the ebook 'Programming Hive' [1], which covers it in detail across about 23 chapters ;)
Below is the outline for this topic:
1. overview
2. architecture
3. features
4. Hive vs Pig, Hive vs HBase
5. use cases
1.overview
...