Running Solr on HDFS

Blog category:

Solr

https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS

2014-11-27 17:25
Views: 537
Comments (0)
Category: Open source software

How does LinkedIn's recommendation system work?

Blog category:

Design

http://www.quora.com/How-does-LinkedIns-recommendation-system-work
I gave this talk earlier this week at Hadoop World (http://www.hadoopworld.com/sessi...), a conference that evangelizes Hadoop by highlighting how people across the industry are solving big business challenges by lever ...

2014-11-26 10:19
Views: 903
Comments (0)
Category: Open source software

Kafka: High Quality Posts

Blog category:

Kafka

http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying
The Log: What every software engineer should know about real-time data's unifying abstraction
Jay Kreps, Principal Staff Engineer
Posted on 12/16/2013 ...

2014-11-26 09:43
Views: 1003
Comments (1)
Category: Open source software

CentOS: Install Maven, Ant, Tomcat, MySQL, Python, g++, Node.js

Blog category:

operatingsystem

Install Ant
# wget http://apache.tradebit.com/pub//ant/binaries/apache-ant-1.9.4-bin.zip
# unzip apache-ant-1.9.4-bin.zip
# mv apache-ant-1.9.4/ /opt/ant
# ln -s /opt/ant/bin/ant /usr/bin/ant
# vi /etc/profile.d/ant.sh
#!/bin/bash
ANT_HOME=/opt/ant
PATH=$ANT_HOME/bin:$PATH
export PATH ANT_HOME ex ...
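To try the profile script without touching /etc/profile.d, the visible part of it can be written to a scratch directory and sourced by hand. This is only a sketch: PROFILE_DIR is a hypothetical temporary location, and the script body covers just the lines shown in the excerpt, not the truncated remainder.

```shell
#!/bin/bash
# Sketch only: PROFILE_DIR is a hypothetical scratch directory that stands in
# for /etc/profile.d, so no root access is needed.
PROFILE_DIR="$(mktemp -d)"

# Quoted heredoc delimiter keeps $ANT_HOME and $PATH unexpanded in the file.
cat > "$PROFILE_DIR/ant.sh" <<'EOF'
ANT_HOME=/opt/ant
PATH=$ANT_HOME/bin:$PATH
export PATH ANT_HOME
EOF

# Source the script the way a login shell sources /etc/profile.d/*.sh.
. "$PROFILE_DIR/ant.sh"

echo "ANT_HOME=$ANT_HOME"
case ":$PATH:" in
  *":$ANT_HOME/bin:"*) echo "ant bin directory is on PATH" ;;
  *) echo "ant bin directory is missing from PATH" ;;
esac
```

On a real machine the same file would live at /etc/profile.d/ant.sh and take effect at the next login.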

2014-11-21 12:48
Views: 1039
Comments (0)
Category: Open source software

CentOS: Install JDK

Blog category:

operatingsystem

# tar -xzf jdk-7u51-linux-x64.tar.gz -C /opt/
# ln -s /opt/jdk1.7.0_51/bin/java /sbin/java
# echo "export JAVA_HOME=/opt/jdk1.7.0_51" > /etc/profile.d/java_env.sh
# echo "export JRE_HOME=/opt/jdk1.7.0_51/jre" >> /etc/profile.d/java_env.sh
# echo "export CLASSPATH=.:\ ...
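The first echo uses `>` and the later ones use `>>`. A small sketch of why that matters, assuming a scratch file (ENV_FILE, hypothetical) in place of /etc/profile.d/java_env.sh and only the two complete lines from the excerpt:

```shell
#!/bin/bash
# Sketch only: ENV_FILE is a hypothetical scratch file standing in for
# /etc/profile.d/java_env.sh.
ENV_FILE="$(mktemp)"
JDK_DIR=/opt/jdk1.7.0_51

# '>' truncates the file, so rerunning the steps never duplicates lines.
echo "export JAVA_HOME=$JDK_DIR" > "$ENV_FILE"
# '>>' appends, so the first line is preserved.
echo "export JRE_HOME=$JDK_DIR/jre" >> "$ENV_FILE"

# Load the variables into the current shell and show them.
. "$ENV_FILE"
echo "JAVA_HOME=$JAVA_HOME"
echo "JRE_HOME=$JRE_HOME"
```

Running the block twice leaves the same two lines in the file, because the first write resets it each time.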

2014-11-21 11:18
Views: 443
Comments (0)
Category: Open source software

Tez: 5 Dynamic Graph Reconfiguration

Blog category:

Tez

Case Study: Automatic Reduce Parallelism
Motivation
Distributed data processing is dynamic by nature and it is extremely difficult to statically determine optimal concurrency and data movement methods a priori. More information is available during runtime, like data samples and sizes, which may he ...

2014-11-21 10:25
Views: 444
Comments (0)
Category: Open source software

Tez: 4 Writing a Tez Input/Processor/Output

Blog category:

Tez

The previous couple of posts covered Tez concepts and APIs. This post gives some details on what is required to write a custom Input / Processor / Output, along with examples of existing I/P/Os provided by the Tez runtime library.
Tez Task
A Tez task consists of all the Inputs on its incoming edg ...

2014-11-19 16:22
Views: 495
Comments (0)
Category: Open source software

Tez: 1 Apache Tez: A New Chapter in Hadoop Data Processing

Blog category:

Tez

What is Apache Tez?
Apache Tez generalizes the MapReduce paradigm to execute a complex DAG (directed acyclic graph) of tasks. It also represents the next logical step for Hadoop 2 and the introduction of YARN with its more general-purpose resource management framework. While MapReduce has ...

2014-11-19 15:16
Views: 464
Comments (0)
Category: Open source software

Tez: 2 Data Processing API in Apache Tez

Blog category:

Tez

Overview
Apache Tez models data processing as a dataflow graph, with the vertices in the graph representing processing of data and the edges representing movement of data between processing steps. Thus the user logic that analyzes and modifies the data sits in the vertices. Edges determine the consumer of ...

2014-11-19 15:05
Views: 517
Comments (0)
Category: Open source software

Tez: 3 Runtime API in Apache Tez

Blog category:

Tez

Apache Tez models data processing as a dataflow graph, with the vertices in the graph representing processing of data and the edges representing movement of data between processing steps. The user logic that analyzes and modifies the data sits in the vertices. Edges determine the consumer of the data, ...

2014-11-19 14:47
Views: 500
Comments (0)
Category: Open source software

Ambari: Installation and configuration on Ubuntu

Blog category:

Ambari

Build from source code on Ubuntu 12.04
1. Download
# wget http://mirrors.hust.edu.cn/apache/ambari/ambari-1.6.1/ambari-1.6.1.tar.gz
# tar -xvzf ambari-1.6.1.tar.gz
# cd ambari-1.6.1
2. Prepare the environment
see: https://cwiki.apache.org/confluence/display/AMBARI/Ambari+Development
not: https://cwiki. ...

2014-11-19 14:41
Views: 2040
Comments (0)
Category: Open source software

Build Hama and deploy it to clusters

Blog category:

Hama

1. Download source code
# svn checkout https://svn.apache.org/repos/asf/hama/trunk hama-trunk
2. Build
# mvn -Declipse.workspace="/home/zhaohj/workspace/" eclipse:configure-workspace
# mvn clean install -Phadoop2 -Dhadoop.version=2.3.0
# mvn eclipse:eclipse
Note: use Java 1.7. IF jav ...
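Since the note asks for Java 1.7, one option is to check the reported version before starting the Maven build. The sketch below is an assumption, not part of the original post: is_java_17 is a hypothetical helper, and the sample string imitates the first line that `java -version` prints on a 1.7 JDK.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when a `java -version` style line reports 1.7.
is_java_17() {
  case "$1" in
    *'"1.7.'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical sample line; on a real machine this could instead be
# "$(java -version 2>&1 | head -n 1)".
sample='java version "1.7.0_51"'

if is_java_17 "$sample"; then
  echo "Java 1.7 detected, proceeding with the Maven build"
else
  echo "Java 1.7 not detected, switch JDK before building" >&2
fi
```

The check is kept as a pure string match so it can be tried without a JDK installed.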

2014-11-11 13:31
Views: 884
Comments (0)
Category: Open source software

The Hadoop Ecosystem Table

Blog category:

Hadoop

http://hadoopecosystemtable.github.io/
http://blog.andreamostosi.name/big-data/
https://github.com/youngwookim/awesome-hadoop

2014-11-10 15:28
Views: 479
Comments (0)
Category: Open source software

Integrating OpenNLP with Solr

Blog category:

OpenNLP
Solr

https://wiki.apache.org/solr/OpenNLP

2014-11-07 17:20
Views: 635
Comments (0)
Category: Open source software

Tez: Build from source code

Blog category:

Tez

Build from source code
1. Download from http://tez.apache.org/install.html. If you want the latest code, use this command:
# git clone https://git-wip-us.apache.org/repos/asf/tez.git
# tar xvf apache-tez-0.5.1-src.tar.gz
# cd apache-tez-0.5.1-src
# mvn package -Dhadoop.version= ...

2014-11-05 15:57
Views: 1123
Comments (0)
Category: Open source software
