Article list
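Reposted from http://mobile.51cto.com/ahot-466539.htm
WeChat's traffic is enormous, and its instantaneous peaks in particular are a huge challenge for any team or architect. We were also wondering how the WeChat team managed to withstand the traffic of the red-envelope rush, and it just so happened that today the Tencent Dajiangtang (腾讯大讲堂) official account published ...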
https://github.com/pferrel/solr-recommender
References
http://ssc.io/deploying-a-massively-scalable-recommender-system-with-apache-mahout/
https://github.com/pferrel/solr-recommender
http://wiki.apache.org/hama/BSPModel
Overview
In Apache Hama, you can implement your own BSP program by extending the org.apache.hama.bsp.BSP class and overriding its bsp() function, which holds the user-defined logic.
The bsp() function handles ...
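The idea of "extend a BSP class and override bsp()" can be illustrated with a small simulation. The sketch below is a Python analogue, not Hama's Java API: the names `Peer`, `BSP`, `SumToMaster`, and `run_bsp` are all hypothetical and exist only for illustration. Each peer runs the same user-defined `bsp()` function in its own thread, queues messages with `send()`, and only sees them after the barrier in `sync()`.

```python
import threading
from abc import ABC, abstractmethod


class Peer:
    """Hypothetical stand-in for a BSP peer; this is not Hama's BSPPeer API."""

    def __init__(self, index, num_peers, pending, lock, barrier):
        self.index = index
        self.num_peers = num_peers
        self.inbox = []            # messages delivered by the last sync()
        self._pending = pending    # shared: one incoming queue per peer
        self._lock = lock
        self._barrier = barrier

    def send(self, target, message):
        # Messages are only queued here; they become visible after sync().
        with self._lock:
            self._pending[target].append(message)

    def sync(self):
        self._barrier.wait()       # every peer has finished sending
        with self._lock:
            self.inbox = self._pending[self.index]
            self._pending[self.index] = []
        self._barrier.wait()       # every peer has collected its messages


class BSP(ABC):
    @abstractmethod
    def bsp(self, peer):
        """User-defined BSP program, run once on every peer."""


class SumToMaster(BSP):
    def bsp(self, peer):
        peer.send(0, (peer.index + 1) * 10)   # local work plus communication
        peer.sync()                           # barrier synchronization
        return sum(peer.inbox) if peer.index == 0 else None


def run_bsp(program, num_peers):
    pending = [[] for _ in range(num_peers)]
    lock = threading.Lock()
    barrier = threading.Barrier(num_peers)
    results = [None] * num_peers

    def worker(i):
        peer = Peer(i, num_peers, pending, lock, barrier)
        results[i] = program.bsp(peer)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_peers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Calling `run_bsp(SumToMaster(), 4)` runs four peers; peer 0 receives one message from every peer (including itself) and returns their sum, while the other peers return `None`.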
http://wiki.apache.org/hama/GraphModuleInternals
Hama includes the Graph module for vertex-centric graph computations. Hama's Graph APIs allow you to program Google Pregel-style applications through a simple programming interface.
Internals
The Graph APIs are implemented on top of Hama BS ...
Hama: BSPMaster
- Blog category:
- Hama
http://wiki.apache.org/hama/BSPMaster
Introduction
The main responsibilities of BSPMaster are described on the Architecture page.
Services
The BSPMaster is a collection of services performing different tasks, including:
masterServer: An RPC server.
instructor: Asynchronous message dispatch ...
Hama: GroomServer
- Blog category:
- Hama
http://wiki.apache.org/hama/GroomServer
Introduction
GroomServer is a process whose main responsibility is to manage BSP tasks. In addition to task management, GroomServer collaborates with BSPMaster so that jobs execute correctly. The work that GroomServer performs includes:
Chec ...
Hama: Architecture
- Blog category:
- Hama
http://wiki.apache.org/hama/Architecture
Components
Apache Hama, based on the Bulk Synchronous Parallel model [1], comprises three major components:
BSPMaster
GroomServer
ZooKeeper.
It is very similar to the Hadoop architecture, except for the portion of communication and synchroni ...
http://pages.cs.wisc.edu/~shuchi/courses/787-F09/
The high-level organization of Pregel programs is inspired by Valiant’s Bulk Synchronous Parallel model. Pregel computations consist of a sequence of iterations, called supersteps. During a superstep the framework invokes a user-defined function for each vertex, conceptually in parallel. The function ...
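The superstep loop described above can be sketched as a small sequential simulation. This is only an illustration under stated assumptions, not Pregel's or Hama's actual API: the function name `pregel_max` and the dictionary-based graph are hypothetical. The per-vertex logic propagates the maximum value through the graph; a vertex stays silent (and effectively halts) when its value did not change, and the run ends when no messages are in flight.

```python
def pregel_max(graph, initial_values):
    """Propagate the maximum vertex value, one superstep at a time.

    graph maps each vertex to the list of vertices it sends messages to;
    every target vertex must also appear as a key of graph.
    """
    values = dict(initial_values)
    inbox = {v: [] for v in graph}
    active = set(graph)            # every vertex is active in superstep 0
    superstep = 0

    while active:
        outbox = {v: [] for v in graph}
        for v in active:           # conceptually in parallel
            new_value = max([values[v]] + inbox[v])
            changed = new_value != values[v]
            values[v] = new_value
            # Send only in the first superstep or after a change.
            if superstep == 0 or changed:
                for target in graph[v]:
                    outbox[target].append(new_value)
        # Barrier: messages sent in this superstep arrive in the next one.
        inbox = outbox
        active = {v for v in graph if inbox[v]}
        superstep += 1

    return values
```

For example, with `graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}` and initial values `{"a": 3, "b": 6, "c": 2}`, every vertex ends up holding 6.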
BSP Related Papers
- Blog category:
- Hama
http://paloaltodata.com/index.php?option=com_content&view=article&id=22
http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
Setup Hama Eclipse Project
1. Download hama-0.6.4-src.tar.gz
2. tar xzf hama-0.6.4-src.tar.gz
3. cd hama-0.6.4-src
4. Change the Hadoop version to 2.6.0 in pom.xml
<hadoop.version>2.6.0</hadoop.version>
<id>hadoop2</id>
<properties>
<hadoop.versi ...
http://www.android-studio.org/
http://www.cnblogs.com/zoupeiyang/p/4034517.html
http://www.oschina.net/question/265039_173445
https://highlyscalable.wordpress.com/2012/05/01/probabilistic-structures-web-analytics-data-mining/
Statistical analysis and mining of huge multi-terabyte data sets is a common task nowadays, especially in areas such as web analytics and Internet advertising. Analysis of such large data sets of ...
Similarity Measures in Machine Learning
- Blog category:
- ML
http://www.cnblogs.com/heaad/archive/2011/03/08/1977733.html
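1. Euclidean Distance
Euclidean distance is the easiest distance measure to understand; it comes from the formula for the distance between two points in Euclidean space.
(1) Euclidean distance between two points a(x1,y1) and b(x2,y2) in the two-dimensional plane: $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$
(2) Euclidean distance between two points a(x1,y1,z1) and b(x2,y2,z2) in three-dimensional space: $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$
(3) Euclidean distance between two n-dimensional vectors a(x11,x12,…,x1n) and b(x21,x22,…,x2n): $d = \sqrt{\sum_{k=1}^{n} (x_{1k} - x_{2k})^2}$
It can also be expressed as a vector operation, treating a and b as row vectors: $d = \sqrt{(a - b)(a - b)^T}$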
...
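As a minimal sketch of the n-dimensional Euclidean distance, the following function covers the 2-D and 3-D cases as well. The function name `euclidean_distance` is chosen here for illustration and does not come from the original post.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Sum the squared per-coordinate differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For instance, `euclidean_distance((0, 0), (3, 4))` gives 5.0, the hypotenuse of the 3-4-5 right triangle.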
http://highscalability.com/blog/2012/4/5/big-data-counting-how-to-count-a-billion-distinct-objects-us.html
This is a guest post by Matt Abrams (@abramsm), from Clearspring, discussing how they are able to accurately estimate the cardinality of sets with billions of distinct elements using surpr ...
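To give a feel for how a small, fixed-size structure can estimate the number of distinct elements, here is a sketch of linear counting, one of the simplest probabilistic counters. It is an assumption that this is a useful stand-in: the excerpt is truncated and does not say which algorithm the post uses, and the class name `LinearCounter` and its parameters are hypothetical. Each item is hashed to one position in a bitmap, and the estimate is derived from the fraction of positions that are still zero.

```python
import hashlib
import math


class LinearCounter:
    """Linear counting: estimate distinct items from a fixed-size bitmap."""

    def __init__(self, num_bits=8192):
        self.num_bits = num_bits
        # One byte per position keeps the sketch simple; a real
        # implementation would pack eight positions into each byte.
        self.bits = bytearray(num_bits)

    def add(self, item):
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        position = int.from_bytes(digest[:8], "big") % self.num_bits
        self.bits[position] = 1    # duplicates hit the same position

    def estimate(self):
        zero_positions = self.num_bits - sum(self.bits)
        if zero_positions == 0:
            raise ValueError("bitmap is full; use more positions")
        # n is approximately -m * ln(fraction of positions still zero)
        return -self.num_bits * math.log(zero_positions / self.num_bits)
```

Adding the same item many times does not change the estimate, because repeated items always map to the same position. The bitmap must be sized in proportion to the expected cardinality, which is why structures with much smaller memory footprints are preferred for billions of elements.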