Article list
mvn clean package -Dhadoop2.version=2.3.0 -DskipTests
mvn clean package -Dhadoop.version=2.3.0 -DskipTests
mvn clean package -Dhadoop.profile=200 -DskipTests
None of the commands above will work. Instead, download the patch and apply it to Mahout 0.9 so that it supports Hadoop 2, using the below build c ...
Running recommendations with Hadoop
The glue that binds together the various Mapper and Reducer components is org.apache.mahout.cf.taste.hadoop.item.RecommenderJob. It configures and invokes the series of MapReduce jobs discussed previously. These MapReduces and their relationships are illustrated ...
generating user vectors
Input format
userID: itemID1 itemID2 itemID3 ....
Output format
Builds a Vector from all item IDs for the user and outputs the user ID mapped to the user’s preference vector. All values in this vector are 0 or 1. For example, 98955 / [590:1.0, 22:1.0, 9059:1.0]
package ...
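The input and output formats described above can be illustrated with a minimal plain-Python sketch. It is not Mahout's Mapper code; the function name parse_user_line is hypothetical, and a dict stands in for Mahout's Vector.

```python
def parse_user_line(line):
    """Parse 'userID: itemID1 itemID2 ...' into (user_id, preference_vector).

    The preference vector is a dict mapping item ID to 1.0, since this
    input format carries no rating values, only the presence of an item.
    """
    user_part, _, items_part = line.partition(":")
    user_id = int(user_part.strip())
    vector = {int(token): 1.0 for token in items_part.split()}
    return user_id, vector


# Example line built from the IDs used in the text above.
user_id, vector = parse_user_line("98955: 590 22 9059")
print(user_id, vector)
```

Every value in the resulting vector is 1.0 because the input only says which items a user has a preference for.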
co-occurrence matrix
Instead of computing the similarity between every pair of items, it’ll compute the number of times each pair of items occurs together in some user’s list of preferences, in order to fill out the matrix.
Co-occurrence is like similarity; the more two items turn up togeth ...
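As a rough illustration of the counting described above, here is a small plain-Python sketch. It is not the MapReduce implementation; the function name cooccurrence and the toy preference lists are assumptions made for this example.

```python
from collections import defaultdict
from itertools import permutations


def cooccurrence(user_items):
    """Count, for each ordered pair of distinct items, how many users have both."""
    counts = defaultdict(int)
    for items in user_items.values():
        # set() guards against an item appearing twice in one user's list
        for a, b in permutations(set(items), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Toy data (hypothetical): user ID -> list of item IDs the user has a preference for.
prefs = {
    1: [101, 102, 103],
    2: [101, 102, 103, 104],
    3: [101, 104, 105, 107],
}
matrix = cooccurrence(prefs)
print(matrix[(101, 102)], matrix[(105, 107)])
```

The matrix is symmetric: the count for (101, 102) equals the count for (102, 101), and pairs that never appear together are simply absent.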
Singular value decomposition–based recommenders
SVDRecommender
Linear interpolation item–based recommendation
KnnItemBasedRecommender
Cluster-based recommendation
TreeClusteringRecommender
References
http://en.wikipedia.org/wiki/Singular_value_decomposition
http: ...
User-based: Who is similar to the boy, and what do they like?
Item-based: What is similar to what the boy likes?
The algorithm
The difference between user-based and item-based:
Slope-one recommender
It estimates preferences for new items based on average difference in the preference ...
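The idea of estimating a preference from average differences can be sketched in plain Python as follows. This is a weighted slope-one variant written for illustration, not Mahout's SlopeOneRecommender; the function names and the toy ratings are assumptions.

```python
from collections import defaultdict


def build_diffs(prefs):
    """Average rating difference for each ordered item pair, plus pair counts."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ratings in prefs.values():
        for i, ri in ratings.items():
            for j, rj in ratings.items():
                if i != j:
                    sums[(i, j)] += ri - rj
                    counts[(i, j)] += 1
    diffs = {pair: sums[pair] / counts[pair] for pair in sums}
    return diffs, dict(counts)


def predict(prefs, diffs, counts, user, target):
    """Estimate the user's preference for target, weighting by pair counts."""
    numerator = 0.0
    denominator = 0
    for item, rating in prefs[user].items():
        pair = (target, item)
        if pair in diffs:
            numerator += (rating + diffs[pair]) * counts[pair]
            denominator += counts[pair]
    return numerator / denominator if denominator else None


# Toy ratings (hypothetical): user -> {item: rating}
prefs = {
    "john": {"A": 5.0, "B": 3.0, "C": 2.0},
    "mark": {"A": 3.0, "B": 4.0},
    "lucy": {"B": 2.0, "C": 5.0},
}
diffs, counts = build_diffs(prefs)
estimate = predict(prefs, diffs, counts, "lucy", "A")
print(round(estimate, 2))
```

Here item A is rated on average 0.5 higher than B (two users) and 3.0 higher than C (one user), so Lucy's estimate for A is (2 * 2.5 + 1 * 8.0) / 3.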
Sample Data
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0
Pearson correlation–based similarity
The Pear ...
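For a concrete feel, the Pearson correlation between two users can be computed over the items both have rated. The sketch below is plain Python, not Mahout's PearsonCorrelationSimilarity; the function name pearson is hypothetical, and the ratings are those of users 1, 2 and 5 from the sample data above.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two users' ratings over their co-rated items.

    Returns None when fewer than two items overlap or one side has no variance.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return None
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


user1 = {101: 5.0, 102: 3.0, 103: 2.5}
user2 = {101: 2.0, 102: 2.5, 103: 5.0, 104: 2.0}
user5 = {101: 4.0, 102: 3.0, 103: 2.0, 104: 4.0, 105: 3.5, 106: 4.0}

print(round(pearson(user1, user5), 3))  # about 0.945: similar tastes
print(round(pearson(user1, user2), 3))  # about -0.764: opposite tastes
```

Users 1 and 5 rate items 101 to 103 in the same order, so the correlation is close to 1; users 1 and 2 rate them in nearly opposite order, so it is negative.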
recommending items to some user, denoted by u, as seen below
It would be terribly slow to examine every item. In reality, a neighborhood of most similar users is computed first, and only items known to those users are considered:
The basic flow for building a CF recommender in Mahout looks like the code below:
Da ...
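The neighborhood idea described above (find the most similar users first, then score only the items they know) can be sketched in plain Python. This is not the Mahout API: the function names are hypothetical, a simple distance-based similarity stands in for whatever similarity the recommender is configured with, and the ratings are the five users from the sample data listed earlier on this page.

```python
from math import sqrt


def similarity(a, b):
    """Stand-in similarity: 1 / (1 + Euclidean distance over co-rated items)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    distance = sqrt(sum((a[i] - b[i]) ** 2 for i in common))
    return 1.0 / (1.0 + distance)


def recommend(prefs, user, k=2, top_n=3):
    """Score only items known to the k nearest neighbors of the user."""
    scored_users = [(similarity(prefs[user], prefs[o]), o) for o in prefs if o != user]
    neighbors = sorted(scored_users, reverse=True)[:k]
    totals, weights = {}, {}
    for sim, other in neighbors:
        if sim <= 0:
            continue
        for item, rating in prefs[other].items():
            if item in prefs[user]:
                continue  # skip items the user already has a preference for
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((totals[i] / weights[i], i) for i in totals), reverse=True)
    return [(item, score) for score, item in ranked[:top_n]]


prefs = {
    1: {101: 5.0, 102: 3.0, 103: 2.5},
    2: {101: 2.0, 102: 2.5, 103: 5.0, 104: 2.0},
    3: {101: 2.5, 104: 4.0, 105: 4.5, 107: 5.0},
    4: {101: 5.0, 103: 3.0, 104: 4.5, 106: 4.0},
    5: {101: 4.0, 102: 3.0, 103: 2.0, 104: 4.0, 105: 3.5, 106: 4.0},
}
recs = recommend(prefs, user=1)
print(recs)
```

With k=2 the neighbors of user 1 are users 4 and 5, so item 107 (known only to user 3) is never examined, which is exactly the saving the neighborhood step is meant to provide.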
1. # svn co http://svn.apache.org/repos/asf/mahout/trunk
or download mahout-distribution-0.9-src.tar.gz
2. mvn -DskipTests clean install package
3. Create an ordinary Java project that uses Mahout.
4. Configure the build path: add the external jars in mahout-distribution-0.9/examples/target/dependency d ...
collaborative filtering
producing recommendations based on, and only based on, knowledge of users’ relationships to items. These techniques require no knowledge of the properties of the items themselves.
In fact, these are the two broadest categories of recommender engine algorithms: user-based a ...
When I export data to MySQL using the following Sqoop command:
--options-file sqoop.export.opt --export-dir inok/test/friendrecomend2
--table friend_rec --staging-table friend_rec_stage --clear-staging-table
--update-key id --update-mode allowinsert
The content of sqoop.export.opt looks like ...
ENV: Ubuntu 12.04
1. Install Eclipse
2. Create a desktop shortcut for Eclipse
a. Create an empty file named eclipse.xx
b. Edit eclipse.xx as follows (this avoids failures when opening Eclipse menu items)
[Desktop Entry]
Categories=Development;
Comment[zh_CN]=
Comment=
Exec=/home/z ...
When we run a Pig job that uses a Hive metastore table through Hue, we need to place all related jars in the Oozie sharelib.
Prepare
a. Compile hive-0.12.0 and hive-0.13.0 against hadoop2.3.0
b. Compile pig-0.12.0 against hadoop2.3.0
Update local pig's sharelib
a. Back up all jars in ...
Supervised learning
is tasked with learning a function from labeled training data in order to predict the value of any valid input. Common examples of supervised learning include classifying e-mail messages as spam, labeling Web pages according to their genre, and recognizing handwriting. Many alg ...
When running a Sqoop job in Hue, there is an error:
The Sqoop job looks like:
Root Cause:
The Sqoop command can be specified either using the command element or multiple arg elements.
When using the command element, Oozie will split the command on every space into multiple arguments.
When using the ar ...
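To see why splitting on every space causes trouble, the following plain-Python sketch mimics that behavior with str.split. It does not call Oozie or Sqoop; the host, database, table and directory names are hypothetical, and the explicit list simply models supplying each argument separately.

```python
# A Sqoop command line whose --query value contains spaces (names are made up).
command = (
    'import --connect jdbc:mysql://example-host/db '
    '--query "select id from t where $CONDITIONS" '
    '--target-dir /tmp/out -m 1'
)

# Splitting on every space, as described for the command element,
# tears the quoted SQL into separate tokens and keeps the quote characters.
naive_args = command.split(" ")

# Supplying each argument separately keeps the SQL as one argument.
explicit_args = [
    "import",
    "--connect", "jdbc:mysql://example-host/db",
    "--query", "select id from t where $CONDITIONS",
    "--target-dir", "/tmp/out",
    "-m", "1",
]

print(len(naive_args), len(explicit_args))
print(naive_args[4:10])
```

In the naive split the query arrives as six fragments starting with a stray quote, so an argument whose value contains spaces is safer passed as its own element.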