Article List
Overview
HCatalog is a table and storage management layer for Hadoop that enables users with different data processing tools — Pig, MapReduce — to more easily read and write data on the grid. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop distributed file ...
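For a quick feel of the table abstraction, the hcat command-line tool that ships with HCatalog can run DDL directly; the table and columns below are hypothetical:
#hcat -e "create table web_logs (ip string, request string) partitioned by (dt string);"
Once defined this way, the same table can be read and written from Pig, MapReduce, and Hive through their respective HCatalog adapters.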
Hive: HiveServer2
- Blog category:
- Hive
HiveServer2 (HS2) is a server interface that enables remote clients to execute queries against Hive and retrieve the results. The current implementation, based on Thrift RPC, is an improved version of HiveServer and supports multi-client concurrency and authentication. It is designed to provide bet ...
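A minimal smoke test against a running HS2 instance uses Beeline, the JDBC client shipped with Hive; the host, port, and user below are assumptions (10000 is the default HS2 port):
#beeline -u jdbc:hive2://localhost:10000 -n hive -e "show tables;"
Several such clients can connect at the same time, which is the multi-client concurrency improvement over the original HiveServer.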
start time
1. uptime
   16:11:40 up 59 days, 4:21, 2 users, load average: 0.00, 0.01, 0.00
2. date -d "$(awk -F. '{print $1}' /proc/uptime) second ago" +"%Y-%m-%d %H:%M:%S"
3. cat /proc/uptime | awk -F. '{run_days=$1 / 86400;run_hour=($1 % 86400)/3600;run_minute=($1 % 3600)/6 ...
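The truncated third one-liner derives days, hours, and minutes from the first field of /proc/uptime (seconds since boot); a runnable sketch of the same idea:
#awk -F. '{d=int($1/86400); h=int($1%86400/3600); m=int($1%3600/60); printf "up %d days, %d hours, %d minutes\n", d, h, m}' /proc/uptime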
1. configure oozie-site.xml
<property>
    <name>oozie.db.schema.name</name>
    <value>oozie</value>
</property>
<property>
    <name>oozie.service.JPAService.create.db.schema</name>
    <value>true</value& ...
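A MySQL-backed metastore usually also needs the JPAService JDBC properties in the same file; the property names below are Oozie's standard ones, while the driver, URL, and credentials are assumptions:
<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>jdbc:mysql://localhost:3306/oozie</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>oozie</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>oozie</value>
</property>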
Oozie: Run examples
- Blog category:
- oozie
#cd /path/to/oozie-4.0.1
#tar -xzf oozie-examples.tar.gz
modify all job.properties in the examples dir:
nameNode=hdfs://192.168.122.1:2014
jobTracker=192.168.122.1:2015
queueName=default
examplesRoot=oozie/examples
#hdfs dfs -put examples oozie/
Run pig example
#oozie job -oozie http://lo ...
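For reference, the full command cut off above typically takes this shape; the server URL and the job.properties path are assumptions based on the standard examples layout:
#oozie job -oozie http://localhost:11000/oozie -config examples/apps/pig/job.properties -run
The returned job id can then be polled with:
#oozie job -oozie http://localhost:11000/oozie -info <job-id>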
Oozie: configuration
- Blog category:
- oozie
---conf/oozie-site.xml---
<property>
    <!-- <name>oozie.service.AuthorizationService.security.enabled</name> -->
    <name>oozie.service.AuthorizationService.authorization.enabled</name>
    <value>false</value>
</propert ...
oozie: common errors
- Blog category:
- oozie
1. When running the Oozie examples, there is an error:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/CommandNotFound/util.py", line 24, in crash_guard
    callback()
  File "/usr/lib/command-not-found", line 69, in main
    enable_i18n()
  File "/usr/lib/comma ...
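This traceback comes from Ubuntu's command-not-found handler rather than from Oozie itself: the oozie binary simply is not on the PATH. A sketch of the fix, reusing the install path from the examples post:
#export PATH=$PATH:/path/to/oozie-4.0.1/bin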
oozie: Workflow
- Blog category:
- oozie
Workflow Definition
A workflow definition is a DAG with control flow nodes (start, end, decision, fork, join, kill) and action nodes (map-reduce, pig, etc.); the nodes are connected by transition arrows.
The workflow definition language is XML based and it is called hPDL (Hadoop Process Definition La ...
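A skeleton of such a definition in hPDL; the element names follow the Oozie workflow schema, while the workflow and action names are placeholders:
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="my-action"/>
    <action name="my-action">
        <!-- a map-reduce, pig, etc. action element goes here -->
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>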
Import data from mysql to hdfs
----------------------------------------
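The import command itself was cut from this excerpt; a typical invocation looks like the following, where the connection string, credentials, table, and target directory are all assumptions:
#sqoop import --connect jdbc:mysql://localhost:3306/testdb --username hue --password secret --table employees --target-dir /user/hue/employees -m 1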
Export data from hdfs to mysql
--------------------------------------
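Likewise, a typical export invocation (same hypothetical names; --export-dir points at the HDFS data produced by the import):
#sqoop export --connect jdbc:mysql://localhost:3306/testdb --username hue --password secret --table employees --export-dir /user/hue/employees -m 1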
----------------------------------------------------
The difference between sqoop1 and sqoop2
Feature | Sqoop | Sqoop2
Connect ...
Create a new database in MySQL and grant privileges to a Hue user to manage this database.
mysql> create database hue;
Query OK, 1 row affected (0.01 sec)
mysql> grant all on hue.* to 'hue'@'localhost' identified by 'secretpassword';
Query OK, 0 rows affected (0.00 sec)
Shut down Hu ...
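After the shutdown, Hue is normally pointed at the new database through the [[database]] block of desktop/conf/hue.ini; the keys below are Hue's standard settings, with values matching the grant above:
[desktop]
  [[database]]
    engine=mysql
    host=localhost
    port=3306
    user=hue
    password=secretpassword
    name=hue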
MySQL: Data Join
- Blog category:
- mysql
Introduction
The purpose of this article is to show the power of MySQL JOIN operations, nested MySQL queries (intermediate or temporary result-set tables), and aggregate functions used with GROUP BY. One can refer to link1 or link2 for a basic understanding of MySQL join operations.
In order to explain ...
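As a small taste of the combination described above, a join plus GROUP BY over two hypothetical tables (the article's own schema is cut off in this excerpt):
mysql> select d.name, count(*) as headcount
    -> from employees e
    -> join departments d on e.dept_id = d.id
    -> group by d.name;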
1. When clicking a job in the 'Job Browser' panel, the log of the job doesn't appear.
R: Hadoop 2.x has the ability to aggregate logs from cluster nodes and purge the old ones. My problem was rooted in a wrong VM guest clock (the namenode runs on my local host while the other three datanodes run on
CentOS 6.4 VM ...
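A sketch of the corresponding fix: bring the guest clocks back in sync and then pull the aggregated logs through the standard YARN CLI (the application id is a placeholder):
#ntpdate pool.ntp.org
#yarn logs -applicationId <application-id>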
When I create a Sqoop job to import data from MySQL to HDFS and submit it to run, Hue's error.log contains the following error info:
and the sqoop.log has error info:
The root cause is an error in the Sqoop job syntax.
1. Press 'Data Browsers' -> 'Sqoop Transfer' -> 'create new job'
2. Create a new connection using the default connector with id 1
3. Fill in the 'From' fields
4. Fill in the 'To' fields and click the 'save and run' button
5. Check the job status in the 'Job Browser' panel
cogroup
cogroup is a generalization of group. Instead of collecting records of one input based on a key, it collects records of n inputs based on a key. The result is a record with a key and one bag for each input.
A = load 'input1' as (id:int, val:float);
B = load 'input2' as (id:int, val2 ...
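A complete sketch of the same example; the second relation's schema is an assumption, since the excerpt is cut off:
A = load 'input1' as (id:int, val:float);
B = load 'input2' as (id:int, val2:float);
C = cogroup A by id, B by id;
-- each record of C is (group, {bag of A records}, {bag of B records})
dump C;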