Hive CLI

k_lb

浏览: 834041 次
性别:
来自: 郑州

最近访客更多访客>>

u012363178

rattersnake

LuffyMother

uclnn

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

Variables and Properties -- 设置 hive 变量

$ hive --define foo=bar
hive> set foo;
foo=bar;
hive> set hivevar:foo;
hivevar:foo=bar;
hive> set hivevar:foo=bar2;
hive> set foo;
foo=bar2

hive> create table toss1(i int, ${hivevar:foo} string);
hive> describe toss1;
i       int
bar2    string
hive> create table toss2(i2 int, ${foo} string);
hive> describe toss2;
i2      int
bar2    string
hive> drop table toss1;
hive> drop table toss2;

Hive “One Shot” Commands

A quick and dirty technique is to use this feature to output the query results to a file.
Adding the -S for silent mode removes the OK and Time taken ... lines, as well as other
inessential output, as in this example:

$ hive -S -e "select * FROM mytable LIMIT 3" > /tmp/myquery
$ cat /tmp/myquery
name1 10
name2 20
name3 30

Finally, here is a useful trick for finding a property name that you can’t quite remember,
without having to scroll through the list of the set output. Suppose you can’t remember
the name of the property that specifies the “warehouse” location for managed tables:

$ hive -S -e "set" | grep warehouse
hive.metastore.warehouse.dir=/user/hive/wa
hive.warehouse.subdir.inherit.perms=false

Executing Hive Queries from Files -- 执行命令文件.hql .q

Hive can execute one or more queries that were saved to a file using the -f file argu-
ment. By convention, saved Hive query files use the .q or .hql extension.
$ hive -f /path/to/file/withqueries.hql
If you are already inside the Hive shell you can use the SOURCE command to execute a
script file. Here is an example:
$ cat /path/to/file/withqueries.hql
SELECT x.* FROM src x;
$ hive
hive> source /path/to/file/withqueries.hql;

The details for xpath don’t concern us here, but note that we pass string literals to the
xpath function and use FROM src LIMIT 1 to specify the required FROM clause and to limit
the output. Substitute src with the name of a table you have already created or create
a dummy table named src:
CREATE TABLE src(s STRING);
Also the source table must have at least one row of content in it:
$ echo "one row" > /tmp/myfile
$ hive -e "LOAD DATA LOCAL INPATH '/tmp/myfile' INTO TABLE src;

The .hiverc File -- HIVE 系统参数配制

The following shows an example of a typical $HOME/.hiverc file:
ADD JAR /path/to/custom_hive_extensions.jar;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
The first line adds a JAR file to the Hadoop distributed cache. The second line modifies
the CLI prompt to show the current working Hive database, as we described earlier in
“Variables and Properties” on page 31. The last line “encourages” Hive to be more
aggressive about using local-mode execution when possible, even when Hadoop is
running in distributed or pseudo-distributed mode, which speeds up queries for small
data sets.

配置如下参数，可以开启Hive的本地模式：

hive> set hive.exec.mode.local.auto=true;(默认为false)

当一个job满足如下条件才能真正使用本地模式：

1.job的输入数据大小必须小于参数：hive.exec.mode.local.auto.inputbytes.max(默认128MB)

2.job的map数必须小于参数：hive.exec.mode.local.auto.tasks.max(默认4)

3.job的reduce数必须为0或者1

可用参数hive.mapred.local.mem(默认0)控制child jvm使用的最大内存数

Autocomplete -- TAB 自动命令补齐

If you start typing and hit the Tab key, the CLI will autocomplete possible keywords
and function names. For example, if you type SELE and then the Tab key, the CLI will
complete the word SELECT.
If you type the Tab key at the prompt, you’ll get this reply:
hive>
Display all 407 possibilities? (y or n)

Command History

You can use the up and down arrow keys to scroll through previous commands. Ac-
tually, each previous line of input is shown separately; the CLI does not combine mul-
tiline commands and queries into a single history entry. Hive saves the last 100,00 lines
into a file $HOME/.hivehistory.

Shell Execution -- 执行SELL 命令

You don’t need to leave the hive CLI to run simple bash shell commands. Simply
type ! followed by the command and terminate the line with a semicolon (;):
hive> ! /bin/echo "what up dog";
"what up dog"
hive> ! pwd;
/home/me/hiveplay

Hadoop dfs Commands from Inside Hive --查询DFS 命令

You can run the hadoop dfs ... commands from within the hive CLI; just drop the
hadoop word from the command and add the semicolon at the end:
hive> dfs -ls / ;
Found 3 items
drwxr-xr-x - root supergroup 0 2011-08-17 16:27 /etl
drwxr-xr-x - edward supergroup 0 2012-01-18 15:51 /flag
drwxrwxr-x - hadoop supergroup 0 2010-02-03 17:50 /users
This method of accessing hadoop commands is actually more efficient than using the
hadoop dfs ... equivalent at the bash shell, because the latter starts up a new JVM
instance each time, whereas Hive just runs the same code in its current process.

Comments in Hive Scripts -- 注释

As of Hive v0.8.0, you can embed lines of comments that start with the string --, for
example:
-- Copyright (c) 2012 Megacorp, LLC.
-- This is the best Hive script evar!!
SELECT * FROM massive_table;
...

Query Column Headers --显示字段名

As a final example that pulls together a few things we’ve learned, let’s tell the CLI to
print column headers, which is disabled by default. We can enable this feature by setting
the hiveconf property hive.cli.print.header to true:
hive> set hive.cli.print.header=true;

hive> SELECT * FROM system_logs LIMIT 3;
tstamp severity server message
1335667117.337715 ERROR server1 Hard drive hd1 is 90% full!
1335667117.338012 WARN server1 Slow response from server2.
1335667117.339234 WARN server2 Uh, Dude, I'm kinda busy right now...

分享到：