Variables and Properties -- Setting Hive Variables
$ hive --define foo=bar
hive> set foo;
foo=bar
hive> set hivevar:foo;
hivevar:foo=bar
hive> set hivevar:foo=bar2;
hive> set foo;
foo=bar2
hive> create table toss1(i int, ${hivevar:foo} string);
hive> describe toss1;
i int
bar2 string
hive> create table toss2(i2 int, ${foo} string);
hive> describe toss2;
i2 int
bar2 string
hive> drop table toss1;
hive> drop table toss2;
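The hivevar: prefix used above is only one of several namespaces the CLI exposes; hiveconf:, system:, and env: variables can be inspected the same way with set. A brief transcript sketch (the values shown are illustrative for a hypothetical user, not real output):

hive> set env:HOME;
env:HOME=/home/me
hive> set system:user.name;
system:user.name=me

Note that env: variables are read-only from within Hive, while system: properties can be both read and overwritten.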
Hive “One Shot” Commands
A quick and dirty technique is to use the -e option to output the query results to a file.
Adding the -S for silent mode removes the OK and Time taken ... lines, as well as other
inessential output, as in this example:
$ hive -S -e "select * FROM mytable LIMIT 3" > /tmp/myquery
$ cat /tmp/myquery
name1 10
name2 20
name3 30
Finally, here is a useful trick for finding a property name that you can’t quite remember,
without having to scroll through the full output of set. Suppose you can’t remember
the name of the property that specifies the “warehouse” location for managed tables:
$ hive -S -e "set" | grep warehouse
hive.metastore.warehouse.dir=/user/hive/wa
hive.warehouse.subdir.inherit.perms=false
Executing Hive Queries from Files -- Running .hql / .q Script Files
Hive can execute one or more queries that were saved to a file using the -f file argument.
By convention, saved Hive query files use the .q or .hql extension.
$ hive -f /path/to/file/withqueries.hql
If you are already inside the Hive shell you can use the SOURCE command to execute a
script file. Here is an example:
$ cat /path/to/file/withqueries.hql
SELECT x.* FROM src x;
$ hive
hive> source /path/to/file/withqueries.hql;
The details of the xpath function don’t concern us here, but note that we pass string
literals to the xpath function and use FROM src LIMIT 1 to satisfy the required FROM clause
and to limit the output. Substitute src with the name of a table you have already created,
or create a dummy table named src:
CREATE TABLE src(s STRING);
Also the source table must have at least one row of content in it:
$ echo "one row" > /tmp/myfile
$ hive -e "LOAD DATA LOCAL INPATH '/tmp/myfile' INTO TABLE src;"
The .hiverc File -- Configuring Hive Startup Settings
The following shows an example of a typical $HOME/.hiverc file:
ADD JAR /path/to/custom_hive_extensions.jar;
set hive.cli.print.current.db=true;
set hive.exec.mode.local.auto=true;
The first line adds a JAR file to the Hadoop distributed cache. The second line modifies
the CLI prompt to show the current working Hive database, as we described earlier in
“Variables and Properties” on page 31. The last line “encourages” Hive to be more
aggressive about using local-mode execution when possible, even when Hadoop is
running in distributed or pseudo-distributed mode, which speeds up queries for small
data sets.
Setting the following parameter enables Hive’s local mode:
hive> set hive.exec.mode.local.auto=true; (default: false)
A job will actually run in local mode only when it satisfies all of these conditions:
1. The job’s total input size is smaller than hive.exec.mode.local.auto.inputbytes.max (default 128MB).
2. The job’s number of map tasks is smaller than hive.exec.mode.local.auto.tasks.max (default 4).
3. The job’s number of reduce tasks is 0 or 1.
The parameter hive.mapred.local.mem (default 0) controls the maximum memory available to the child JVM.
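The thresholds above can be tuned per session with set. A minimal sketch, assuming you want local mode to apply to somewhat larger jobs (the threshold values here are illustrative, not recommendations):

hive> set hive.exec.mode.local.auto=true;
hive> set hive.exec.mode.local.auto.inputbytes.max=268435456;
hive> set hive.exec.mode.local.auto.tasks.max=8;

These lines can also be placed in $HOME/.hiverc so they apply to every session.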
Autocomplete -- Tab Completion
If you start typing and hit the Tab key, the CLI will autocomplete possible keywords
and function names. For example, if you type SELE and then the Tab key, the CLI will
complete the word SELECT.
If you type the Tab key at the prompt, you’ll get this reply:
hive>
Display all 407 possibilities? (y or n)
Command History
You can use the up and down arrow keys to scroll through previous commands. Actually,
each previous line of input is shown separately; the CLI does not combine multiline
commands and queries into a single history entry. Hive saves the last 10,000 lines
into a file, $HOME/.hivehistory.
Shell Execution -- Running Shell Commands
You don’t need to leave the hive CLI to run simple bash shell commands. Simply
type ! followed by the command and terminate the line with a semicolon (;):
hive> ! /bin/echo "what up dog";
"what up dog"
hive> ! pwd;
/home/me/hiveplay
Hadoop dfs Commands from Inside Hive -- Running dfs Commands
You can run the hadoop dfs ... commands from within the hive CLI; just drop the
hadoop word from the command and add the semicolon at the end:
hive> dfs -ls / ;
Found 3 items
drwxr-xr-x - root supergroup 0 2011-08-17 16:27 /etl
drwxr-xr-x - edward supergroup 0 2012-01-18 15:51 /flag
drwxrwxr-x - hadoop supergroup 0 2010-02-03 17:50 /users
This method of accessing hadoop commands is actually more efficient than using the
hadoop dfs ... equivalent at the bash shell, because the latter starts up a new JVM
instance each time, whereas Hive just runs the same code in its current process.
Comments in Hive Scripts
As of Hive v0.8.0, you can embed lines of comments that start with the string --, for
example:
-- Copyright (c) 2012 Megacorp, LLC.
-- This is the best Hive script evar!!
SELECT * FROM massive_table;
...
Query Column Headers -- Displaying Column Names
As a final example that pulls together a few things we’ve learned, let’s tell the CLI to
print column headers, which is disabled by default. We can enable this feature by setting
the hiveconf property hive.cli.print.header to true:
hive> set hive.cli.print.header=true;
hive> SELECT * FROM system_logs LIMIT 3;
tstamp severity server message
1335667117.337715 ERROR server1 Hard drive hd1 is 90% full!
1335667117.338012 WARN server1 Slow response from server2.
1335667117.339234 WARN server2 Uh, Dude, I'm kinda busy right now...
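Combined with the -S one-shot technique shown earlier, header printing makes it easy to produce a tab-separated file that begins with a header row. A sketch using the system_logs table from the example above (the output path is illustrative):

$ hive -S -e "set hive.cli.print.header=true; SELECT * FROM system_logs LIMIT 3" > /tmp/logs.tsv

The redirected file then starts with the tstamp severity server message header line, followed by the data rows.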