Hive's test cases

The input data lives under src/data.
The test queries live under src/ql/src/test/queries: clientpositive holds cases that should succeed, clientnegative holds cases that should fail (return non-zero).
The expected outputs live under src/ql/src/test/results. For example, the expected output for src/ql/src/test/queries/clientpositive/case_sensitivity.q is src/ql/src/test/results/clientpositive/case_sensitivity.q.out.
A test run executes case_sensitivity.q, writes its output to build/ql/test/logs/clientpositive/case_sensitivity.q.out, and then diffs that file against src/ql/src/test/results/clientpositive/case_sensitivity.q.out: if the two files are identical the test passes, otherwise it fails.
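In other words, a .q test passes exactly when its fresh output matches the checked-in .q.out file. Below is a minimal sketch of that comparison idea in plain Java; it is not the actual QTestUtil code (which also ignores volatile lines, as we will see), and the paths are just the ones named above:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    public class QFileCheck {
      public static void main(String[] args) throws IOException {
        // Fresh output produced by running the .q file, and the checked-in golden file.
        List<String> actual = Files.readAllLines(
            Paths.get("build/ql/test/logs/clientpositive/case_sensitivity.q.out"));
        List<String> expected = Files.readAllLines(
            Paths.get("src/ql/src/test/results/clientpositive/case_sensitivity.q.out"));
        // The real harness shells out to diff with many -I ignore patterns;
        // an exact line-by-line match stands in for that here.
        System.out.println(actual.equals(expected) ? "PASS" : "FAIL");
      }
    }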


For example, a simple test script of one's own might contain:

CREATE TABLE sales (name STRING, id INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

CREATE TABLE things (id INT, name STRING) PARTITIONED BY (ds STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';


load data local inpath 'examples/files/sales.txt' INTO TABLE sales;
load data local inpath 'examples/files/things.txt' INTO TABLE things partition(ds='2011-10-23');
load data local inpath 'examples/files/things2.txt' INTO TABLE things partition(ds='2011-10-24');

SELECT name,id FROM sales ORDER BY name ASC, id ASC;
SELECT id,name FROM things ORDER BY id ASC, name ASC;
SELECT name,id FROM sales LEFT SEMI JOIN things ON (sales.id = things.id) ORDER BY name ASC, id ASC;


Before the tests run, the harness creates a number of tables and loads data into them.
For example, the content of case_sensitivity.q is:
tianzhao@tianzhao-VirtualBox:~/hive/hive-1.1.2/src/ql/src/test/queries/clientpositive$ cat case_sensitivity.q
CREATE TABLE DEST1(Key INT, VALUE STRING) STORED AS TEXTFILE;

EXPLAIN
FROM SRC_THRIFT
INSERT OVERWRITE TABLE dest1 SELECT src_Thrift.LINT[1], src_thrift.lintstring[0].MYSTRING where src_thrift.liNT[0] > 0;

FROM SRC_THRIFT
INSERT OVERWRITE TABLE dest1 SELECT src_Thrift.LINT[1], src_thrift.lintstring[0].MYSTRING where src_thrift.liNT[0] > 0;

SELECT DEST1.* FROM Dest1;
Note that the HiveQL above never creates the SRC_THRIFT table, yet it can use it, because the table is created before the statements in this file are executed.
That setup code lives in org.apache.hadoop.hive.ql.QTestUtil.java, in methods such as init and createSources.
It can also be seen in the terminal output of ant test -Dtestcase=TestCliDriver -Dqfile=case_sensitivity.q -Dtest.silent=false:
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='11')
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcpart
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Loading data to table default.srcpart partition (ds=2008-04-08, hr=11)
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='11')
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcpart
    [junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=11
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='12')
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcpart
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Loading data to table default.srcpart partition (ds=2008-04-08, hr=12)
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-08',hr='12')
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcpart
    [junit] POSTHOOK: Output: default@srcpart@ds=2008-04-08/hr=12
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-09',hr='11')
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcpart
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Loading data to table default.srcpart partition (ds=2008-04-09, hr=11)
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-09',hr='11')
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcpart
    [junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=11
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-09',hr='12')
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcpart
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Loading data to table default.srcpart partition (ds=2008-04-09, hr=12)
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' OVERWRITE INTO TABLE srcpart PARTITION (ds='2008-04-09',hr='12')
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcpart
    [junit] POSTHOOK: Output: default@srcpart@ds=2008-04-09/hr=12
    [junit] OK
    [junit] PREHOOK: query: CREATE TABLE srcbucket(key int, value string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE
    [junit] PREHOOK: type: CREATETABLE
    [junit] POSTHOOK: query: CREATE TABLE srcbucket(key int, value string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE
    [junit] POSTHOOK: type: CREATETABLE
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket0.txt' INTO TABLE srcbucket
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket0.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket0.txt
    [junit] Loading data to table default.srcbucket
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket0.txt' INTO TABLE srcbucket
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket1.txt
    [junit] Loading data to table default.srcbucket
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket1.txt' INTO TABLE srcbucket
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket
    [junit] OK
    [junit] PREHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] PREHOOK: type: CREATETABLE
    [junit] POSTHOOK: query: CREATE TABLE srcbucket2(key int, value string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE
    [junit] POSTHOOK: type: CREATETABLE
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket20.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket2
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket20.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket20.txt
    [junit] Loading data to table default.srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket20.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket21.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket2
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket21.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket21.txt
    [junit] Loading data to table default.srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket21.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket22.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket2
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket22.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket22.txt
    [junit] Loading data to table default.srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket22.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket23.txt' INTO TABLE srcbucket2
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@srcbucket2
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket23.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/srcbucket23.txt
    [junit] Loading data to table default.srcbucket2
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/srcbucket23.txt' INTO TABLE srcbucket2
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@srcbucket2
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' INTO TABLE src
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@src
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.txt
    [junit] Loading data to table default.src
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.txt' INTO TABLE src
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv3.txt' INTO TABLE src1
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@src1
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv3.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv3.txt
    [junit] Loading data to table default.src1
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv3.txt' INTO TABLE src1
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src1
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.seq' INTO TABLE src_sequencefile
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@src_sequencefile
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/kv1.seq
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/kv1.seq
    [junit] Loading data to table default.src_sequencefile
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/kv1.seq' INTO TABLE src_sequencefile
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src_sequencefile
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/complex.seq' INTO TABLE src_thrift
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@src_thrift
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/complex.seq
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/complex.seq
    [junit] Loading data to table default.src_thrift
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/complex.seq' INTO TABLE src_thrift
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src_thrift
    [junit] OK
    [junit] PREHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/json.txt' INTO TABLE src_json
    [junit] PREHOOK: type: LOAD
    [junit] PREHOOK: Output: default@src_json
    [junit] Copying data from file:/home/tianzhao/apache/hive-trunk/data/files/json.txt
    [junit] Copying file: file:/home/tianzhao/apache/hive-trunk/data/files/json.txt
    [junit] Loading data to table default.src_json
    [junit] POSTHOOK: query: LOAD DATA LOCAL INPATH '/home/tianzhao/apache/hive-trunk/data/files/json.txt' INTO TABLE src_json
    [junit] POSTHOOK: type: LOAD
    [junit] POSTHOOK: Output: default@src_json
    [junit] OK
The output above shows the tables being created and the data being loaded.

The output for the case_sensitivity.q test itself is:
    [junit] Running org.apache.hadoop.hive.cli.TestCliDriver
    [junit] Begin query: case_sensitivity.q
    [junit] Deleted file:/home/tianzhao/apache/hive-trunk/build/ql/test/data/warehouse/dest1
    [junit] diff -a -I file: -I pfile: -I hdfs: -I /tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I [Oo]wner -I CreateTime -I LastAccessTime -I Location -I transient_lastDdlTime -I last_modified_ -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I LOCK_QUERYID: -I grantTime -I [.][.][.] [0-9]* more -I USING 'java -cp /home/tianzhao/apache/hive-trunk/build/ql/test/logs/clientpositive/case_sensitivity.q.out /home/tianzhao/apache/hive-trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out
    [junit] Done query: case_sensitivity.q
    [junit] Cleaning up TestCliDriver
The generated result lands in build/ql/test/logs/clientpositive/case_sensitivity.q.out, and the UT passes if diff finds it identical to ql/src/test/results/clientpositive/case_sensitivity.q.out. (The long list of -I options above makes diff ignore lines matching those patterns: paths, timestamps, owners and other values that legitimately vary between runs.)



Now let's create the src_thrift table by hand, load its data, and run the HiveQL from case_sensitivity.q ourselves.
The code in org.apache.hadoop.hive.ql.QTestUtil.createSources() that creates the src_thrift table is:
    // A SequenceFile-backed table whose rows are deserialized by ThriftDeserializer;
    // the row layout comes from the Complex Thrift class, so no columns are declared.
    Table srcThrift = new Table(db.getCurrentDatabase(), "src_thrift");
    srcThrift.setInputFormatClass(SequenceFileInputFormat.class.getName());
    srcThrift.setOutputFormatClass(SequenceFileOutputFormat.class.getName());
    srcThrift.setSerializationLib(ThriftDeserializer.class.getName());
    srcThrift.setSerdeParam(Constants.SERIALIZATION_CLASS, Complex.class.getName());
    srcThrift.setSerdeParam(Constants.SERIALIZATION_FORMAT, TBinaryProtocol.class.getName());
    db.createTable(srcThrift);
The storage format is SequenceFile, the serde is ThriftDeserializer, and the serde carries two properties, SERIALIZATION_CLASS and SERIALIZATION_FORMAT. Creating a table normally requires its columns, but none are given here: they are already determined by (Constants.SERIALIZATION_CLASS, Complex.class.getName()).
So look at the definition of org.apache.hadoop.hive.serde2.thrift.test.Complex:
  public int aint;
  public static final int AINT = 1;
  public String aString;
  public static final int ASTRING = 2;
  public List<Integer> lint;
  public static final int LINT = 3;
  public List<String> lString;
  public static final int LSTRING = 4;
  public List<IntString> lintString;
  public static final int LINTSTRING = 5;
  public Map<String,String> mStringString;
  public static final int MSTRINGSTRING = 6;
And the definition of IntString:
  public int myint;
  public static final int MYINT = 1;
  public String myString;
  public static final int MYSTRING = 2;
  public int underscore_int;
  public static final int UNDERSCORE_INT = 3;
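The public instance fields above (the static ints are just Thrift field ids) are what the serde ends up exposing as columns and struct members. As a quick, hedged check — assuming the Hive serde test classes are on the classpath — plain reflection lists them:

    import java.lang.reflect.Field;
    import java.lang.reflect.Modifier;

    public class ShowComplexSchema {
      public static void main(String[] args) throws Exception {
        Class<?> c = Class.forName("org.apache.hadoop.hive.serde2.thrift.test.Complex");
        for (Field f : c.getFields()) {
          // Skip the static AINT/ASTRING/... field-id constants; the instance
          // fields are what become the table's columns.
          if (!Modifier.isStatic(f.getModifiers())) {
            System.out.println(f.getName() + " : " + f.getGenericType());
          }
        }
      }
    }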
From these we can reconstruct the columns of src_thrift. The CREATE TABLE statement is:
hive> create table src_thrift(aint int, aString string, lint array<int>, lString array<string>, lintString array<struct<myint:int, myString:string, underscore_int:int>>, mStringString map<string, string>)
    > row format serde 'org.apache.hadoop.hive.serde2.thrift.ThriftDeserializer' with serdeproperties("serialization.class"="org.apache.hadoop.hive.serde2.thrift.test.Complex", "serialization.format"="org.apache.thrift.protocol.TBinaryProtocol")
    > stored as sequencefile;
OK
Time taken: 0.462 seconds
Loading the data:
hive> load data local inpath 'src/data/files/complex.seq' into table src_thrift;
Copying data from file:/home/tianzhao/hive/hive-1.1.2/src/data/files/complex.seq
Loading data to table src_thrift
OK
Time taken: 0.286 seconds
Inspecting the data:
hive> select * from src_thrift limit 2;
OK
1712634731 record_0 [0,0,0] ["0","0","0"] [{"myint":0,"mystring":"0","underscore_int":0}] {"key_0":"value_0"}
465985200 record_1 [1,2,3] ["10","100","1000"] [{"myint":1,"mystring":"1","underscore_int":1}] {"key_1":"value_1"}
Time taken: 0.34 seconds
And describing the table:
hive> desc src_thrift;
OK
aint int from deserializer
astring string from deserializer
lint array<int> from deserializer
lstring array<string> from deserializer
lintstring array<org.apache.hadoop.hive.serde2.thrift.test.IntString> from deserializer
mstringstring map<string,string> from deserializer
Time taken: 0.085 seconds
Running one of the statements from case_sensitivity.q (output excerpted):
hive> from src_thrift SELECT src_Thrift.LINT[1], src_thrift.lintstring[0].MYSTRING where src_thrift.liNT[0] > 0;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201105281127_0001, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201105281127_0001
Kill Command = /home/tianzhao/hive/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=localhost:54311 -kill job_201105281127_0001
2011-05-28 12:04:52,869 Stage-1 map = 0%,  reduce = 0%
2011-05-28 12:04:55,921 Stage-1 map = 100%,  reduce = 0%
2011-05-28 12:04:58,962 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201105281127_0001
OK
2 1
4 8
6 27
8 64
10 125
12 216
14 343
16 512
18 729
Time taken: 12.147 seconds


The src table
In org.apache.hadoop.hive.ql.QTestUtil.createSources():
    LinkedList<String> cols = new LinkedList<String>();
    cols.add("key");
    cols.add("value");

    for (String tname : new String[] {"src", "src1"}) {
      db.createTable(tname, cols, null, TextInputFormat.class,
          IgnoreKeyTextOutputFormat.class);
    }
Both src and src1 have the same two columns, i.e.:
create table src(key string, value string);
Loading the data:
    // load the input data into the src table
    fpath = new Path(testFiles, "kv1.txt");
    newfpath = new Path(tmppath, "kv1.txt");
    fs.copyFromLocalFile(false, true, fpath, newfpath);
    // db.loadTable(newfpath, "src", false);
    runLoadCmd("LOAD DATA INPATH '" + newfpath.toString() + "' INTO TABLE src");

Done by hand, the equivalent is:
load data local inpath 'src/data/files/kv1.txt' into table src;




With all of this in place, we can manually run the statements from any of the UTs under src/ql/src/test/queries.






Files in the directory hive-0.7.0\src\ql\src\test\queries\clientpositive:

(1) add_part_exist.q — creates a partitioned table, adds partitions, and shows them:

CREATE TABLE add_part_test (key STRING, value STRING) PARTITIONED BY (ds STRING);
SHOW PARTITIONS add_part_test;

ALTER TABLE add_part_test ADD PARTITION (ds='2010-01-01');
SHOW PARTITIONS add_part_test;

ALTER TABLE add_part_test ADD IF NOT EXISTS PARTITION (ds='2010-01-01');
SHOW PARTITIONS add_part_test;

ALTER TABLE add_part_test ADD IF NOT EXISTS PARTITION (ds='2010-01-02');
SHOW PARTITIONS add_part_test;

ALTER TABLE add_part_test ADD IF NOT EXISTS PARTITION (ds='2010-01-01') PARTITION (ds='2010-01-02') PARTITION (ds='2010-01-03');
SHOW PARTITIONS add_part_test;

DROP TABLE add_part_test;
SHOW TABLES;

desc extended add_part_test  partition(ds='2010-01-03');

Note that in SHOW TABLES and SHOW PARTITIONS, TABLES and PARTITIONS take the plural form (with a trailing S).

Some confusion around '2010-01-02': an unquoted 2010-01-02 fails to parse because the dashes are read as minus signs, while an unquoted plain integer such as 2011102 is accepted:
hive> alter table add_part_test add if not exists partition(ds=2010-01-02);
FAILED: Parse Error: line 1:61 mismatched input '-' expecting ) in add partition statement

hive> alter table add_part_test add if not exists partition(ds='2010-01-02');
OK
Time taken: 0.294 seconds
hive> alter table add_part_test add if not exists partition(ds=2011102);    
OK
Time taken: 0.178 seconds
hive> show partitions add_part_test;
OK
ds=2010-01-01
ds=2010-01-02
ds=2011102
Time taken: 0.057 seconds


(2) alter1.q — modifies a table's properties, its serde and serde properties, and its column information.

(3) create_default_prop.q — creates a table
create_1.q — table creation operations
create_* — the rest of the table-creation cases

DESCRIBE table3;
DESCRIBE EXTENDED table3;
DESC table3;
DESC EXTENDED table3;
These are all equivalent; keywords are case-insensitive, and DESC is a synonym for DESCRIBE.
DESC is supported in version 0.7 but not in 0.5.

Two group-by tuning settings worth noting: hive.map.aggr (map-side partial aggregation) and hive.groupby.skewindata (handling skewed group-by data).
https://issues.apache.org/jira/browse/HIVE-223
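As a hedged sketch (not from QTestUtil), both knobs are ordinary Hive configuration properties, so besides `set` in the CLI they can be set programmatically through HiveConf, which is just a Hadoop Configuration:

    import org.apache.hadoop.hive.conf.HiveConf;

    public class GroupBySettings {
      public static void main(String[] args) {
        // Same effect as `set hive.map.aggr=true;` and
        // `set hive.groupby.skewindata=true;` in the Hive CLI.
        HiveConf conf = new HiveConf();
        conf.set("hive.map.aggr", "true");           // map-side partial aggregation
        conf.set("hive.groupby.skewindata", "true"); // extra stage to spread skewed keys
        System.out.println(conf.get("hive.groupby.skewindata"));
      }
    }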