Hive - CREATE INDEX fails, cause still unknown - George's dev dream port

sunwinner


Blog categories: Hadoop, Hive

Environment: Cloudera Hive 0.10-CDH4

The Hive installation on my machine contains the following table:

hive (human_resources)> describe formatted employees;

col_name data_type comment

# col_name data_type comment

name string None

salary float None

subordinates array<string> None

deductions map<string,float> None

address struct<country:string,city:string,zip:int> None

# Partition Information

# col_name data_type comment

country string None

state string None

# Detailed Table Information

Database: human_resources

Owner: root

CreateTime: Mon Jul 22 23:05:47 CST 2013

LastAccessTime: UNKNOWN

Protect Mode: None

Retention: 0

Location: hdfs://n8.example.com:8020/user/hive/warehouse/human_resources.db/employees

Table Type: MANAGED_TABLE

Table Parameters:

numFiles 1

numPartitions 1

numRows 0

rawDataSize 0

totalSize 784

transient_lastDdlTime 1375942564

# Storage Information

SerDe Library: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

InputFormat: org.apache.hadoop.mapred.TextInputFormat

OutputFormat: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

Compressed: No

Num Buckets: -1

Bucket Columns: []

Sort Columns: []

Storage Desc Params:

serialization.format 1

Time taken: 0.132 seconds

The employees table holds the following data (Hive automatically turns a plain select * into a direct filesystem read, so no MapReduce job runs here):

hive (human_resources)> select * from employees;

name salary subordinates deductions address country state

John Doe 100000.0 ["Mary Smith","Todd Jones"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"country":"1 Michigan Ave.","city":"Chicago","zip":null} US CA

Mary Smith 80000.0 ["Bill King"] {"Federal Taxes":0.2,"State Taxes":0.05,"Insurance":0.1} {"country":"100 Ontario St.","city":"Chicago","zip":null} US CA

Todd Jones 70000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"200 Chicago Ave.","city":"Oak Park","zip":null} US CA

Bill King 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"300 Obscure Dr.","city":"Obscuria","zip":null} US CA

Boss Man 200000.0 ["John Doe","Fred Finance"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"country":"1 Pretentious Drive.","city":"Chicago","zip":null} US CA

Fred Finance 150000.0 ["Stacy Accountant"] {"Federal Taxes":0.3,"State Taxes":0.07,"Insurance":0.05} {"country":"2 Pretentious Drive.","city":"Chicago","zip":null} US CA

Stacy Accountant 60000.0 [] {"Federal Taxes":0.15,"State Taxes":0.03,"Insurance":0.1} {"country":"300 Main St.","city":"Naperville","zip":null} US CA

Time taken: 0.164 seconds
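
The remark above about select * not launching a MapReduce job can be checked from the same CLI session. The following is a minimal sketch, assuming the hive.fetch.task.conversion property is available in this build (as far as I know it was introduced in Hive 0.10; the CDH4 package may behave differently). The salary threshold is an arbitrary illustrative value.

```sql
-- Assumption: hive.fetch.task.conversion exists in this Hive build.
-- Print its current value (or a message saying it is undefined).
SET hive.fetch.task.conversion;

-- A plain SELECT * is usually planned as a single fetch stage
-- that reads the table files directly.
EXPLAIN SELECT * FROM employees;

-- Projecting one column and filtering on a non-partition column
-- usually adds a MapReduce stage to the plan.
EXPLAIN SELECT name FROM employees WHERE salary > 70000;
```

Comparing the two plans should show whether a map-reduce stage appears in the second query but not in the first.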

Now I want to create an index on the employees table with the statement below. It fails with the following message:

hive (human_resources)> CREATE INDEX employees_index

> ON TABLE employees (country, name)

> AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD

> IDXPROPERTIES ('creator' = 'me', 'created_at' = 'some_time')

> IN TABLE employees_index_table

> PARTITIONED BY (country)

> COMMENT 'Employees indexed by country and name.';

FAILED: ParseException line 6:0 missing EOF at 'PARTITIONED' near 'employees_index_table'

If I remove the PARTITIONED BY clause, I get this error instead:

hive (human_resources)> CREATE INDEX employees_index

> ON TABLE employees (country, name)

> AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler' WITH DEFERRED REBUILD

> IDXPROPERTIES ('creator' = 'me', 'created_at' = 'some_time')

> IN TABLE employees_index_table

> COMMENT 'Employees indexed by country and name.';

FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed.

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

This is an example from Programming Hive. The Errata page on the O'Reilly site is:

http://oreilly.com/catalog/errata.csp?isbn=0636920023555

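However, nobody in the Errata mentions that this example fails to run. I do not know the cause. If anyone does, I would appreciate a hint: is this a Hive version issue, or something else?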
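
One possible explanation, offered as a guess rather than a confirmed diagnosis: the CREATE INDEX syntax in the Hive Language Manual does not list a PARTITIONED BY clause, which would be consistent with the ParseException. Also, country is a partition column of employees rather than a regular data column, which may be what the "Check the index columns" message is complaining about. Under those two assumptions, a sketch that indexes only the regular column name would look like this (not tested on this CDH4 installation):

```sql
-- Assumption: only non-partition columns may be listed as index columns,
-- and PARTITIONED BY is not accepted by this Hive version's parser.
CREATE INDEX employees_index
ON TABLE employees (name)
AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
WITH DEFERRED REBUILD
IDXPROPERTIES ('creator' = 'me', 'created_at' = 'some_time')
IN TABLE employees_index_table
COMMENT 'Employees indexed by name.';

-- WITH DEFERRED REBUILD leaves the index empty until it is built explicitly.
ALTER INDEX employees_index ON employees REBUILD;

-- List the indexes defined on the table to confirm the result.
SHOW FORMATTED INDEX ON employees;
```

If this variant succeeds, that would point at the partition column in the index column list, and at the PARTITIONED BY clause, as the two separate triggers for the two errors.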


2013-08-10 00:08