Driver:
public int compile(String command) {
// ... (excerpt) every compile creates a fresh Context for the query
ctx = new Context(conf);
// ...
}
public Context(Configuration conf) throws IOException {
this(conf, generateExecutionId());
}
/**
* Generate a unique executionId. An executionId, together with user name and
* the configuration, will determine the temporary locations of all intermediate
* files.
*
* In the future, users can use the executionId to resume a query.
*/
public static String generateExecutionId() {
Random rand = new Random();
SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd_HH-mm-ss_SSS");
String executionId = "hive_" + format.format(new Date()) + "_"
+ Math.abs(rand.nextLong());
return executionId;
}
Example executionId: hive_2011-08-21_00-02-22_445_7799135143086468923
Format: hive_yyyy-MM-dd_HH-mm-ss_SSS_<random long>
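The id format can be checked with a small standalone sketch. It reuses the same date pattern and random suffix as generateExecutionId() above; the class name ExecutionIdDemo and the ID_PATTERN regex are hypothetical names introduced only for illustration and are not part of Hive.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Random;
import java.util.regex.Pattern;

public class ExecutionIdDemo {
    // Hypothetical regex describing the observed shape:
    // hive_<yyyy-MM-dd_HH-mm-ss_SSS>_<non-negative long>
    static final Pattern ID_PATTERN = Pattern.compile(
        "hive_\\d{4}-\\d{2}-\\d{2}_\\d{2}-\\d{2}-\\d{2}_\\d{3}_\\d+");

    // Same logic as the Hive method quoted above, copied so the sketch runs on its own.
    static String generateExecutionId() {
        Random rand = new Random();
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd_HH-mm-ss_SSS");
        return "hive_" + format.format(new Date()) + "_" + Math.abs(rand.nextLong());
    }

    public static void main(String[] args) {
        String id = generateExecutionId();
        System.out.println(id);
        System.out.println("matches expected shape: " + ID_PATTERN.matcher(id).matches());
    }
}
```

Because the timestamp has millisecond resolution and the suffix is a random long, two queries started at the same moment are still very unlikely to share a scratch directory.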
/**
* Create a Context with a given executionId. ExecutionId, together with
* user name and conf, will determine the temporary directory locations.
*/
public Context(Configuration conf, String executionId) throws IOException {
this.conf = conf;
this.executionId = executionId; //hive_2011-08-21_00-02-22_445_7799135143086468923
// non-local tmp location is configurable. however it is the same across
// all external file systems
nonLocalScratchPath =
new Path(HiveConf.getVar(conf, HiveConf.ConfVars.SCRATCHDIR),
executionId); // /tmp/hive-tianzhao/hive_2011-08-21_00-02-22_445_7799135143086468923
// HiveConf SCRATCHDIR("hive.exec.scratchdir", "/tmp/" + System.getProperty("user.name") + "/hive"),
The default configuration in hive-default.xml is:
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
<description>Scratch space for Hive jobs</description>
</property>
// local tmp location is not configurable for now
localScratchDir = System.getProperty("java.io.tmpdir")
+ Path.SEPARATOR + System.getProperty("user.name") + Path.SEPARATOR
+ executionId; // /tmp/tianzhao/hive_2011-08-21_00-02-22_445_7799135143086468923
}
// Local temporary directory: System.getProperty("java.io.tmpdir") = /tmp
// System.getProperty("user.name") = tianzhao
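A minimal sketch of how the two scratch locations in the constructor are composed, using plain strings instead of Hadoop's Path so that it runs without any Hive or Hadoop jars. The class ScratchPathDemo and its two helper methods are hypothetical; the input values are the ones annotated in the comments above.

```java
public class ScratchPathDemo {
    // Non-local scratch path: <hive.exec.scratchdir>/<executionId>
    static String nonLocalScratchPath(String scratchDir, String executionId) {
        return scratchDir + "/" + executionId;
    }

    // Local scratch dir: <java.io.tmpdir>/<user.name>/<executionId>
    static String localScratchDir(String tmpDir, String userName, String executionId) {
        return tmpDir + "/" + userName + "/" + executionId;
    }

    public static void main(String[] args) {
        String id = "hive_2011-08-21_00-02-22_445_7799135143086468923";
        // Values taken from the annotations in the post, not read from a real configuration.
        System.out.println(nonLocalScratchPath("/tmp/hive-tianzhao", id));
        System.out.println(localScratchDir("/tmp", "tianzhao", id));
    }
}
```

Only the non-local location depends on hive.exec.scratchdir; the local one is derived from JVM system properties, which is why the source comment calls it not configurable.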
/**
* Get a tmp path on local host to store intermediate data.
*
* @return next available tmp path on local fs
*/
public String getLocalTmpFileURI() {
return getLocalScratchDir(true) + Path.SEPARATOR + LOCAL_PREFIX +
nextPathId(); // file:/tmp/tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461/-local-10003
}
private String getScratchDir(String scheme, String authority,
boolean mkdir, String scratchDir) {
String fileSystem = scheme + ":" + authority; // file:null
String dir = fsScratchDirs.get(fileSystem); //
if (dir == null) {
Path dirPath = new Path(scheme, authority, scratchDir);
if (mkdir) { // true
try {
FileSystem fs = dirPath.getFileSystem(conf);
dirPath = new Path(fs.makeQualified(dirPath).toString());
if (!fs.mkdirs(dirPath)) {
throw new RuntimeException("Cannot make directory: "
+ dirPath.toString());
}
} catch (IOException e) {
throw new RuntimeException (e);
}
}
dir = dirPath.toString(); // file:/tmp/tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
fsScratchDirs.put(fileSystem, dir); // {file:null=file:/tmp/tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461, hdfs:localhost:54310=hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461}
}
return dir; //file:/tmp/tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
}
/**
* Get a path to store map-reduce intermediate data in.
*
* @return next available path for map-red intermediate data
*/
public String getMRTmpFileURI() {
return getMRScratchDir() + Path.SEPARATOR + MR_PREFIX +
nextPathId(); // hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461/-mr-10004
}
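The annotated results (-local-10003, then -mr-10004) suggest that local and map-reduce temporary paths draw their numeric suffix from one shared counter. The sketch below illustrates that idea only. The prefix values are inferred from the annotated paths, and the starting value 10000 and the class TmpPathIdDemo are assumptions; the excerpt does not show nextPathId() itself.

```java
public class TmpPathIdDemo {
    // Prefix values inferred from the annotated paths in the post.
    static final String LOCAL_PREFIX = "-local-";
    static final String MR_PREFIX = "-mr-";

    // Assumed starting value; the excerpt does not show the real initial value.
    private int pathId = 10000;

    // One counter shared by both kinds of temporary path.
    String nextPathId() {
        return Integer.toString(pathId++);
    }

    String localTmpFileURI(String localScratchDir) {
        return localScratchDir + "/" + LOCAL_PREFIX + nextPathId();
    }

    String mrTmpFileURI(String mrScratchDir) {
        return mrScratchDir + "/" + MR_PREFIX + nextPathId();
    }

    public static void main(String[] args) {
        TmpPathIdDemo ctx = new TmpPathIdDemo();
        System.out.println(ctx.localTmpFileURI("file:/tmp/tianzhao/hive_x"));
        System.out.println(ctx.mrTmpFileURI("hdfs://localhost:54310/tmp/hive-tianzhao/hive_x"));
    }
}
```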
public String getMRScratchDir() {
// if we are executing entirely on the client side - then
// just (re)use the local scratch directory
if(isLocalOnlyExecutionMode()) {
return getLocalScratchDir(!explain);
}
try {
// nonLocalScratchPath=/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
Path dir = FileUtils.makeQualified(nonLocalScratchPath, conf); // hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
URI uri = dir.toUri();// hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
return getScratchDir(uri.getScheme(), uri.getAuthority(),
!explain, uri.getPath());
} catch (IOException e) {
throw new RuntimeException(e);
} catch (IllegalArgumentException e) {
throw new RuntimeException("Error while making MR scratch "
+ "directory - check filesystem config (" + e.getCause() + ")", e);
}
}
/**
* Get a tmp directory on specified URI
*
* @param scheme Scheme of the target FS
* @param authority Authority of the target FS
* @param mkdir create the directory if true
* @param scratchDir path of tmp directory
*/
private String getScratchDir(String scheme, String authority,
boolean mkdir, String scratchDir) {
String fileSystem = scheme + ":" + authority; //hdfs:localhost:54310
String dir = fsScratchDirs.get(fileSystem); // hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
if (dir == null) {
Path dirPath = new Path(scheme, authority, scratchDir);
if (mkdir) {
try {
FileSystem fs = dirPath.getFileSystem(conf);
dirPath = new Path(fs.makeQualified(dirPath).toString());
if (!fs.mkdirs(dirPath)) {
throw new RuntimeException("Cannot make directory: "
+ dirPath.toString());
}
} catch (IOException e) {
throw new RuntimeException (e);
}
}
dir = dirPath.toString();
fsScratchDirs.put(fileSystem, dir);
}
return dir; //hdfs://localhost:54310/tmp/hive-tianzhao/hive_2011-09-06_03-39-09_416_8623373648168603461
}
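getScratchDir() caches one scratch directory per file system, keyed by scheme + ":" + authority, so the directory is created at most once per Context. A minimal sketch of that caching behaviour follows, with the Hadoop FileSystem calls replaced by simple string building. ScratchDirCacheDemo and its creations counter are hypothetical and exist only to make the effect visible.

```java
import java.util.HashMap;
import java.util.Map;

public class ScratchDirCacheDemo {
    // Maps "scheme:authority" to the qualified scratch directory, as fsScratchDirs does.
    final Map<String, String> fsScratchDirs = new HashMap<>();
    // Counts how often a directory would have been created (stand-in for fs.mkdirs).
    int creations = 0;

    String getScratchDir(String scheme, String authority, String scratchDir) {
        // String concatenation turns a null authority into "null", hence the key "file:null".
        String fileSystem = scheme + ":" + authority;
        String dir = fsScratchDirs.get(fileSystem);
        if (dir == null) {
            creations++;
            // Simplified qualification; the real code uses Path and FileSystem.makeQualified.
            dir = scheme + ":" + (authority == null ? "" : "//" + authority) + scratchDir;
            fsScratchDirs.put(fileSystem, dir);
        }
        return dir;
    }

    public static void main(String[] args) {
        ScratchDirCacheDemo ctx = new ScratchDirCacheDemo();
        System.out.println(ctx.getScratchDir("file", null, "/tmp/tianzhao/hive_x"));
        System.out.println(ctx.getScratchDir("hdfs", "localhost:54310", "/tmp/hive-tianzhao/hive_x"));
        // A second lookup for the same file system hits the cache.
        ctx.getScratchDir("hdfs", "localhost:54310", "/tmp/hive-tianzhao/hive_x");
        System.out.println("directories created: " + ctx.creations);
    }
}
```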