[hadoop2.7.1] I/O: A Step-by-Step Walkthrough of Text (Examples)
The previous post showed that Hadoop's Text class is quite similar to Java's String class, and many of their methods resemble each other. Because the two serve different purposes and use different Unicode encodings (Text stores UTF-8, while String uses UTF-16), there are still significant differences. The examples below exercise Text's methods and, above all, compare Text with java.lang.String.
1. First, let's look at the methods that generate the test Strings:
The source is as follows:
// Generate a Java string
private static String getTestString(int len) throws Exception {
  StringBuilder buffer = new StringBuilder();
  int length = (len==RAND_LEN) ? RANDOM.nextInt(1000) : len;
  while (buffer.length()<length) {
    int codePoint = RANDOM.nextInt(Character.MAX_CODE_POINT);
    char tmpStr[] = new char[2];
    if (Character.isDefined(codePoint)) {
      // skip supplementary code points and unpaired surrogates
      if (codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT
          && !Character.isHighSurrogate((char)codePoint)
          && !Character.isLowSurrogate((char)codePoint)) {
        Character.toChars(codePoint, tmpStr, 0);
        buffer.append(tmpStr);
      }
    }
  }
  return buffer.toString();
}

// By default, generate a String of random length
public static String getTestString() throws Exception {
  return getTestString(RAND_LEN);
}

// Long string (longer than Short.MAX_VALUE chars)
public static String getLongString() throws Exception {
  String str = getTestString();
  int length = Short.MAX_VALUE+str.length();
  StringBuilder buffer = new StringBuilder();
  while(buffer.length()<length)
    buffer.append(str);
  return buffer.toString();
}
2. Testing the encoding methods:
public void testCoding() throws Exception {
  String before = "Bad \t encoding \t testcase";
  Text text = new Text(before);
  String after = text.toString();
  assertTrue(before.equals(after));

  for (int i = 0; i < NUM_ITERATIONS; i++) {
    // generate a random string
    if (i == 0)
      before = getLongString();
    else
      before = getTestString();

    // test string to utf8
    ByteBuffer bb = Text.encode(before);

    byte[] utf8Text = bb.array();
    // Note: the UTF-8 charset is requested explicitly here;
    // Java's native in-memory string encoding is UTF-16
    byte[] utf8Java = before.getBytes("UTF-8");
    assertEquals(0, WritableComparator.compareBytes(
        utf8Text, 0, bb.limit(),
        utf8Java, 0, utf8Java.length));

    // test utf8 to string
    after = Text.decode(utf8Java);
    assertTrue(before.equals(after));
  }
}
Note: for arrays, the default equals method (inherited from Object) simply returns the result of ==, so it tests whether two references point to the same array, not whether the arrays have equal contents. That is why the test compares the two byte arrays with WritableComparator.compareBytes. String.equals, by contrast, does compare contents.
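The point about array equality can be checked with the JDK alone. The following is a minimal sketch, not part of the Hadoop test suite; the class name ArrayEqualityDemo and the helper utf8 are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class (not from Hadoop): array identity vs. array content.
final class ArrayEqualityDemo {
    /** Encodes a string as UTF-8, the encoding that Text stores internally. */
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] a = utf8("\u6c49"); // the character "han", one UTF-16 char
        byte[] b = utf8("\u6c49");

        // Object.equals on arrays is an identity check, same as a == b.
        System.out.println(a.equals(b));         // false
        // Arrays.equals compares length and every element.
        System.out.println(Arrays.equals(a, b)); // true
        // One char in a String, three bytes in UTF-8.
        System.out.println("\u6c49".length() + " char, " + a.length + " bytes");
    }
}
```

The same string has length 1 as a String but occupies 3 bytes once encoded, which is why byte-level comparisons and lengths in Text do not line up with char indexes.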
3. Testing Text's input/output methods:
The main ones are:
Read:
Return type | Method | Description
void | readFields(DataInput in) | Deserialize.
void | readFields(DataInput in, int maxLength) |
static String | readString(DataInput in) | Read a UTF8 encoded string from in.
static String | readString(DataInput in, int maxLength) | Read a UTF8 encoded string with a maximum size.
void | readWithKnownLength(DataInput in, int len) | Read a Text object whose length is already known.
Write:
Return type | Method | Description
void | write(DataOutput out) | Serialize: write this object to out; the length uses zero-compressed encoding.
void | write(DataOutput out, int maxLength) |
static int | writeString(DataOutput out, String s) | Write a UTF8 encoded string to out.
static int | writeString(DataOutput out, String s, int maxLength) | Write a UTF8 encoded string with a maximum size to out.
Test source:
public void testIO() throws Exception {
  DataOutputBuffer out = new DataOutputBuffer();
  DataInputBuffer in = new DataInputBuffer();

  for (int i = 0; i < NUM_ITERATIONS; i++) {
    // generate a random string
    String before;
    if (i == 0)
      before = getLongString();
    else
      before = getTestString();

    // write it
    out.reset();
    Text.writeString(out, before);

    // test that it reads correctly
    in.reset(out.getData(), out.getLength());
    String after = Text.readString(in);
    assertTrue(before.equals(after));

    // Test compatibility with Java's other decoder
    int strLenSize = WritableUtils.getVIntSize(Text.utf8Length(before));
    String after2 = new String(out.getData(), strLenSize,
        out.getLength()-strLenSize, "UTF-8");
    assertTrue(before.equals(after2));
  }
}

public void doTestLimitedIO(String str, int len) throws IOException {
  DataOutputBuffer out = new DataOutputBuffer();
  DataInputBuffer in = new DataInputBuffer();

  out.reset();
  try {
    Text.writeString(out, str, len);
    fail("expected writeString to fail when told to write a string " +
        "that was too long! The string was '" + str + "'");
  } catch (IOException e) {
  }
  Text.writeString(out, str, len + 1);

  // test that it reads correctly
  in.reset(out.getData(), out.getLength());
  in.mark(len);
  String after;
  try {
    after = Text.readString(in, len);
    fail("expected readString to fail when told to read a string " +
        "that was too long! The string was '" + str + "'");
  } catch (IOException e) {
  }
  in.reset();
  after = Text.readString(in, len + 1);
  assertTrue(str.equals(after));
}

public void testLimitedIO() throws Exception {
  doTestLimitedIO("汉", 2); // note: the character "汉" takes 3 bytes in UTF-8
  doTestLimitedIO("abcd", 3);
  doTestLimitedIO("foo bar baz", 10);
  doTestLimitedIO("1", 0);
}
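The length limit in doTestLimitedIO is measured in UTF-8 bytes, not chars. The sketch below imitates that behaviour with plain JDK streams. It is an illustration only: the class LimitedStringIO and its methods are hypothetical, and it uses a fixed 4-byte int as the length prefix, whereas Text writes a variable-length (zero-compressed) integer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: length-prefixed UTF-8 strings with a byte limit.
// Assumption: a fixed 4-byte length prefix instead of Hadoop's VInt.
final class LimitedStringIO {
    /** Writes length + UTF-8 bytes; fails if the encoded form exceeds maxLength bytes. */
    static int writeString(DataOutputStream out, String s, int maxLength) throws IOException {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > maxLength) {
            throw new IOException("string too long to write: " + utf8.length + " bytes");
        }
        out.writeInt(utf8.length);
        out.write(utf8);
        return utf8.length;
    }

    /** Reads length + UTF-8 bytes; fails if the announced length exceeds maxLength. */
    static String readString(DataInputStream in, int maxLength) throws IOException {
        int len = in.readInt();
        if (len < 0 || len > maxLength) {
            throw new IOException("string too long to read: " + len + " bytes");
        }
        byte[] utf8 = new byte[len];
        in.readFully(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Writes and reads back s under the same limit, wrapping I/O failures. */
    static String roundTrip(String s, int maxLength) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeString(new DataOutputStream(buf), s, maxLength);
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
            return readString(in, maxLength);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if s survives a round trip with the given byte limit. */
    static boolean fitsWithin(String s, int maxLength) {
        try {
            return roundTrip(s, maxLength).equals(s);
        } catch (UncheckedIOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsWithin("\u6c49", 2)); // false: 3 UTF-8 bytes
        System.out.println(fitsWithin("\u6c49", 3)); // true
        System.out.println(fitsWithin("abcd", 3));   // false: 4 bytes
    }
}
```

This mirrors the four cases in testLimitedIO: each string fails at len and succeeds at len + 1 because len was chosen as one less than its UTF-8 byte length.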
4. Text provides a specially optimized override of the compare method. It compares two serialized Text values directly on their bytes, skipping the variable-length length prefix, so nothing has to be deserialized:
/** A WritableComparator optimized for Text keys. */
public static class Comparator extends WritableComparator {
  public Comparator() {
    super(Text.class);
  }

  @Override
  public int compare(byte[] b1, int s1, int l1,
                     byte[] b2, int s2, int l2) {
    // size in bytes of each VInt length prefix
    int n1 = WritableUtils.decodeVIntSize(b1[s1]);
    int n2 = WritableUtils.decodeVIntSize(b2[s2]);
    // compare only the UTF-8 payload that follows the prefix
    return compareBytes(b1, s1+n1, l1-n1, b2, s2+n2, l2-n2);
  }
}
Test example:
public void testCompare() throws Exception {
  DataOutputBuffer out1 = new DataOutputBuffer();
  DataOutputBuffer out2 = new DataOutputBuffer();
  DataOutputBuffer out3 = new DataOutputBuffer();
  Text.Comparator comparator = new Text.Comparator();
  for (int i=0; i<NUM_ITERATIONS; i++) {
    // reset output buffer
    out1.reset();
    out2.reset();
    out3.reset();

    // generate two random strings
    String str1 = getTestString();
    String str2 = getTestString();
    if (i == 0) {
      str1 = getLongString();
      str2 = getLongString();
    } else {
      str1 = getTestString();
      str2 = getTestString();
    }

    // convert to texts
    Text txt1 = new Text(str1);
    Text txt2 = new Text(str2);
    Text txt3 = new Text(str1);

    // serialize them
    txt1.write(out1);
    txt2.write(out2);
    txt3.write(out3);

    // compare two strings by looking at their binary formats
    int ret1 = comparator.compare(out1.getData(), 0, out1.getLength(),
        out2.getData(), 0, out2.getLength());
    // compare two strings
    int ret2 = txt1.compareTo(txt2);

    assertEquals(ret1, ret2);

    assertEquals("Equivalence of different txt objects, same content",
        0, txt1.compareTo(txt3));
    assertEquals("Equvalence of data output buffers",
        0, comparator.compare(out1.getData(), 0, out3.getLength(),
            out3.getData(), 0, out3.getLength()));
  }
}
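The idea behind the raw comparator can be sketched without Hadoop: skip the length prefix of each serialized record and compare the remaining bytes lexicographically as unsigned values. Everything in the block below is an assumption made for illustration: the class RawBytesCompare, the fixed 4-byte prefix (Text uses a variable-length one), and the hand-written compareBytes loop, which only approximates what WritableComparator.compareBytes is expected to return.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of comparing serialized strings without deserializing them.
// Assumption: each record is a 4-byte big-endian length followed by UTF-8 bytes.
final class RawBytesCompare {
    static final int PREFIX = 4;

    /** Lexicographic comparison of two byte ranges, treating bytes as unsigned. */
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2; // the shorter range sorts first when it is a prefix
    }

    /** Builds a record: length prefix + UTF-8 payload. */
    static byte[] record(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(PREFIX + utf8.length).putInt(utf8.length).put(utf8).array();
    }

    /** Compares two whole records by their payload only. */
    static int compareRecords(byte[] r1, byte[] r2) {
        return compareBytes(r1, PREFIX, r1.length - PREFIX, r2, PREFIX, r2.length - PREFIX);
    }

    public static void main(String[] args) {
        System.out.println(compareRecords(record("abc"), record("abc")));        // 0
        System.out.println(compareRecords(record("abc"), record("abd")) < 0);    // true
        // 0xE2 (first byte of the euro sign) is greater than 'z' only when unsigned.
        System.out.println(compareRecords(record("\u20ac"), record("z")) > 0);   // true
    }
}
```

Masking with 0xff matters: Java bytes are signed, so without it every multi-byte UTF-8 sequence would sort before plain ASCII.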
5. The find() method:
Note: the overload that takes only a String searches from the beginning, while the two-argument overload starts searching at position start. If the substring is found, the return value is the position of its first occurrence; otherwise it is -1. As the test shows, positions are byte offsets into the UTF-8 encoding, not char indexes.
Example source:
public void testFind() throws Exception {
  Text text = new Text("abcd\u20acbdcd\u20ac");
  assertTrue(text.getLength()==14);
  assertTrue(text.find("abd")==-1);
  assertTrue(text.find("ac")==-1);
  assertTrue(text.find("\u20ac")==4);
  assertTrue(text.find("\u20ac", 5)==11);
  byte [] b1 = new byte[]{97, 98, 99, 100, -30, -126, -84, 98, 100, 99, 100, -30, -126, -84};
  byte [] b2 = text.copyBytes();
  assertTrue(Arrays.equals(b1, b2));
}

public void testFindAfterUpdatingContents() throws Exception {
  Text text = new Text("abcd");
  text.set("a".getBytes());
  assertEquals(text.getLength(),1);
  assertEquals(text.find("a"), 0);
  assertEquals(text.find("b"), -1);
}
Note: to check whether the contents of two arrays are equal, use Arrays.equals(a, b).
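To see why find returns 11 where String.indexOf would return 9, it helps to search the UTF-8 bytes directly. The sketch below does that with a naive scan; Utf8Find is a hypothetical class written for this post and is not how Text implements find.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: substring search that reports UTF-8 byte offsets.
final class Utf8Find {
    /** Returns the byte offset of the first match at or after start, or -1. */
    static int find(String haystack, String needle, int start) {
        byte[] h = haystack.getBytes(StandardCharsets.UTF_8);
        byte[] n = needle.getBytes(StandardCharsets.UTF_8);
        outer:
        for (int i = Math.max(start, 0); i + n.length <= h.length; i++) {
            for (int j = 0; j < n.length; j++) {
                if (h[i + j] != n[j]) {
                    continue outer; // mismatch, try the next byte offset
                }
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        String s = "abcd\u20acbdcd\u20ac"; // the euro sign takes 3 bytes in UTF-8
        System.out.println(find(s, "\u20ac", 0));   // 4
        System.out.println(find(s, "\u20ac", 5));   // 11 (byte offset)
        System.out.println(s.indexOf("\u20ac", 5)); // 9  (char index)
    }
}
```

The first euro sign occupies bytes 4 to 6, which pushes every later character two positions further along in byte terms than in char terms.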
6. validateUTF8: checking for valid UTF-8 encoding
Return type | Method | Description
static void | validateUTF8(byte[] utf8) | Check if a byte array contains valid utf-8.
static void | validateUTF8(byte[] utf8, int start, int len) | Check to see if a byte array is valid utf-8.
Example source:
public void testValidate() throws Exception {
  Text text = new Text("abcd\u20acbdcd\u20ac");
  byte [] utf8 = text.getBytes();
  int length = text.getLength();
  // pass the valid length explicitly: the backing array may be longer
  Text.validateUTF8(utf8, 0, length);
}
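A comparable check can be written with the JDK's strict decoder, which is a convenient way to experiment with what counts as malformed UTF-8. This is a sketch under assumptions: Utf8Check is a hypothetical class, it returns a boolean instead of signalling invalid input by throwing, and it is not claimed to accept exactly the same inputs as Text.validateUTF8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: UTF-8 validation using the JDK decoder in strict mode.
final class Utf8Check {
    /** True if bytes[start, start + len) decodes as UTF-8 without errors. */
    static boolean isValidUtf8(byte[] bytes, int start, int len) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)      // throw instead of replacing
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes, start, len));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] ok = "abcd\u20ac".getBytes(StandardCharsets.UTF_8); // 4 + 3 bytes
        System.out.println(isValidUtf8(ok, 0, ok.length)); // true
        System.out.println(isValidUtf8(ok, 0, 4));         // true: stops before the euro sign
        System.out.println(isValidUtf8(ok, 0, 5));         // false: cuts the euro sign in half
    }
}
```

The last case shows why the start and len arguments matter: a range that ends in the middle of a multi-byte sequence is not valid UTF-8 even though the full array is.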
7. The clear() method, which empties a Text:
Example source:
public void testClear() throws Exception {
  // Test lengths on an empty text object
  Text text = new Text();
  assertEquals(
      "Actual string on an empty text object must be an empty string",
      "", text.toString());
  assertEquals("Underlying byte array length must be zero",
      0, text.getBytes().length);
  assertEquals("String's length must be zero",
      0, text.getLength());

  // Test if clear works as intended
  text = new Text("abcd\u20acbdcd\u20ac");
  int len = text.getLength();
  text.clear();
  assertEquals("String must be empty after clear()",
      "", text.toString());
  assertTrue(
      "Length of the byte array must not decrease after clear()",
      text.getBytes().length >= len);
  assertEquals("Length of the string must be reset to 0 after clear()",
      0, text.getLength());
}
8. Text's append method:
Return type | Method | Description
void | append(byte[] utf8, int start, int len) | Append a range of bytes to the end of the given text.
Test source:
public void testTextText() throws CharacterCodingException {
  Text a=new Text("abc");
  Text b=new Text("a");
  b.set(a);
  assertEquals("abc", b.toString());
  a.append("xdefgxxx".getBytes(), 1, 4);
  assertEquals("modified aliased string", "abc", b.toString());
  assertEquals("appended string incorrectly", "abcdefg", a.toString());
  // add an extra byte so that capacity = 14 and length = 8
  a.append(new byte[]{'d'}, 0, 1);
  assertEquals("abcdefgd",a.toString());
  byte[] b1= new byte[]{97, 98, 99, 100, 101, 102, 103, 100, 0, 0, 0, 0, 0, 0};
  byte[] b2= new byte[]{97, 98, 99, 100, 101, 102, 103, 100};
  assertEquals(8, a.getLength());
  assertEquals(14, a.getBytes().length);
  assertEquals(8, a.copyBytes().length);
  assertTrue(Arrays.equals(b1, a.getBytes()));
  assertTrue(Arrays.equals(b2, a.copyBytes()));
  a.set(new Text("abc"));
  assertEquals(14, a.getBytes().length); // the backing byte array did not shrink!
  assertEquals(3, a.getLength());
}
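The difference between getLength() (valid bytes) and getBytes().length (capacity of the backing array) is easier to follow on a small stand-alone buffer. The class GrowableBytes below is a hypothetical sketch, not Text itself. Its growth rule, max(needed, 2 * current length), is an assumption chosen because it reproduces the capacities 7 and 14 seen in the test; treat it as an illustration rather than a statement of Hadoop's exact policy.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of a byte buffer that tracks length separately from capacity.
final class GrowableBytes {
    private byte[] bytes = new byte[0];
    private int length = 0;

    int getLength() { return length; }        // number of valid bytes
    int capacity() { return bytes.length; }   // size of the backing array

    /** Copy of the valid bytes only, without the unused tail. */
    byte[] copyBytes() { return Arrays.copyOf(bytes, length); }

    /** Replaces the content; reallocates only if the array is too small. */
    void set(byte[] src) {
        if (bytes.length < src.length) {
            bytes = new byte[src.length];
        }
        System.arraycopy(src, 0, bytes, 0, src.length);
        length = src.length;
    }

    /** Appends src[start, start + len); grows to max(needed, 2 * length) (assumed rule). */
    void append(byte[] src, int start, int len) {
        int needed = length + len;
        if (bytes.length < needed) {
            bytes = Arrays.copyOf(bytes, Math.max(needed, length << 1));
        }
        System.arraycopy(src, start, bytes, length, len);
        length = needed;
    }

    /** Forgets the content but keeps the allocated array. */
    void clear() { length = 0; }

    @Override
    public String toString() {
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        GrowableBytes g = new GrowableBytes();
        g.set("abc".getBytes(StandardCharsets.UTF_8));
        g.append("xdefgxxx".getBytes(StandardCharsets.UTF_8), 1, 4);
        g.append(new byte[]{'d'}, 0, 1);
        System.out.println(g + " length=" + g.getLength() + " capacity=" + g.capacity());
        g.set("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(g + " length=" + g.getLength() + " capacity=" + g.capacity());
    }
}
```

Because the array never shrinks, code that reads the backing array must always pair it with the length, which is exactly what the Text tests do with getBytes() and getLength().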
The complete example source is as follows:
package org.apache.hadoop.io;

import junit.framework.TestCase;

import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.util.Arrays;
import java.util.Random;

import com.google.common.base.Charsets;
import com.google.common.primitives.Bytes;

/** Unit tests for LargeUTF8. */
public class THT_TestText extends TestCase {
  private static final int NUM_ITERATIONS = 100;

  public THT_TestText(String name) { super(name); }

  private static final Random RANDOM = new Random(1);

  private static final int RAND_LEN = -1;

  // generate a valid java String
  private static String getTestString(int len) throws Exception {
    StringBuilder buffer = new StringBuilder();
    int length = (len==RAND_LEN) ? RANDOM.nextInt(1000) : len;
    while (buffer.length()<length) {
      int codePoint = RANDOM.nextInt(Character.MAX_CODE_POINT);
      char tmpStr[] = new char[2];
      if (Character.isDefined(codePoint)) {
        // skip supplementary code points and unpaired surrogates
        if (codePoint < Character.MIN_SUPPLEMENTARY_CODE_POINT
            && !Character.isHighSurrogate((char)codePoint)
            && !Character.isLowSurrogate((char)codePoint)) {
          Character.toChars(codePoint, tmpStr, 0);
          buffer.append(tmpStr);
        }
      }
    }
    return buffer.toString();
  }

  public static String getTestString() throws Exception {
    return getTestString(RAND_LEN);
  }

  public static String getLongString() throws Exception {
    String str = getTestString();
    int length = Short.MAX_VALUE+str.length();
    StringBuilder buffer = new StringBuilder();
    while(buffer.length()<length)
      buffer.append(str);
    return buffer.toString();
  }

  public void testWritable() throws Exception {
    for (int i = 0; i < NUM_ITERATIONS; i++) {
      String str;
      if (i == 0)
        str = getLongString();
      else
        str = getTestString();
      TestWritable.testWritable(new Text(str));
    }
  }

  public void testCoding() throws Exception {
    String before = "Bad \t encoding \t testcase";
    Text text = new Text(before);
    String after = text.toString();
    assertTrue(before.equals(after));

    for (int i = 0; i < NUM_ITERATIONS; i++) {
      // generate a random string
      if (i == 0)
        before = getLongString();
      else
        before = getTestString();

      // test string to utf8
      ByteBuffer bb = Text.encode(before);

      byte[] utf8Text = bb.array();
      byte[] utf8Java = before.getBytes("UTF-8");
      assertEquals(0, WritableComparator.compareBytes(
          utf8Text, 0, bb.limit(),
          utf8Java, 0, utf8Java.length));

      // test utf8 to string
      after = Text.decode(utf8Java);
      assertTrue(before.equals(after));
    }
  }

  public void testIO() throws Exception {
    DataOutputBuffer out = new DataOutputBuffer();
    DataInputBuffer in = new DataInputBuffer();

    for (int i = 0; i < NUM_ITERATIONS; i++) {
      // generate a random string
      String before;
      if (i == 0)
        before = getLongString();
      else
        before = getTestString();

      // write it
      out.reset();
      Text.writeString(out, before);

      // test that it reads correctly
      in.reset(out.getData(), out.getLength());
      String after = Text.readString(in);
      assertTrue(before.equals(after));

      // Test compatibility with Java's other decoder
      int strLenSize = WritableUtils.getVIntSize(Text.utf8Length(before));
      String after2 = new String(out.getData(), strLenSize,
          out.getLength()-strLenSize, "UTF-8");
      assertTrue(before.equals(after2));
    }
  }

  public void doTestLimitedIO(String str, int len) throws IOException {
    DataOutputBuffer out = new DataOutputBuffer();
    DataInputBuffer in = new DataInputBuffer();

    out.reset();
    try {
      Text.writeString(out, str, len);
      fail("expected writeString to fail when told to write a string " +
          "that was too long! The string was '" + str + "'");
    } catch (IOException e) {
    }
    Text.writeString(out, str, len + 1);

    // test that it reads correctly
    in.reset(out.getData(), out.getLength());
    in.mark(len);
    String after;
    try {
      after = Text.readString(in, len);
      fail("expected readString to fail when told to read a string " +
          "that was too long! The string was '" + str + "'");
    } catch (IOException e) {
    }
    in.reset();
    after = Text.readString(in, len + 1);
    assertTrue(str.equals(after));
  }

  public void testLimitedIO() throws Exception {
    doTestLimitedIO("汉", 2);
    doTestLimitedIO("abcd", 3);
    doTestLimitedIO("foo bar baz", 10);
    doTestLimitedIO("1", 0);
  }

  public void testCompare() throws Exception {
    DataOutputBuffer out1 = new DataOutputBuffer();
    DataOutputBuffer out2 = new DataOutputBuffer();
    DataOutputBuffer out3 = new DataOutputBuffer();
    Text.Comparator comparator = new Text.Comparator();
    for (int i=0; i<NUM_ITERATIONS; i++) {
      // reset output buffer
      out1.reset();
      out2.reset();
      out3.reset();

      // generate two random strings
      String str1 = getTestString();
      String str2 = getTestString();
      if (i == 0) {
        str1 = getLongString();
        str2 = getLongString();
      } else {
        str1 = getTestString();
        str2 = getTestString();
      }

      // convert to texts
      Text txt1 = new Text(str1);
      Text txt2 = new Text(str2);
      Text txt3 = new Text(str1);

      // serialize them
      txt1.write(out1);
      txt2.write(out2);
      txt3.write(out3);

      // compare two strings by looking at their binary formats
      int ret1 = comparator.compare(out1.getData(), 0, out1.getLength(),
          out2.getData(), 0, out2.getLength());
      // compare two strings
      int ret2 = txt1.compareTo(txt2);

      assertEquals(ret1, ret2);

      assertEquals("Equivalence of different txt objects, same content",
          0, txt1.compareTo(txt3));
      assertEquals("Equvalence of data output buffers",
          0, comparator.compare(out1.getData(), 0, out3.getLength(),
              out3.getData(), 0, out3.getLength()));
    }
  }

  public void testFind() throws Exception {
    Text text = new Text("abcd\u20acbdcd\u20ac");
    assertTrue(text.getLength()==14);
    assertTrue(text.find("abd")==-1);
    assertTrue(text.find("ac")==-1);
    assertTrue(text.find("\u20ac")==4);
    assertTrue(text.find("\u20ac", 5)==11);
    byte [] b1 = new byte[]{97, 98, 99, 100, -30, -126, -84, 98, 100, 99, 100, -30, -126, -84};
    byte [] b2 = text.copyBytes();
    assertTrue(Arrays.equals(b1, b2));
  }

  public void testFindAfterUpdatingContents() throws Exception {
    Text text = new Text("abcd");
    text.set("a".getBytes());
    assertEquals(text.getLength(),1);
    assertEquals(text.find("a"), 0);
    assertEquals(text.find("b"), -1);
  }

  public void testValidate() throws Exception {
    Text text = new Text("abcd\u20acbdcd\u20ac");
    byte [] utf8 = text.getBytes();
    int length = text.getLength();
    Text.validateUTF8(utf8, 0, length);
  }

  public void testClear() throws Exception {
    // Test lengths on an empty text object
    Text text = new Text();
    assertEquals(
        "Actual string on an empty text object must be an empty string",
        "", text.toString());
    assertEquals("Underlying byte array length must be zero",
        0, text.getBytes().length);
    assertEquals("String's length must be zero",
        0, text.getLength());

    // Test if clear works as intended
    text = new Text("abcd\u20acbdcd\u20ac");
    int len = text.getLength();
    text.clear();
    assertEquals("String must be empty after clear()",
        "", text.toString());
    assertTrue(
        "Length of the byte array must not decrease after clear()",
        text.getBytes().length >= len);
    assertEquals("Length of the string must be reset to 0 after clear()",
        0, text.getLength());
  }

  public void testTextText() throws CharacterCodingException {
    Text a=new Text("abc");
    Text b=new Text("a");
    b.set(a);
    assertEquals("abc", b.toString());
    a.append("xdefgxxx".getBytes(), 1, 4);
    assertEquals("modified aliased string", "abc", b.toString());
    assertEquals("appended string incorrectly", "abcdefg", a.toString());
    // add an extra byte so that capacity = 14 and length = 8
    a.append(new byte[]{'d'}, 0, 1);
    assertEquals("abcdefgd",a.toString());
    byte[] b1= new byte[]{97, 98, 99, 100, 101, 102, 103, 100, 0, 0, 0, 0, 0, 0};
    byte[] b2= new byte[]{97, 98, 99, 100, 101, 102, 103, 100};
    assertEquals(8, a.getLength());
    assertEquals(14, a.getBytes().length);
    assertEquals(8, a.copyBytes().length);
    assertTrue(Arrays.equals(b1, a.getBytes()));
    assertTrue(Arrays.equals(b2, a.copyBytes()));
    a.set(new Text("abc"));
    assertEquals(14, a.getBytes().length); // the backing byte array did not shrink!
    assertEquals(3, a.getLength());
  }

  private class ConcurrentEncodeDecodeThread extends Thread {
    public ConcurrentEncodeDecodeThread(String name) {
      super(name);
    }

    @Override
    public void run() {
      final String name = this.getName();
      DataOutputBuffer out = new DataOutputBuffer();
      DataInputBuffer in = new DataInputBuffer();
      for (int i=0; i < 1000; ++i) {
        try {
          out.reset();
          WritableUtils.writeString(out, name);

          in.reset(out.getData(), out.getLength());
          String s = WritableUtils.readString(in);

          assertEquals("input buffer reset contents = " + name, name, s);
        } catch (Exception ioe) {
          throw new RuntimeException(ioe);
        }
      }
    }
  }

  public void testConcurrentEncodeDecode() throws Exception{
    Thread thread1 = new ConcurrentEncodeDecodeThread("apache");
    Thread thread2 = new ConcurrentEncodeDecodeThread("hadoop");

    thread1.start();
    thread2.start();

    // wait for both threads (the original listing joined thread2 twice)
    thread1.join();
    thread2.join();
  }

  public void testAvroReflect() throws Exception {
    AvroTestUtil.testReflect
        (new Text("foo"),
         "{\"type\":\"string\",\"java-class\":\"org.apache.hadoop.io.Text\"}");
  }

  /**
   * test {@code Text.charAt}, including out-of-range positions
   */
  public void testCharAt() {
    String line = "adsawseeeeegqewgasddga";
    Text text = new Text(line);
    for (int i = 0; i < line.length(); i++) {
      assertTrue("testCharAt error1 !!!", text.charAt(i) == line.charAt(i));
    }
    assertEquals("testCharAt error2 !!!", -1, text.charAt(-1));
    assertEquals("testCharAt error3 !!!", -1, text.charAt(100));
  }

  /**
   * test {@code Text} readFields/write operations
   */
  public void testReadWriteOperations() {
    String line = "adsawseeeeegqewgasddga";
    byte[] inputBytes = line.getBytes();
    // prepend the length (22) as a one-byte VInt
    inputBytes = Bytes.concat(new byte[] {(byte)22}, inputBytes);

    DataInputBuffer in = new DataInputBuffer();
    DataOutputBuffer out = new DataOutputBuffer();
    Text text = new Text(line);
    try {
      in.reset(inputBytes, inputBytes.length);
      text.readFields(in);
    } catch(Exception ex) {
      fail("testReadFields error !!!");
    }
    try {
      text.write(out);
    } catch(IOException ex) {
    } catch(Exception ex) {
      fail("testReadWriteOperations error !!!");
    }
  }

  public void testReadWithKnownLength() throws IOException {
    String line = "hello world";
    byte[] inputBytes = line.getBytes(Charsets.UTF_8);
    DataInputBuffer in = new DataInputBuffer();
    Text text = new Text();

    in.reset(inputBytes, inputBytes.length);
    text.readWithKnownLength(in, 5);
    assertEquals("hello", text.toString());

    // Read longer length, make sure it lengthens
    in.reset(inputBytes, inputBytes.length);
    text.readWithKnownLength(in, 7);
    assertEquals("hello w", text.toString());

    // Read shorter length, make sure it shortens
    in.reset(inputBytes, inputBytes.length);
    text.readWithKnownLength(in, 2);
    assertEquals("he", text.toString());
  }

  /**
   * test {@code Text.bytesToCodePoint(bytes) }
   * with {@code BufferUnderflowException}
   *
   */
  public void testBytesToCodePoint() {
    try {
      ByteBuffer bytes = ByteBuffer.wrap(new byte[] {-2, 45, 23, 12, 76, 89});
      Text.bytesToCodePoint(bytes);
      assertTrue("testBytesToCodePoint error !!!", bytes.position() == 6 );
    } catch (BufferUnderflowException ex) {
      fail("testBytesToCodePoint unexp exception");
    } catch (Exception e) {
      fail("testBytesToCodePoint unexp exception");
    }
  }

  public void testbytesToCodePointWithInvalidUTF() {
    try {
      Text.bytesToCodePoint(ByteBuffer.wrap(new byte[] {-2}));
      fail("testbytesToCodePointWithInvalidUTF error unexp exception !!!");
    } catch (BufferUnderflowException ex) {
    } catch(Exception e) {
      fail("testbytesToCodePointWithInvalidUTF error unexp exception !!!");
    }
  }

  public void testUtf8Length() {
    assertEquals("testUtf8Length1 error !!!",
        1, Text.utf8Length(new String(new char[]{(char)1})));
    assertEquals("testUtf8Length127 error !!!",
        1, Text.utf8Length(new String(new char[]{(char)127})));
    assertEquals("testUtf8Length128 error !!!",
        2, Text.utf8Length(new String(new char[]{(char)128})));
    assertEquals("testUtf8Length193 error !!!",
        2, Text.utf8Length(new String(new char[]{(char)193})));
    assertEquals("testUtf8Length225 error !!!",
        2, Text.utf8Length(new String(new char[]{(char)225})));
    assertEquals("testUtf8Length254 error !!!",
        2, Text.utf8Length(new String(new char[]{(char)254})));
  }
}
Run results: [image not available]