ClouderaHadoop

Issues that were bugs in CDH5 and were fixed in CDH6

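This is a roundup of issues that were bugs in CDH5 and were fixed in CDH6. They span Hadoop, HDFS, YARN, HBase, Hive, Hue, Impala, Kudu, Oozie, Solr, Spark, Kafka, Parquet, ZooKeeper, and other components. If the problem on your cluster is one of those listed, upgrading will resolve it.
The hundred-plus issues listed here are only a subset. Cloudera will end support for CDH5 at the end of this year, so for CDH5 users upgrading is the clear direction.
Issue Description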

HADOOP-12267 s3a failure due to integer overflow bug in AWS SDK
HADOOP-15169 "hadoop.ssl.enabled.protocols" should be considered in httpserver2
HADOOP-15812 ABFS: Improve AbfsRestOperationException format to ensure full msg can be displayed on console
HADOOP-15846 ABFS: fix mask related bugs in setAcl, modifyAclEntries and removeAclEntries.
HADOOP-15872 ABFS: Update to target 2018-11-09 REST version for ADLS Gen 2
HADOOP-15940 ABFS: For HNS account, avoid unnecessary get call when doing Rename
HADOOP-15948 Inconsistency in get and put syntax if filename/dirname contains space
HADOOP-15968 ABFS: getNamespaceEnabled can fail blocking user access thru ACLs
HADOOP-15969 ABFS: getNamespaceEnabled can fail blocking user access thru ACLs
HADOOP-15972 ABFS: reduce list page size to 500
HADOOP-15975 ABFS: remove timeout check for DELETE and RENAME
HADOOP-16048 ABFS: Fix Date format parser
HADOOP-16461 Regression: FileSystem cache lock parses XML within the lock
HADOOP-16578 ABFS: fileSystemExists() should not call container level apis
HADOOP-16587 OM and DN should persist SCM certificate as the trust root
HDFS-13193 Various Improvements for BlockTokenSecretManager
HDFS-13941 make storageId in BlockPoolTokenSecretManager.checkAccess optional
HDFS-14026 Overload BlockPoolTokenSecretManager.checkAccess to make storageId and storageType optional
HDFS-14366 Improve HDFS append performance
YARN-9217 Nodemanager will fail to start if GPU is misconfigured on the node or GPU drivers missing
YARN-9235 If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown
YARN-9337 GPU auto-discovery script runs even when the resource is given by hand
HBASE-21991 Fix MetaMetrics issues - [Race condition, Faulty remove logic], few improvements
HBASE-22380 Break circle replication when doing bulkload
HBASE-23046 Remove compatibility case from truncate command
HIVE-21999 Add sensitive ABFS configuration properties to HiveConf hidden list
HIVE-22236 Fail to create View selecting View containing NOT IN subquery
HUE-8946 [core] Add back name as argument to import LDAP group or user commands; [useradmin] Fix argument as list in import_ldap_user and import_ldap_group
HUE-9011 Fix invalid delimiters in create Hive table
HUE-9019 Fix concurrent_user_session_limit failed after Django upgrade
HUE-9025 Fix multi query statement with invalidate metadata
HUE-9027 Fix erratic behaviour of the horizontal result scrollbar
IMPALA-6159 DataStreamSender should transparently handle some connection reset by peer
IMPALA-7802 Implement support for closing idle sessions
IMPALA-8333 Remove Impala Shell warnings part 2
IMPALA-8612 NPE when DropTableOrViewStmt analysis leaves serverName_ NULL
IMPALA-8673 Add query option to force plan hints for insert queries
IMPALA-8790 IllegalStateException: Illegal reference to non-materialized slot
IMPALA-8851 Drop table if exists throws authorization exception when table does not exist
IMPALA-8969 Grouping aggregator can cause segmentation fault when doing multiple aggregations.
KUDU-3014 Java client doesn't verify channel bindings during connection negotiation
KUDU-2980 Fault tolerant and diff scans fail if projection contains mis-ordered primary key columns
KUDU-2871 TLS 1.3 not supported by krpc
KUDU-2989 SASL server fails when FQDN is greater than 63 characters long
OOZIE-3464 Use UTF8 charset instead of default one
OOZIE-3543 Upgrade quartz to 2.3.1
SOLR-13532 Unable to start core recovery due to timeout in ping request
SOLR-13921 Processing UpdateRequest with delegation token throws NullPointerException
SENTRY-2535 SentryKafkaAuthorizer throws Exception when describing ACLs
SPARK-24621 WebUI - application 'name' urls point to http instead of https (even when ssl enabled)
SPARK-27453 DataFrameWriter.partitionBy is Silently Dropped by DSV1
SPARK-27621 Calling transform() method on a LinearRegressionModel throws NoSuchElementException
SPARK-29082 Spark driver cannot start with only delegation tokens
SPARK-29105 SHS may delete driver log file of in progress application
ZOOKEEPER-2251 testManyChildWatchersAutoReset is flaky
YARN-4212 FairScheduler: Can't create a DRF queue under a FAIR policy queue
MAPREDUCE-6638 Do not attempt to recover progress from previous job attempts if spill encryption is enabled
YARN-1558 After apps are moved across queues, store new queue info in the RM state store
HBASE-7621 REST client (RemoteHTable) doesn't support binary row keys
HIVE-11600 Hive Parser to Support multi col in clause (x,y..) in ((..),..., ())
HIVE-12727 Refactor Hive strict checks to be more granular, allow order by no limit and no partition filter by default for now
HIVE-15148 disallow loading data into bucketed tables (by default)
HIVE-18251 Loosen restriction for some checks
HIVE-18552 Split hive.strict.checks.large.query into two configs
HIVE-12609 Remove javaXML serialization
HIVE-15797 separate the configs for gby and oby position alias usage
HIVE-12442 HiveServer2: Refactor/repackage HiveServer2's Thrift code so that it can be used in the tasks
HIVE-12063 Pad Decimal numbers with trailing zeros to the scale of the column
HIVE-12237 Use slf4j as logging façade
HIVE-11304 Migrate to Log4j2 from Log4j 1.x
HIVE-6757 Remove deprecated parquet classes from outside of org.apache package
HIVE-12164 Remove jdbc stats collection mechanism
HIVE-12411 Remove counter based stats collection mechanism
HIVE-12005 Remove hbase based stats collection mechanism
HIVE-7575 GetTables thrift call is very slow
HIVE-11785 Support escaping carriage return and new line for LazySimpleSerDe
KAFKA-6252 A metric named 'XX' already exists, can't register another one.
KAFKA-5987 Kafka metrics templates used in document generation should maintain order of tags
KAFKA-5968 Remove all broker metrics during shutdown
KAFKA-5746 Add new metrics to support health checks
KAFKA-5738 Add cumulative count attribute for all Kafka rate metrics
KAFKA-5597 Autogenerate Producer sender metrics
KAFKA-5461 KIP-168: Add GlobalTopicCount metric per cluster
KAFKA-5341 Add UnderMinIsrPartitionCount and per-partition UnderMinIsr metrics
KAFKA-6258 SSLTransportLayer should keep reading from socket until either the buffer is full or the socket has no more data
KAFKA-5920 Handle SSL authentication failures as non-retriable exceptions in clients
KAFKA-5854 Handle SASL authentication failures as non-retriable exceptions in clients
KAFKA-5783 Implement KafkaPrincipalBuilder interface with support for SASL (KIP-189)
KAFKA-5720 In Jenkins, kafka.api.SaslSslAdminClientIntegrationTest failed with org.apache.kafka.common.errors.TimeoutException
KAFKA-5417 Clients get inconsistent connection states when SASL/SSL connection is marked CONNECTED and DISCONNECTED at the same time
KAFKA-4764 Improve diagnostics for SASL authentication failures
KAFKA-6287 Inconsistent protocol type for empty consumer groups
KAFKA-5856 Add AdminClient.createPartitions()
KAFKA-5763 Refactor NetworkClient to use LogContext
KAFKA-5762 Refactor AdminClient to use LogContext
KAFKA-5755 Refactor Producer to use LogContext
KAFKA-5737 KafkaAdminClient thread should be daemon
KAFKA-5726 KafkaConsumer.subscribe() overload that takes just Pattern without ConsumerRebalanceListener
KAFKA-5629 Console Consumer overrides auto.offset.reset property when provided on the command line without warning about it.
KAFKA-5556 KafkaConsumer.commitSync throws IllegalStateException: Attempt to retrieve exception from future which hasn't failed
KAFKA-5534 KafkaConsumer offsetsForTimes should include partitions in result even if no offset could be found
KAFKA-5512 KafkaConsumer: High memory allocation rate when idle
KAFKA-4856 Calling KafkaProducer.close() from multiple threads may cause spurious error
KAFKA-4767 KafkaProducer is not joining its IO thread properly
KAFKA-4669 KafkaProducer.flush hangs when NetworkClient.handleCompletedReceives throws exception
PARQUET-1217 Incorrect handling of missing values in Statistics
PARQUET-686 Allow for Unsigned Statistics in Binary Type
PARQUET-357 Parquet-thrift generates wrong schema for Thrift binary fields
PARQUET-753 GroupType.union() doesn't merge the original type
PARQUET-765 Upgrade Avro to 1.8.1
PARQUET-783 H2SeekableInputStream does not close its underlying FSDataInputStream, leading to connection leaks
PARQUET-791 Predicate pushing down on missing columns should work on UserDefinedPredicate too
PARQUET-806 Parquet-tools silently suppresses error messages
PARQUET-825 Static analyzer findings (NPEs, resource leaks)
PARQUET-1005 Fix DumpCommand parsing to allow column projection
PARQUET-1064 Deprecate type-defined sort ordering for INTERVAL type
PARQUET-1065 Deprecate type-defined sort ordering for INT96 type
PARQUET-1133 INT96 types and Maps without OriginalType cause exceptions in PigSchemaConverter
PARQUET-1141 IDs are dropped in metadata conversion
PARQUET-1152 Parquet-thrift doesn't compile with Thrift 0.9.3
PARQUET-1153 Parquet-thrift doesn't compile with Thrift 0.10.0
PARQUET-1185 TestBinary#testBinary unit test fails after PARQUET-1141
PARQUET-1191 Type.hashCode() takes originalType into account but Type.equals() does not
PARQUET-1208 Occasional endless loop in unit test
PARQUET-1246 Ignore float/double statistics in case of NaN
KUDU-2353 Add tooling to parse diagnostics log
KUDU-2290 Tool to re-create a tablet
KUDU-2399 Support IS NULL / IS NOT NULL predicates in Python
KUDU-2287 Add replica metric tracking time since there was a valid leader
KUDU-2427 Add support for Ubuntu 18.04
KUDU-1889 Support OpenSSL 1.1.0
KUDU-2012 Kudu Flume sink authn support
KUDU-2539 Supporting Spark Streaming DataFrame in KuduContext
KUDU-2529 kudu CLI command supports list the tablets under a table and list the replicas of a tablet
KUDU-16 Add server-side LIMIT for scanners
KUDU-1276 Add a vectorized read/write interface for pandas DataFrame objects
KUDU-2441 Unlike C++, Kudu Python API missing "set mutation buffer space"
KUDU-2095 Add scanner keepAlive method to the java client
KUDU-2563 Spark integration should use the scanner keep-alive API
KUDU-2368 Add ability to configure the number of reactors in KuduClient
KUDU-2395 Thread spike with all threads blocked in libnss
KUDU-2566 Enhance rowset tree pruning and discard string copy while querying
KUDU-1861 kudu test loadgen: change default behavior to avoid compactions on tablet servers
KUDU-2469 Handle CFile checksum failures
KUDU-2359 tserver should allow starting with a small number of missing data dirs
KUDU-2191 Hive Metastore Integration
KUDU-2242 Wait for NTP synchronization on startup
KUDU-2289 Tablet deletion should be throttled
ZOOKEEPER-2940 Deal with maxbuffer as it relates to large requests from clients
ZOOKEEPER-3019 Add a metric to track number of slow fsyncs
ZOOKEEPER-2994 Tool required to recover log and snapshot entries with CRC errors
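
To check whether a problem on your own cluster appears in the list above, the list can be loaded into a lookup table. The following is only a minimal sketch: it assumes each line has the form "PROJECT-NUMBER description", and the names `parse_issues`, `group_by_component`, and `SAMPLE` are hypothetical helpers introduced here for illustration, not part of any Cloudera tooling.

```python
import re
from collections import defaultdict

# Assumed line format: project key, hyphen, number, whitespace, description.
ISSUE_LINE = re.compile(r"^([A-Z]+)-(\d+)\s+(.*)$")


def parse_issues(text):
    """Return {issue_id: description} for every well-formed line in text."""
    issues = {}
    for line in text.splitlines():
        match = ISSUE_LINE.match(line.strip())
        if match:
            project, number, description = match.groups()
            issues[f"{project}-{number}"] = description
    return issues


def group_by_component(issues):
    """Group issue IDs by their project key (HADOOP, KUDU, ...)."""
    grouped = defaultdict(list)
    for issue_id in issues:
        grouped[issue_id.split("-")[0]].append(issue_id)
    return dict(grouped)


# A three-line excerpt of the list, used here as sample input.
SAMPLE = """\
HADOOP-12267 s3a failure due to integer overflow bug in AWS SDK
KUDU-2871 TLS 1.3 not supported by krpc
KUDU-2989 SASL server fails when FQDN is greater than 63 characters long
"""

fixed = parse_issues(SAMPLE)
print("KUDU-2871" in fixed)       # membership test for a given JIRA ID
print(group_by_component(fixed))  # issue IDs per component
```

Pasting the full list into `SAMPLE` would let you test any JIRA ID from a log message or support ticket against the issues fixed in CDH6.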
