`
IXHONG
  • 浏览: 451900 次
  • 性别: Icon_minigender_1
  • 来自: 北京
社区版块
存档分类
最新评论

HashMap解惑

    博客分类:
  • Java
阅读更多

 HashMap中有一些我们容易忽视的点

1. 关于key的hash和equals

public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key);
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

 由上述代码知道,hash值是用来确定bucketIndex,equals是用来和链表上的值比较,因此对于key是自定义的类,强烈建议重写hashCode和equals方法。此处也是容易引起内存泄露的点。

下面摘抄一段JDK里面的注释,

/**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hash tables such as those provided by
     * {@link java.util.HashMap}.
     * <p>
     * The general contract of {@code hashCode} is:
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application, the {@code hashCode} method
     *     must consistently return the same integer, provided no information
     *     used in {@code equals} comparisons on the object is modified.
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     * <li>If two objects are equal according to the {@code equals(Object)}
     *     method, then calling the {@code hashCode} method on each of
     *     the two objects must produce the same integer result.
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the {@code hashCode} method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hash tables.
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class {@code Object} does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java<font size="-2"><sup>TM</sup></font> programming language.)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.lang.System#identityHashCode
     */
    public native int hashCode();

 

 

2. rehash的条件

void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length);
        }

        createEntry(hash, key, value, bucketIndex);
    }

 if条件告诉我们rehash的条件要同时满足两个:map中元素个数不小于阀值即容量*负载因子,对应的bucketIndex处有元素。

 另外,如下代码作备忘,

static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

 

3. 可以插入(null, null)吗

Map<String, String> map = new HashMap<String, String>();
        map.put(null, null);
        System.out.println(map.size()); // 1

private V putForNullKey(V value) {
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0); // hash = 0, bucketIndex = 0
        return null;
    }

注意,Hashtable和ConcurrentHashMap进行put时若value为null,将抛出NullPointerException。 

 

4. table默认初始大小 - 16

public HashMap(int initialCapacity, float loadFactor) {
        // ...

        this.loadFactor = loadFactor; // 0.75
        threshold = initialCapacity; // 16
        init(); // nothing
    }

public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // ...
}

private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize); // 16

        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }

private static int roundUpToPowerOf2(int number) {
        // assert number >= 0 : "number must be non-negative";
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

 

5. 关于HashMap里的hash(Object key)方法

final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

/**
     * Initialize the hashing mask value. We defer initialization until we
     * really need it.
     */
    final boolean initHashSeedAsNeeded(int capacity) {
        boolean currentAltHashing = hashSeed != 0;
        boolean useAltHashing = sun.misc.VM.isBooted() &&
                (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        boolean switching = currentAltHashing ^ useAltHashing;
        if (switching) {
            hashSeed = useAltHashing
                ? sun.misc.Hashing.randomHashSeed(this)
                : 0;
        }
        return switching;
    }

 

 6.从hashmap的put方法的具体逻辑来看,里面充斥着线程不安全因素:

1)map为空时,同时初始化map

2)hash冲突时,同时操作链表

3)同时rehash时造成机器load高

 

 7.元素为什么放在链表的头部:因为单链表,放头部最快

 

 8.LinkedHashMap的实现:继承于HashMap,内部的Entry也继承于HashMap.Entry,增加了属性before,after,另外,重写了get,addEntry,createEntry方法,提供了iterator相关方法。

 

 9.TreeMap的实现,重点看put方法

public V put(K key, V value) {
        Entry<K,V> t = root;
        if (t == null) {
            compare(key, key); // type (and possibly null) check

            root = new Entry<>(key, value, null);
            size = 1;
            modCount++;
            return null;
        }
        int cmp;
        Entry<K,V> parent;
        // split comparator and comparable paths
        Comparator<? super K> cpr = comparator;
        if (cpr != null) {
            do {
                parent = t;
                cmp = cpr.compare(key, t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        else {
            if (key == null)
                throw new NullPointerException();
            Comparable<? super K> k = (Comparable<? super K>) key;
            do {
                parent = t;
                cmp = k.compareTo(t.key);
                if (cmp < 0)
                    t = t.left;
                else if (cmp > 0)
                    t = t.right;
                else
                    return t.setValue(value);
            } while (t != null);
        }
        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;
        fixAfterInsertion(e);
        size++;
        modCount++;
        return null;
    }

 

  10. jdk7 ConcurrentHashMap的get,size,put方法

public V get(Object key) {
        Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key);
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;
    }

 

public int size() {
        // Try a few times to get accurate count. On failure due to
        // continuous async changes in table, resort to locking.
        final Segment<K,V>[] segments = this.segments;
        int size;
        boolean overflow; // true if size overflows 32 bits
        long sum;         // sum of modCounts
        long last = 0L;   // previous sum
        int retries = -1; // first iteration isn't retry
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK) {
                    for (int j = 0; j < segments.length; ++j)
                        ensureSegment(j).lock(); // force creation
                }
                sum = 0L;
                size = 0;
                overflow = false;
                for (int j = 0; j < segments.length; ++j) {
                    Segment<K,V> seg = segmentAt(segments, j);
                    if (seg != null) {
                        sum += seg.modCount;
                        int c = seg.count;
                        if (c < 0 || (size += c) < 0)
                            overflow = true;
                    }
                }
                if (sum == last)
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    segmentAt(segments, j).unlock();
            }
        }
        return overflow ? Integer.MAX_VALUE : size;
    }

  

public V put(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);
        return s.put(key, hash, value, false);
    }

public V putIfAbsent(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject
             (segments, (j << SSHIFT) + SBASE)) == null)
            s = ensureSegment(j);
        return s.put(key, hash, value, true);
    }

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
            HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);
            V oldValue;
            try {
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> first = entryAt(tab, index);
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
                        K k;
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
                                e.value = value;
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                        if (node != null)
                            node.setNext(first);
                        else
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                            rehash(node);
                        else
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return oldValue;
        }

/**
         * Scans for a node containing given key while trying to
         * acquire lock, creating and returning one if not found. Upon
         * return, guarantees that lock is held. UNlike in most
         * methods, calls to method equals are not screened: Since
         * traversal speed doesn't matter, we might as well help warm
         * up the associated code and accesses as well.
         *
         * @return a new node if key not found, else null
         */
        private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
            HashEntry<K,V> first = entryForHash(this, hash);
            HashEntry<K,V> e = first;
            HashEntry<K,V> node = null;
            int retries = -1; // negative while locating node
            while (!tryLock()) {
                HashEntry<K,V> f; // to recheck first below
                if (retries < 0) {
                    if (e == null) {
                        if (node == null) // speculatively create node
                            node = new HashEntry<K,V>(hash, key, value, null);
                        retries = 0;
                    }
                    else if (key.equals(e.key))
                        retries = 0;
                    else
                        e = e.next;
                }
                else if (++retries > MAX_SCAN_RETRIES) {
                    lock();
                    break;
                }
                else if ((retries & 1) == 0 &&
                         (f = entryForHash(this, hash)) != first) {
                    e = first = f; // re-traverse if entry changed
                    retries = -1;
                }
            }
            return node;
        }

 

1
2
分享到:
评论

相关推荐

    Java解惑(中文版)_java_java解惑_solve65p_

    7. **集合框架**:熟悉ArrayList、LinkedList、HashSet、HashMap等集合类的使用,以及泛型的概念。 8. **输入/输出流**:理解I/O流的概念,包括文件操作、字节流和字符流的使用。 9. **多线程**:学习如何创建和...

    JAVA解惑.pdf

    《JAVA解惑》这本书主要针对Java编程中遇到的各种常见问题和困惑进行了解答,旨在帮助开发者深入理解Java语言,提高编程技巧。以下是一些关键的知识点解析: 1. **异常处理**:Java中的异常处理是通过try-catch-...

    4,JAVA解惑 高清PDF 下载

    此外,《JAVA解惑》还对集合框架进行了详细的解析,包括ArrayList、LinkedList、HashMap等常见数据结构的使用和底层实现原理。这部分内容对于提高代码效率和编写高效算法具有重要意义。书中还会介绍如何选择合适的...

    java解惑(包括pdf和答案)

    4. **集合框架**:讲解ArrayList、LinkedList、HashSet、HashMap等集合类的使用,以及它们之间的区别和选择原则。 5. **IO流**:介绍输入输出流的基本概念,以及如何读写文件、网络通信等。 6. **线程编程**:讲述...

    java解惑java解惑java解惑

    - **HashMap与TreeMap**:哈希表和有序映射的区别,以及它们的遍历方式。 - **Set接口**:不包含重复元素的集合,如HashSet和TreeSet。 5. **泛型** - **泛型的引入**:用于提供类型安全,避免在运行时进行强制...

    JAVA 解惑 java经典

    4. **集合框架**:ArrayList、LinkedList、HashMap等是Java集合框架中的常用类,它们在存储和操作数据时各有优势,了解并熟练运用它们能提高代码效率。 5. **垃圾回收机制**:Java自动管理内存,通过垃圾回收器自动...

    Java解惑.pdf

    4. **集合框架**:Java集合框架包括List、Set、Queue和Map接口,以及ArrayList、LinkedList、HashSet、HashMap等实现类。理解它们的区别和使用场景,能够提高代码的效率和可读性。 5. **多线程**:Java提供了对多...

    Java解惑

    《Java解惑》这篇博客文章及其配套的PDF文件,为我们提供了深入理解Java编程语言中常见问题和困惑的机会。本文将围绕“源码”和“工具”这两个关键标签,详细探讨Java编程中的若干重要知识点。 首先,Java源码是...

    Java解惑(整理版本)

    了解它们的特点、性能差异及使用场景,如`ArrayList`与`LinkedList`、`HashMap`与`ConcurrentHashMap`的区别,能提高代码效率。 5. **IO与NIO**:Java的输入输出系统提供了丰富的API,传统IO基于字节流和字符流,而...

    Java解惑(中文).pdf

    5. **集合框架**:ArrayList与LinkedList、HashMap与TreeMap等都是常见的集合类型,它们各有优缺点,选择合适的集合类型能显著提高程序性能。 6. **多线程**:Java提供了Thread类和Runnable接口来创建和管理线程,...

    Java_解惑(PDF)

    4. **数组与集合**:Java提供了数组和多种集合框架,如ArrayList、LinkedList、HashMap等。PDF可能包含如何操作数组,以及何时使用不同类型的集合,以及它们之间的转换方法。 5. **字符串操作**:String类在Java中...

    Java解惑(中文).pdf

    书中将深入剖析它们的工作原理,如ArrayList与LinkedList的区别,HashMap与ConcurrentHashMap的并发性能等。 6. **IO与NIO**:Java的输入输出系统提供了处理文件、网络流等功能。NIO(非阻塞I/O)为高性能网络编程...

    2010年-Java解惑(中文)

    - 集合接口(List、Set、Queue)和实现类(ArrayList、LinkedList、HashSet、HashMap等)的理解与应用。 - 遍历和操作集合的方法,如迭代器、foreach循环。 5. **多线程**: - 线程的创建:通过Thread类和...

    JAVA面试题解惑系列.rar

    2. **集合框架**:如ArrayList、LinkedList、HashMap、HashSet等,以及它们的实现原理、性能特点和使用场景。面试官可能会问到如何选择合适的集合类型,以及并发环境下集合的安全使用。 3. **异常处理**:理解不同...

    Java解惑(中文版)

    4. **集合框架**:Java集合框架是程序设计中必不可少的部分,可能涵盖ArrayList、LinkedList、HashMap、HashSet等数据结构的使用和优化。 5. **IO流**:Java的IO系统是处理输入输出的关键,书中可能涉及文件读写、...

    JAVA解惑.rar

    Java集合框架是组织和操作数据的重要工具,包括List、Set、Queue等接口,以及ArrayList、LinkedList、HashSet、HashMap等实现类。理解这些接口和类的区别和应用场景,以及它们的底层实现,能帮助我们更高效地处理...

    《经典JAVA面试题解惑系列合集(臧圩人)》

    这包括ArrayList、LinkedList、HashSet、HashMap等各种集合类的使用场景和性能特性。面试官可能会要求候选人分析不同集合之间的区别,比如时间复杂度、空间占用,以及如何根据需求选择合适的集合。 再者,线程并发...

    Java解惑 中英文

    5. **集合框架**:Java集合框架是程序设计的基础,书中可能讨论`ArrayList`、`LinkedList`、`HashSet`、`HashMap`等容器的性能特征和使用场景,以及`Collections`和`Stream` API的使用技巧。 6. **泛型**:Java泛型...

Global site tag (gtag.js) - Google Analytics