Starting with Map, Part 1 (A Close Look at HashMap)

wcf1987

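Map is a staple of Java's collections framework, and HashMap in particular is a utility class that nearly every Java developer knows. Let's take a close look at it.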

public class HashMap<K,V>
    extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable
{

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient volatile int modCount;

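These are the fields of HashMap (JDK 1.6). As you can see, a HashMap is essentially an array, transient Entry[] table. DEFAULT_INITIAL_CAPACITY (the default capacity, 16), MAXIMUM_CAPACITY (the maximum capacity, 2^30) and DEFAULT_LOAD_FACTOR (the default load factor, 0.75) are all static final; they serve as defaults and as bounds checks. The other three fields are size (the number of key-value pairs currently stored in the map), threshold and loadFactor. loadFactor comes from the constructor argument, and threshold is derived from it: threshold = capacity * loadFactor, where capacity is the length of table. In other words, the table starts out with capacity slots, and once the number of entries reaches threshold, the table is automatically doubled.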

public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }

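This is the constructor we commonly use. It mostly validates the initial capacity and loadFactor; the key line is table = new Entry[capacity];. Note that capacity is not necessarily the value we pass in: the constructor rounds initialCapacity up to the smallest power of two that is greater than or equal to it.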
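To make the rounding concrete, here is a minimal sketch that copies the constructor's loop into a standalone class. CapacitySketch, roundUpToPowerOfTwo and threshold are hypothetical names used only for illustration; they are not part of the JDK.

```java
// Hypothetical helper class; it only mirrors the loop in the JDK 1.6 constructor above.
class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= initialCapacity (capped at MAXIMUM_CAPACITY).
    static int roundUpToPowerOfTwo(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Same formula as the constructor: threshold = (int)(capacity * loadFactor).
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 10, 16, 17, 100}) {
            int capacity = roundUpToPowerOfTwo(requested);
            System.out.println(requested + " -> capacity " + capacity
                    + ", threshold " + threshold(capacity, 0.75f));
        }
    }
}
```

Asking for 10 slots therefore gives a table of 16, and asking for 17 gives 32, with thresholds of 12 and 24 at the default load factor.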

 public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

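This is the famous put method. Two things are plain from the code: HashMap is not synchronized, and it accepts null as a key. It is also clear that a good hashCode() on the key matters a great deal. If the key class has a poor hashCode(), then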

    static int hash(int h) {
        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

even HashMap's own hash function cannot save it. Once the mixed 32-bit hash code has been computed, it still has to be reduced to an index into table[]:

    static int indexFor(int h, int length) {
        return h & (length-1);
    }
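The two helpers above can be tried in isolation. The sketch below copies hash and indexFor verbatim from the JDK 1.6 source shown here into a hypothetical class, HashSpreadSketch, and feeds them hash codes that differ only in their high bits (the sample values are my own choice).

```java
// Hypothetical demo class; hash() and indexFor() are copied from the JDK 1.6 excerpts above.
class HashSpreadSketch {
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        // Works only because length is a power of two: length-1 is a mask of low bits.
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16;
        // Hash codes that differ only above bit 15; their low 4 bits are all zero.
        int[] hashCodes = {0x10000, 0x20000, 0x30000};
        for (int hc : hashCodes) {
            System.out.println(Integer.toHexString(hc)
                    + ": raw index " + indexFor(hc, length)
                    + ", mixed index " + indexFor(hash(hc), length));
        }
    }
}
```

Without the mixing step all three values would land in bucket 0, because only the low four bits survive the mask; after hash() they fall into different buckets. The mask also never produces a negative index, even for a negative hash.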

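indexFor is a very simple function: because the table length is a power of two, h & (length-1) keeps just the low bits of the hash and always yields a valid index. put then walks the chain stored at that index until it either reaches the end (null) or finds an "equal" key (same hash, and either the same reference or equals() returns true), in which case it overwrites the value. If no such key is found, it calls addEntry: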

for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;

    void addEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

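Here we can see that each new Entry is placed at the head of its bucket's chain and increments size. If size had already reached threshold before this insertion (size++ >= threshold compares the old value of size; threshold is the capacity * loadFactor product from earlier), the table length is doubled.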
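The exact moment of the resize is easy to get wrong by one, so here is a small simulation of the rule. It is only a sketch: ResizeSketch and capacitiesAfterAdds are hypothetical names, the real table is not touched, and the MAXIMUM_CAPACITY cap that the real resize applies is ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the addEntry rule "if (size++ >= threshold) resize(2 * table.length)".
class ResizeSketch {
    // Returns every capacity the table passes through while `entries` new keys are added.
    static List<Integer> capacitiesAfterAdds(int entries, int initialCapacity, float loadFactor) {
        int capacity = initialCapacity;
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        List<Integer> history = new ArrayList<>();
        history.add(capacity);
        for (int i = 0; i < entries; i++) {
            // Compares the size BEFORE this insertion, then increments it.
            if (size++ >= threshold) {
                capacity *= 2;
                threshold = (int) (capacity * loadFactor);
                history.add(capacity);
            }
        }
        return history;
    }

    public static void main(String[] args) {
        System.out.println(capacitiesAfterAdds(12, 16, 0.75f));
        System.out.println(capacitiesAfterAdds(13, 16, 0.75f));
        System.out.println(capacitiesAfterAdds(25, 16, 0.75f));
    }
}
```

With the defaults (capacity 16, load factor 0.75, threshold 12), twelve insertions leave the table at 16 and the thirteenth doubles it to 32; the next doubling, to 64, comes with the twenty-fifth insertion.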

get works in much the same way, so I won't go through it. Next, let's look at a very handy HashMap method, keySet():

    public Set<K> keySet() {
        Set<K> ks = keySet;
        return (ks != null ? ks : (keySet = new KeySet()));
    }

    private final class KeySet extends AbstractSet<K> {
        public Iterator<K> iterator() {
            return newKeyIterator();
        }
        public int size() {
            return size;
        }
        public boolean contains(Object o) {
            return containsKey(o);
        }
        public boolean remove(Object o) {
            return HashMap.this.removeEntryForKey(o) != null;
        }
        public void clear() {
            HashMap.this.clear();
        }
    }

keySet() returns an instance of an inner class that extends AbstractSet, and every operation on that set acts directly on the data in the HashMap itself.
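The write-through behaviour can be observed with the public java.util.HashMap API alone. The sketch below is illustrative; KeySetViewSketch and sample are hypothetical names, and the keys and values are arbitrary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo: the Set returned by keySet() is a live view, not a copy.
class KeySetViewSketch {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        Set<String> keys = map.keySet();

        keys.remove("a");                     // removes the mapping from the map itself
        System.out.println(map.containsKey("a") + " " + map.size());

        map.put("d", 4);                      // changes to the map show up in the view
        System.out.println(keys.contains("d"));

        keys.clear();                         // delegates to HashMap.this.clear()
        System.out.println(map.isEmpty());
    }
}
```

Because the view shares the map's storage, code that needs an independent snapshot of the keys has to copy them, for example with new HashSet<>(map.keySet()).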

This inner class also contains a method we see all the time, public Iterator<K> iterator(); it turns up in every corner of the collections framework.

 private abstract class HashIterator<E> implements Iterator<E> {
        Entry<K,V> next;	// next entry to return
        int expectedModCount;	// For fast-fail
        int index;		// current slot
        Entry<K,V> current;	// current entry

        HashIterator() {
            expectedModCount = modCount;
            if (size > 0) { // advance to first entry
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
        }

        public final boolean hasNext() {
            return next != null;
        }

        final Entry<K,V> nextEntry() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Entry<K,V> e = next;
            if (e == null)
                throw new NoSuchElementException();

            if ((next = e.next) == null) {
                Entry[] t = table;
                while (index < t.length && (next = t[index++]) == null)
                    ;
            }
            current = e;
            return e;
        }

        public void remove() {
            if (current == null)
                throw new IllegalStateException();
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            Object k = current.key;
            current = null;
            HashMap.this.removeEntryForKey(k);
            expectedModCount = modCount;
        }

    }

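This is another inner class of HashMap, an abstract class that implements Iterator. HashMap has three inner classes that return iterators over the key set, the values and the entries, and all three extend HashIterator. It is also clear that removing through the iterator changes the HashMap itself. Pay particular attention to expectedModCount = modCount; and to this check: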

if (modCount != expectedModCount)
throw new ConcurrentModificationException();

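Together these ensure that if the HashMap is structurally modified while an iterator is traversing it (adding or removing a mapping increments modCount, as the put method above shows), the iterator throws ConcurrentModificationException on its next operation. The one exception is the iterator's own remove(), which resynchronizes expectedModCount afterwards. Looking closely at put, though, we also find that merely replacing the value of an existing key, rather than adding a new mapping, leaves modCount unchanged.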
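These three cases can be checked against the public API. The following is a minimal sketch with hypothetical names (FailFastSketch and its methods). It runs against whatever HashMap the current JDK ships rather than the 1.6 source quoted here, and the fail-fast check is documented as best-effort, so treat it as an illustration and not a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo of fail-fast iteration using only public java.util APIs.
class FailFastSketch {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Adding a new key mid-iteration is a structural change: the next call to next() fails.
    static boolean addDuringIterationThrows() {
        Map<String, Integer> map = sample();
        Iterator<String> it = map.keySet().iterator();
        it.next();
        map.put("d", 4);
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException expected) {
            return true;
        }
    }

    // Replacing the value of an existing key does not touch modCount, so iteration continues.
    static boolean replaceDuringIterationThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet())
                map.put(key, 0);
            return false;
        } catch (ConcurrentModificationException unexpected) {
            return true;
        }
    }

    // Removing through the iterator itself is allowed; returns the final map size.
    static int sizeAfterRemovingViaIterator() {
        Map<String, Integer> map = sample();
        for (Iterator<String> it = map.keySet().iterator(); it.hasNext(); ) {
            it.next();
            it.remove();
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(addDuringIterationThrows());
        System.out.println(replaceDuringIterationThrows());
        System.out.println(sizeAfterRemovingViaIterator());
    }
}
```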

    Iterator<K> newKeyIterator()   {
        return new KeyIterator();
    }
    Iterator<V> newValueIterator()   {
        return new ValueIterator();
    }
    Iterator<Map.Entry<K,V>> newEntryIterator()   {
        return new EntryIterator();
    }

Because HashIterator already does the heavy lifting, each concrete iterator is trivial to implement:

    private final class KeyIterator extends HashIterator<K> {
        public K next() {
            return nextEntry().getKey();
        }
    }

That's about all for HashMap for now. Tomorrow I'll branch out from HashMap and compare more Map implementations.

2009-09-21 22:27