Java Collections

Collections Basics

colls-coreInterfaces.gif
  • A map is not a true collection as it does not extend Collection.
  • All core collection interfaces are generic.
  • Convert collection c to array: Object[] a = c.toArray();
  • Java doesn't include an interface for multi-maps (a map where each key maps to multiple values). It's a fairly simple matter to use a Map whose values are List instances as a multimap.
  • If you try to sort a list, the elements of which do not implement Comparable, Collections.sort(list) will throw a ClassCastException.
  • You can't have a List of a primitive type. In other words, List<int> is not possible. Generics does not apply to primitives.

Iteration and Modification

  • Iterator.remove is the only safe way to modify a collection during iteration.
static void filter(Collection<?> c) {
    for (Iterator<?> it = c.iterator(); it.hasNext(); )
        if (!cond(it.next()))
            it.remove();
}
  • Use Iterator instead of the for loop when you need to:
    • Remove the current element. The for-each construct hides the iterator, so you cannot call remove. Therefore, the for-each construct is not usable for filtering.
    • Iterate over multiple collections in parallel.

Collections Implementations

  • Collections implementations are categorized into general purpose, special purpose, concurrent, wrapper, convenient and abstract implementations.
  • Most general purpose implementations of collections are unsynchronized. The legacy collections Vector and Hashtable are synchronized.
  • java.util.concurrent offer much higher concurrency than mere synchronized implementations.
  • If you need thread-safe collections, the synchronization wrappers allow any collection to be transformed into a synchronized collection. Every major collection implementation have a synchronized wrapper. for example for list : List<Type> list = Collections.synchronizedList(new ArrayList<Type>());
  • We also have unmodifiable wrappers: public static <T> SortedSet<T> unmodifiableSortedSet(SortedSet<? extends T> s);
  • To write your own implementation of a Collection interface use existing abstract classes such as AbstractList.
  • The Collections.checked interface wrappers are provided for use with generic collections. These implementations return a dynamically type-safe view of the specified collection, which throws a ClassCastException if a client attempts to add an element of the wrong type. The generics mechanism in the language provides compile-time (static) type-checking, but it is possible to defeat this mechanism. Dynamically type-safe views eliminate this possibility entirely.

Linked

  • Linked implementations (linkedhashmap, linked list) have a linked list running in them.
  • A linked list is a data structure where each node has a ref/link to the next node in the sequence. LinkedList class provides methods to get, remove and insert an element at the beginning and end of the list.
  • In doubly linked list, each node contains, besides the next-node link, a second link field pointing to the previous node in the sequence. LinkedList implements Deque and has doubly linked list capabilities.

List

  • List implementations: ArrayList(better-performing) and LinkedList (better performance under certain circumstances).
  • Vector has been retrofitted to implement List.
  • ArrayList has one tuning parameter — the initial capacity, which refers to the number of elements the ArrayList can hold before it has to grow. Most collection impls have such a parameter.

Set

  • Set implementations: HashSet(best performing), TreeSet, and LinkedHashSet.
  • SortedSet extends Set and maintains its elements in ascending order.
  • HashSet is much faster than TreeSet (constant-time versus log-time for most operations) but offers no ordering guarantees.

Map

  • Map implementations: HashMap, TreeMap, and LinkedHashMap.
  • Hashtable was retrofitted to implement Map.
  • Hashtable is synchronized but HashMap is not.
  • SortedMap extends Map and maintains its entries in ascending order, sorted according to the keys' natural ordering, or according to a Comparator provided at the time of the SortedMap creation.
  • If you want to map an enum to a value, you should always use an EnumMap in preference to an array.
  • WeakHashMap is an implementation of the Map interface that stores only weak references to its keys. Storing only weak references allows a key-value pair to be garbage-collected when its key is no longer referenced outside of the WeakHashMap.
  • HashMap permits null values and the null key but Hashtable doesn't.

Algorithms

  • Collections.sort, shuffle, reverse, fill, copy, swap, addAll, binary search (searches for an element in a sorted list), frequency(counts the occurrence of an element), disjoint (check if 2 collections have anything in common), min, max
  • Collections.sort() uses merge sort algorithm and is n log(n). It is stable (does not reorder equal elements) and faster than quicksort. Quicksort is not stable and does not always guarantee O(n log n ) as in rate cases it might be O(n^2)

Stable Algorithn

It is important if you sort the same list repeatedly on different attributes. If a user of a mail program sorts the inbox by mailing date and then sorts it by sender, the user naturally expects that the now-contiguous list of messages from a given sender will (still) be sorted by mailing date. This is guaranteed only if the second sort was stable.

When sorting some kinds of data, only part of the data is examined when determining the sort order. For example, for sorting cards by rank where the suit is being ignored, it's possible to have multiple different correctly sorted versions of the original list. Stable sorting algorithms choose one of these, according to the following rule: if two items compare as equal, like the two 5 cards, then their relative order will be preserved, so that if one came before the other in the input, it will also come before the other in the output.

Iterator vs Iterable

Iterable is the interface needed to use foreach "for(MyObject o:objects)". Collection implements Iterable.
Iterator is also an interface. Each java Collection (List, Set, etc) has a method .iterator that returns an Iterator which is used for iterating over the collection.

Other Collections

  • Check Google collections. It has a multimap.
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License