Friday, October 10, 2008

ConcurrentModificationException: Why do Java collections not have robust iterators?

Ever got a ConcurrentModificationException? It just hit me. How does this happen? It happens when one method iterates over a collection while another method (that bis recursively called from the for loop) modifies the collection.

Note: ConcurrentModificationException has nothing to do with threading! (Well, this is a bit too strong statement, as Rafael points point out in a comment: it might occur due to a threading race condition (but in such cases you probably have other problems as well)). A ConcurrentModificationException may have nothing to do with threading!. Here I am talking about about the non threading related case.

Here's a simple example to show the problem. We have a class that can register new listeners and when fireChange is called it calls changed on the listeners. So the test listener here removes itself on the change call and boom, we get the ConcurrentModificationException:

public class IteratorTest {
final Collection fListeners;
public IteratorTest(Collection listeners) {
fListeners = listeners;
}
static class Listener {
public void changed(IteratorTest subject) {
subject.removeListener(this);
}
}
public void addListener(Listener listener) {
fListeners.add(listener);
}
public void removeListener(Listener listener) {
fListeners.remove(listener);
}
public void fireChange() {
for (Listener listener : fListeners) {
listener.changed(this);
}
}
static void test(Collection coll) {
IteratorTest t = new IteratorTest(coll);
t.addListener(new Listener());
t.fireChange();
}
public static void main(String[] args) {
test(new ArrayList());
}
}


The problem can happen if you call out to "other code" (code someone else has written) and "other code" can change the collection while you are iterating. One solution is to iterate over a copy of the collection:

public void fireChange() {
Listener[] listeners=(Listener[]) fListeners.toArray();
for (Listener listener : listeners) {
listener.changed(this);
}
}

That helps. But there is a lot of code out there that iterates over a collection and calls "other code" and there is always a chance that the "other code" calls back to modify your collection and you get a ConcurrentModificationException....

The good news is: Unlike threading race conditions it happens deterministically. The bad news is: if you are the client it is often not easy to find a way out.

15 years ago, ET++ (the cool framework created by Erich Gamma and Andree Weinand in the 80ies) suffered form missing robust iterators. "Robust iterators" means robust to changes of the underlying collection. At that time, making a copy of a collection seemed an unacceptable overhead. So, Thomas Kofler added robust iterators to ET++ (the PDF has the pages on reverse order -- here is a readable version). The implementations are efficient and robust.

Robust iterators are so fundamental, I am really surprised that Java does not have them....

.

10 comments:

  1. Java 1.5 added CopyOnWriteArrayList and CopyOnWriteArraySet that gives the robust iterators functionality that you are after.

    ReplyDelete
  2. The CopyOnWrite collections are well-suited for listeners and they would fix the code as shown in the blog, but it's not hard for users to get an UnsupportedOperationException at runtime. They just need to call remove on the iterator.

    It would have been nicer if the Java collections hierarchy didn't force everyone to expose mutators (i.e. an Iterator without remove). Funnily enough that is what the discouraged Enumeration interface looks like.

    Something else that might be nice is to allow internal iteration (most commonly known as foreach in many languages) instead of external iteration. For some types of data structures one can implement internal iteration more efficiently than the external variant.

    ReplyDelete
  3. CopyOnWriteArrayList is a good tip!
    If I have the control over the for loop I can fix it. However, if I am the client (writing the "other code") of a for loop someone else wrote, I have a problem.....

    Michael
    .

    ReplyDelete
  4. > ConcurrentModificationException has nothing to do with threading!

    Nothing *intrinsically*, but they can certainly (and often) occur due to a race condition between two threads.

    ReplyDelete
  5. Some would argue that the concurrent modification exception is what makes the iteration robust. Alternatives call into question issues like whether something that's added after iteration is started should be included or not, or whether something that's removed should still be visited. And of course the specification of the API can have a major impact on implementation choices and their performance...

    ReplyDelete
  6. I think there a 2 main concerns
    1. efficiency
    The paper goes into details of how to make the Iterator for their Set implementation robust. Unfortunately there's no single best implementation for a Set and in java the Implementation of HashMaps and therefore Sets even has changed over time.
    Each implementation would have to have the same behaviour regarding concurrent modifications, which would be very difficult to achieve and maybe impossible.

    2. Being able to replace the implementation of Collections later on

    There are already several implementations for Collections and people are free to add their own with different tradeoffs.
    It could be very hard for some of them to support Iterator robustness.

    Regards,
    Markus (Java performacne blog)

    ReplyDelete
  7. Old Collections classes like Java Arraylist, Java hashmap all has iterators which are fail-fast i.e. they will fail when other thread modify the collection structurally but new collections classes added in Java 5 e.g. CopyOnWriteArraylist or ConcurrentHashMap has fail-safe iterators which works on copy of collection class instead of original collection object.

    ReplyDelete
  8. Wow, just wow. Do you really think that Listener code example is convincing to any thinking person ? Trying to argue against a design decision using such contrived code is silly. Give a good example of where a Listener would need to remove itself from the object as its reaction to state change.

    ReplyDelete
  9. > Do you really think that Listener code example is convincing to any thinking person

    The problem with object oriented code is that you have no control over what a listener does and it might (indirectly) try to remove itself....

    ReplyDelete
  10. what is fail fast? what is fail safe? what is difference between fail fast and fail safe? this is very important interview question. you can find the details on http://www.itsoftpoint.com/?page_id=2736

    ReplyDelete