Blog-Archiv

Sonntag, 17. September 2017

Java List Modification Gotcha

A very old bug, still not died out, is the List modification-in-loop gotcha. That means, you loop a list, and inside the loop body you remove element(s) from the list.

There were always strategies to avoid bugs caused by that. But it's up to skilled programmers, untrained newcomers will most likely cause hard-to-find bugs with indexed lists access.

The Problem

Guess what following code outputs.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
public class Test
{
    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        for (int i = 0; i < names.size(); i++)
            checkForP(names.get(i), names);
        
        System.out.println("List is now: "+names);
    }
    
    private void checkForP(String name, List<String> names) {
        System.out.println("checkForP "+name);
        
        if (name.startsWith("P"))
            names.remove(name);
    }
    
    public static void main(String[] args) {
        new Test().perform();
    }
}

We have a list of names, and we want to delete any name that starts with "P" from the list. The programmer used an indexed iteration of the list, because he heard this performs fast (which is true).

Here is the result:

checkForP Peter
checkForP Mary
List is now: [Paul, Mary]

Why is "Paul" still in list, although that name starts with "P"? Because the list was modified inside the loop. That means, you iterate something that is modified while iterating. In this case you remove the first element "Peter", the second "Paul" slips to first position and thus is left out, the last element "Mary" doesn't match, then the loop is over.

In this case no exception occurred, so the mistake can be detected just by users or good unit tests. That makes it so dangerous.

Workaround

Don't say modification inside an iteration never can work. Here is a valid indexed loop version:

        ....
        for (int i = names.size() - 1; i >= 0; i--)
            checkForP(names.get(i), names);
        ....

When you loop the list from backwards, removing elements will not affect the loop-index. Output is now

checkForP Mary
checkForP Paul
checkForP Peter
List is now: [Mary]

That's what we'd expected.

Prevention

Of course people looked for safer solutions. The best is to use Iterable. You don't need to create or declare it, every Collection and Map is iterable by default since Java 1.5. Here is the right way to do it.

    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        for (String name : names)
            checkForP(name, names);
        
        System.out.println("List is now: "+names);
    }

This is the for-each loop, and it implicitly uses the interface Iterable. Output is now:

checkForP Peter
Exception in thread "main" java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
 at java.util.ArrayList$Itr.next(ArrayList.java:851)
 at Test.perform(Test.java:27)
 at Test.main(Test.java:41)

Isn't this a success? The bug has been detected at runtime, and a unit test may uncover it in a daily build. This also works for Map iterations.
Mind that not any modification causes that exception. If you replace an element by set(index, otherName) instead of deleting one, then this loop will not fail.

When you say, "This is not a success, I want to really delete the element inside the loop", you should not use the for-each Iterable loop. In this case you must use Iterator. This provides an explicit method for removing elements during an iteration.

    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        final Iterator<String> iterator = names.iterator();
        while (iterator.hasNext())
            checkForP(iterator.next(), iterator);
        
        System.out.println("List is now: "+names);
    }
    
    private void checkForP(String name, Iterator<String> iterator) {
        System.out.println("checkForP "+name);
        
        if (name.startsWith("P"))
            iterator.remove();
    }

Result is now:

checkForP Peter
checkForP Paul
checkForP Mary
List is now: [Mary]

Mind that you must pass the Iterator instead of the List to any method that wants to delete elements, and that you can delete only the current element, not any other. The Iterator.remove() method has no parameter, and always deletes the element that has been returned by next() before.

Resume

Collection management normally is not built into a programming language. In Java it is performed by the runtime library. The compiler can not find such bugs.




Keine Kommentare: