Blog-Archiv

Freitag, 22. September 2017

An Editing Visitor in Java

The Visitor Pattern is well-known for its ability to iterate a structure without needing cast-operators for each node-type. That means, it is type-checked at compile-time (Java-like). Cast-operators check the type at runtime only (Smalltalk-like). In real life this means that the compiler detects your errors, not your impatient customer :-)

This Blog is about a visitor that also can modify the structure it is visiting, and thus keeps all iteration logic in just one place.

Structure

I am talking about lists, trees or graphs containing nodes of different types. Example would be an application menu. It contains clickable action-items and/or folder-items, folders again contain actions and/or folders, and so on. The actions could be push-buttons, checkboxes or radio-buttons. We could call that a "heterogenous tree".

Let's assume everything that is in a menu, be it folder or action, is an Item that has a technical id and a human-readable label:

public abstract class Item
{
    /** Immutable id of this item. */
    public final String id;
    private String label;
    
    /** Protected constructor, to be called by sub-classes only. */
    protected Item(String id) {
        assert id != null;
        this.id = id;
    }
    
    /** Human readable label of this item. */
    public String getLabel() {
        return (label != null) ? label : id;
    }
    public void setLabel(String label) {
        this.label = label;
    }
    
    @Override
    public String toString() {
        return id+"@"+hashCode();
    }
}

From this, we will derive something that enables us to iterate the menu by a Visitor.

Visitor

Here is a special visitor that also returns values from its visit() methods. I will explain that speciality later.

/**
 * Callers of <code>Item.accept()</code> need to implement this.
 */
public interface Visitor
{
    /**
     * @param actionItem the currently visited action.
     * @return false by default to visit the whole tree,
     *      true to terminate the iteration now and return the
     *      given item from <code>accept()</code>.
     */
    boolean visit(ActionItem actionItem);
    
    /**
     * @param itemList the currently visited list.
     * @return false by default to visit the whole structure,
     *      true to terminate the iteration now and return the
     *      given itemList from <code>accept()</code>.
     */
    boolean visit(ItemList itemList);
}

And here is the according visitable Item super-class for all types of items:

public abstract class VisitableItem extends Item
{
    protected VisitableItem(String id) {
        super(id);
    }
    
    /**
     * This iterates recursively, with the option to break the iteration.
     * Should encapsulate the iteration once-and-only-once.
     * An implementation would call the visitor's <code>visit(this)</code>,
     * and <code>accept(visitor)</code> for all child items.
     * @param visitor interface implementation that visits the structure and
     *      decides when the iteration is terminated.
     * @return the action item (or list) where the visitor returned true
     *      from <code>visit(item)</code>, or null.
     */
    public abstract VisitableItem accept(Visitor visitor);
}

This is the classical Visitor Pattern where an interface must be implemented for every use-case that wants to iterate the graph. All items of the structure must provide some accept() implementation that, at least, calls back to the visit() method of the given visitor. Folders would also call all their children's accept() method, this implements the iteration.

And that accept() is the place where the return-speciality comes in. The visitor can break the iteration by returning true from its visit() implementation. That saves time when, for example, searching just the first matching item. The item's accept() method would then return itself, being the search result. A simple and useful visitor-convenience, see examples on bottom for more evidence.

A Visitable Menu

Here is the menu-action class:

public class ActionItem extends VisitableItem
{
    public ActionItem(String id) {
        super(id);
    }
    
    /** To be called when user clicks this item. */
    public void perform()   {
        throw new RuntimeException("implement me!");
    }
    
    @Override
    public final VisitableItem accept(Visitor visitor) {
        return visitor.visit(this) ? this : null;
    }
}

Mind that accept() returns itself when the visitor returns true. This is for searching, see below.

Following is the menu-folder:

public class ItemList extends VisitableItem
{
    private List<VisitableItem> items = new ArrayList<>();
    
    /**
     * @param id required, the id of this list-item.
     * @param items the child items to be contained in this list, will be cloned.
     */
    public ItemList(String id, List<VisitableItem> items) {
        super(id);
        this.items = items;
    }
    
    /**
     * Iterate recursively with the option to break the iteration.
     * This should implement the iteration once-and-only-once.
     * @param visitor interface implementation that visits the tree and decides when the iteration is terminated.
     * @return the action item (or list) where the visitor returned true from <code>visit(item)</code>, or null.
     */
    @Override
    public VisitableItem accept(Visitor visitor) {
        if (visitor.visit(this))
            return this;
        
        for (VisitableItem item : items)    {
            final VisitableItem matched = item.accept(visitor);
            if (matched != null)
                return matched;
        }
        
        return null;
    }

    @Override
    public String toString() {
        return items.toString()+"_@"+hashCode();
    }
}

The accept() method first checks whether the visitor returns true. In that case it returns itself as search result, like the menu-action implementation does. Then it checks whether one of its child-items returns not-null from accept(). When yes, this child is returned as search result.

That means, a search for just one item would not need to iterate the whole structure. This comes for the price of a contract between visit() and accept() returns.

The Editing Visitor

Why would you need an editing visitor?

Because sometimes I need to edit the menu. Remove items the user has no permission for. Or add items dynamically in a certain application state.

But why do this with a visitor? For removing items there is the Iterator pattern!

If I have a separate Iterator, the structural logic would be duplicated. I want the go-into-sub-items logic to be implemented just once-and-only-once. Maybe because it will be subject to eager changes, or extensive configuration.

For a visitor with editing capabilities, we must change the classes above a little. First we add a method to the visitor that exposes the children-list to possible editors.

/**
 * Callers of <code>Item.accept()</code> need to implement this.
 */
public interface Visitor
{
    ....
    
    ....
    
    /**
     * Called after an <code>ItemList</code> was visited,
     * and once before each visit of any of its elements.
     * @param items the mutable list about to be iterated,
     *      mind that this iteration could be interrupted
     *      by going into a sub-list. (That's the reason
     *      why it is called each time an item is visited.)
     */
    void setVisitedList(List<VisitableItem> items);
}

The new setVisitedList() method exposes the list of items to visitors. That means the visitor can modify it. From the Iterator pattern we know that modifying a list while iterating it is an error-prone thing.

Here is a simple and concurrent-modification-safe iteration that also calls the new visitor-method setVisitedList():

public class ItemList extends VisitableItem
{
    ....

    @Override
    public VisitableItem accept(Visitor visitor) {
        if (visitor.visit(this))
            return this;
        
        for (VisitableItem item : new ArrayList<>(items))    {    // safely iterate via clone
            visitor.setVisitedList(items);  // expose the parent-list for modifications
            
            final VisitableItem matched = item.accept(visitor);
            if (matched != null)
                return matched;
        }
        
        return null;
    }

    ....
}

By using a clone of the children list the iteration avoids any irritation by the editing visitor. The visitor receives the original list directly before any call to the child's accept() method.

That's it, no more. Enough to iterate the graph, and edit it at the same time. But because this is a very mimimal implementation, Visitor classes may have problems to make useful things out of this. Let's see how far we can go with it.

Demo Application

Here is a skeleton that generates test data. You can paste the different visitors below into it to test them.

public class Main
{
    private ItemList buildMenu() {
        final ActionItem fileSystem = new ActionItem("File System");
        final ActionItem recent = new ActionItem("Recent");
        final ItemList openMenu = new ItemList(
                "Open",
                new ArrayList<>(Arrays.asList(new VisitableItem [] { fileSystem, recent })));
        
        final ActionItem exit = new ActionItem("Exit");
        final ItemList fileMenu = new ItemList(
                "File",
                new ArrayList<>(Arrays.asList(new VisitableItem [] { openMenu, exit })));
        
        final ActionItem cut = new ActionItem("Cut");
        final ActionItem copy = new ActionItem("Copy");
        final ActionItem paste = new ActionItem("Paste");
        final ItemList editMenu = new ItemList(
                "Edit",
                new ArrayList<>(Arrays.asList(new VisitableItem [] { cut, copy, paste })));
        
        final ActionItem help = new ActionItem("Help");
        
        final ItemList menuBar = new ItemList(
                "",
                new ArrayList<>(Arrays.asList(new VisitableItem [] { fileMenu, editMenu, help })));
        
        return menuBar;
    }
    

    .....
   
    
    public static void main(String[] args) {
        final Main menuTest = new Main();
        final ItemList menuBar = menuTest.buildMenu();
        ....
    }
}

Visitor Use Cases

A print-visitor that also counts items:

    private void testPrint(ItemList menuBar) {
        class PrintVisitor implements Visitor
        {
            int count;
            private List<VisitableItem> currentlyVisitedList;
            
            @Override
            public boolean visit(ActionItem actionItem) {
                count++;
                System.out.println("ActionItem = "+actionItem+"\n    parent = "+currentlyVisitedList);
                return false;
            }
            @Override
            public boolean visit(ItemList itemList) {
                count++;
                System.out.println("ItemList = "+itemList+"\n    parent = "+currentlyVisitedList);
                return false;
            }
            @Override
            public void setVisitedList(List<VisitableItem> items) {
                currentlyVisitedList = items;
            }
        };
        final PrintVisitor printVisitor = new PrintVisitor();
        menuBar.accept(printVisitor);
        
        final int numberOfItems = 11;
        assert printVisitor.count == numberOfItems : "Count of menu items is not "+numberOfItems+": "+printVisitor.count;
    }

Following is a search-visitor:

    private void testFind(ItemList menuBar, String id) {
        final VisitableItem found = find(menuBar, id);
        assert found.id.equals(id) : "Did not find item with id '"+id+"'!";
    }
    
    private VisitableItem find(ItemList menuBar, String id) {
        return menuBar.accept(new Visitor() {
            @Override
            public boolean visit(ActionItem actionItem) {
                return compareId(actionItem);
            }
            @Override
            public boolean visit(ItemList actionList) {
                return compareId(actionList);
            }
            @Override
            public void setVisitedList(List<VisitableItem> itemsm) {
            }
            
            private boolean compareId(VisitableItem item) {
                return id.equals(item.id);   // stop when id found
            }
        });
    }

A remove-visitor can be implemented easily:

    private void testRemove(ItemList menuBar, final String id) {
        testFind(menuBar, id);
        
        class RemoveVisitor implements Visitor
        {
            private List<VisitableItem> currentlyVisitedList;
            
            @Override
            public boolean visit(ActionItem actionItem) {
                return removeWhenFound(actionItem);
            }
            @Override
            public boolean visit(ItemList listItem) {
                return removeWhenFound(listItem);
            }
            @Override
            public void setVisitedList(List<VisitableItem> items) {
                currentlyVisitedList = items;
            }
            
            private boolean removeWhenFound(VisitableItem item) {
                if (id.equals(item.id)) {
                    assert currentlyVisitedList != null : "Can not remove "+id+", it seems to be root!";
                    currentlyVisitedList.remove(item);
                    return true;
                }
                return false;
            }
        };
                
        menuBar.accept(new RemoveVisitor());
        assert find(menuBar, id) == null : "Could not remove item with id '"+id+"'!";
    }

Finally a clone-visitor. That was really hard to implement, and also may be a little hard to understand :-)

    private void testClone(ItemList menuBar) {
        /**
         * Builds on the fact that setVisitedList()
         * is called immediately after visit(ListItem listItem).
         */
        class CloneVisitor implements Visitor
        {
            ItemList cloneRoot;
            private Map<List<VisitableItem>,List<VisitableItem>> lists = new HashMap<>();
            private List<VisitableItem> currentlyVisitedList;
            
            @Override
            public boolean visit(ActionItem actionItem) {
                currentlyVisitedList.add(actionItem);
                return false;
            }
            @Override
            public boolean visit(ItemList itemList) {
                final List<VisitableItem> cloneList = new ArrayList<>(); // create a clone
                final ItemList clone = new ItemList(itemList.id, cloneList);
                clone.setLabel(itemList.getLabel());
                
                if (currentlyVisitedList == null)   // root level
                    cloneRoot = clone;
                else
                    currentlyVisitedList.add(clone);    // add the clone to current parent
                
                currentlyVisitedList = cloneList;   // we are going inside now
                return false;
            }
            @Override
            public void setVisitedList(List<VisitableItem> items) {
                List<VisitableItem> cloneItems = lists.get(items);   // get the clone-list of incoming original
                if (cloneItems == null) // create and store it when not found
                    lists.put(items, cloneItems = currentlyVisitedList);
                currentlyVisitedList = cloneItems;  // make sure we are inserting into the correct clone-list
            }
        };
        final CloneVisitor cloneVisitor = new CloneVisitor();
        menuBar.accept(cloneVisitor);
        final ItemList clone = cloneVisitor.cloneRoot;
        
        testPrint(clone);
        
        assert clone != null && clone != menuBar : "Menu bar has not been cloned!";
        assert find(clone, "Paste") != null : "Cloned menu bar does not contain 'Paste'!";
        assert find(clone, "File") != find(menuBar, "File") : "Cloned 'File' menu is not different from original one!";
    }

Main

Here is the main() procedure that calls all these visitor tests.

    public static void main(String[] args) {
        final Main menuTest = new Main();
        final ItemList menuBar = menuTest.buildMenu();
        menuTest.testPrint(menuBar);
        menuTest.testFind(menuBar, "Copy");
        menuTest.testClone(menuBar);
        menuTest.testRemove(menuBar, "Recent");
    }



Sonntag, 17. September 2017

Java List Modification Gotcha

A very old bug, still not died out, is the List modification-in-loop gotcha. That means, you loop a list, and inside the loop body you remove element(s) from the list.

There were always strategies to avoid bugs caused by that. But it's up to skilled programmers, untrained newcomers will most likely cause hard-to-find bugs with indexed lists access.

The Problem

Guess what following code outputs.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
public class Test
{
    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        for (int i = 0; i < names.size(); i++)
            checkForP(names.get(i), names);
        
        System.out.println("List is now: "+names);
    }
    
    private void checkForP(String name, List<String> names) {
        System.out.println("checkForP "+name);
        
        if (name.startsWith("P"))
            names.remove(name);
    }
    
    public static void main(String[] args) {
        new Test().perform();
    }
}

We have a list of names, and we want to delete any name that starts with "P" from the list. The programmer used an indexed iteration of the list, because he heard this performs fast (which is true).

Here is the result:

checkForP Peter
checkForP Mary
List is now: [Paul, Mary]

Why is "Paul" still in list, although that name starts with "P"? Because the list was modified inside the loop. That means, you iterate something that is modified while iterating. In this case you remove the first element "Peter", the second "Paul" slips to first position and thus is left out, the last element "Mary" doesn't match, then the loop is over.

In this case no exception occurred, so the mistake can be detected just by users or good unit tests. That makes it so dangerous.

Workaround

Don't say modification inside an iteration never can work. Here is a valid indexed loop version:

        ....
        for (int i = names.size() - 1; i >= 0; i--)
            checkForP(names.get(i), names);
        ....

When you loop the list from backwards, removing elements will not affect the loop-index. Output is now

checkForP Mary
checkForP Paul
checkForP Peter
List is now: [Mary]

That's what we'd expected.

Prevention

Of course people looked for safer solutions. The best is to use Iterable. You don't need to create or declare it, every Collection and Map is iterable by default since Java 1.5. Here is the right way to do it.

    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        for (String name : names)
            checkForP(name, names);
        
        System.out.println("List is now: "+names);
    }

This is the for-each loop, and it implicitly uses the interface Iterable. Output is now:

checkForP Peter
Exception in thread "main" java.util.ConcurrentModificationException
 at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
 at java.util.ArrayList$Itr.next(ArrayList.java:851)
 at Test.perform(Test.java:27)
 at Test.main(Test.java:41)

Isn't this a success? The bug has been detected at runtime, and a unit test may uncover it in a daily build. This also works for Map iterations.
Mind that not any modification causes that exception. If you replace an element by set(index, otherName) instead of deleting one, then this loop will not fail.

When you say, "This is not a success, I want to really delete the element inside the loop", you should not use the for-each Iterable loop. In this case you must use Iterator. This provides an explicit method for removing elements during an iteration.

    public void perform()   {
        final List<String> names = new ArrayList<>();
        names.add("Peter");
        names.add("Paul");
        names.add("Mary");
        
        final Iterator<String> iterator = names.iterator();
        while (iterator.hasNext())
            checkForP(iterator.next(), iterator);
        
        System.out.println("List is now: "+names);
    }
    
    private void checkForP(String name, Iterator<String> iterator) {
        System.out.println("checkForP "+name);
        
        if (name.startsWith("P"))
            iterator.remove();
    }

Result is now:

checkForP Peter
checkForP Paul
checkForP Mary
List is now: [Mary]

Mind that you must pass the Iterator instead of the List to any method that wants to delete elements, and that you can delete only the current element, not any other. The Iterator.remove() method has no parameter, and always deletes the element that has been returned by next() before.

Resume

Collection management normally is not built into a programming language. In Java it is performed by the runtime library. The compiler can not find such bugs.




String Comparisons in Java, EcmaScript and TypeScript

We always have to shift our minds when developing web applications, because we use several programming languages for them. Currently I am using:

  1. Java

  2. EcmaScript (JavaScript, JS)
    EcmaScript is a superset of JavaScript, so every JS source is also valid ES source, but not vice versa. Only "modern" HTML-5 browsers can interpret ES.

  3. TypeScript
    TypeScript also is a superset of JavaScript, every JS source is valid TS, but not vice versa. You need a compiler to translate from TS to JS.

String Comparisons

I tried to find out whether string comparisons work the same in all three languages. There are two kinds of comparisons, that of identity

if (s == t)

and that of equality

if (s.equals(t))

This was Java code.
In JavaScript (and TypeScript) similar exists, for identity

if (s === t)

and equality (let's call it "JS equality":-)

if (s == t)

Mind that the use of === and !== is strongly recommended, because == and != have a complicated conversion logic, see "Hard and soft comparisons". I will show this gotcha in the following example.

Plain Strings and Objects

In the following example, I distinguish between plain strings like

s1 = "10";

and actually allocated string objects like

s2 = new String("20");

Both is possible in all three languages.

The Java compiler optimizes plain strings in a way that it puts all equal definitions into just one place. It doesn't do this for allocated string objects, because these are constructed at runtime.

Most likely the JavaScript interpreter does similar things. Let's try it out.

EcmaScript Example Source

Four strings are used for this test: s1 and t1 are plain strings, s2 and t2 are string objects. They get compared to each other using both identity- and equality-comparisons.

var s1 = "10";
var t1 = "10";

var s2 = new String("20");
var t2 = new String("20");

alert(
    "\n s1 === t1: "+(s1 === t1)+
    "\n s1 == t1: "+(s1 == t1)+
    "\n s2 === t2: "+(s2 === t2)+
    "\n s2 == t2: "+(s2 == t2)
);

You can try this out on JsFiddle web page.

Here comes the result:

 s1 === t1: true
 s1 == t1: true
 s2 === t2: false
 s2 == t2: false

Line one is not intuitive: s1 === t1, which is an identity comparison. The programmer has written two different strings!

But mind the surprising equality comparison in line four:

new String("20") is not equal to new String("20") in JS, and unfortunately also not in TS!

A strong motivation to avoid the new operator.

TypeScript Example Source

Use the same source as for JS. You can try out on the TypeScript Playground. The result is the same as for JS.

Unfortunately TS does not fix these JS gotchas. Whatever surprise you experience on JS language level, you will have the same in TS.

Java Example Source

You need a Java development kit to try this out.

public class Test
{
     public static void main(String[] args) {
        final String s1 = "10";
        final String t1 = "10";
 
        final String s2 = new String("20");
        final String t2 = new String("20");
        
        System.out.println(
            "\n s1 == t1: "+(s1 == t1)+
            "\n s1 equals t1: "+s1.equals(t1)+
            "\n s2 == t2: "+(s2 == t2)+
            "\n s2 equals t2: "+s2.equals(t2)
        );
    }
}

Output is:

 s1 == t1: true
 s1 equals t1: true
 s2 == t2: false
 s2 equals t2: true

Mind the correct equality comparison in line four, now it is true as expected. Line one and two are exactly the same as in JS, due to compiler-optimized strings.

Nevertheless the difference between plain strings and string objects is a little puzzling, surely it is not intuitive.

Resume

Although both Java and JS are C-style programming languages, there seems to have been no common sense about how to perform comparisons.

Why was it called JavaScript? I always call it JS, and I am glad about the renaming to EcmaScript. The name JavaScript makes newcomers believe it is a Java script-interpreter. What a mistake!