Sunday, December 12, 2021

Java Functional Programming and Readability

Unfortunately, developers tend to be proud of producing code that looks complex. The functional extensions added in Java 8 make it easy to write source code that is shorter than a conventional object-oriented solution, but hard to read for people who are not used to functional programming.

So will it be a "write less, do more" innovation or new ammunition for code wars?

Example Application

We want to find the most frequent time offset (from Greenwich Mean Time) in a list of TimeZone objects:

 1    public static void main(String[] args) {
 2        final List<TimeZone> zones = new ArrayList<>();
 3        for (String timeZoneId : TimeZone.getAvailableIDs())
 4            zones.add(TimeZone.getTimeZone(timeZoneId));
 5
 6        final int mostFrequentRawOffset = mostFrequentRawOffset(zones);
 7
 8        for (TimeZone zone : zones)
 9            if (mostFrequentRawOffset == zone.getRawOffset())
10                System.out.println(zone.getID()+" (\""+zone.getDisplayName()+"\")");
11    }

Lines 2 - 4 collect all available time zones into a list.
Line 6 calculates the time offset (in milliseconds) shared by the most time zones by calling mostFrequentRawOffset().
Lines 8 - 10 output all zones that have exactly this offset.

How can we implement the mostFrequentRawOffset() method?

Conventional Solution

To demonstrate that this problem is not so easy to solve, here is a conventional solution using standard Java runtime classes:

 1    private static int mostFrequentRawOffset(List<TimeZone> zones)    {
 2        Map<Integer,Integer> rawOffset2Frequency = new Hashtable<>();
 3        for (TimeZone zone : zones)    {
 4            Integer rawOffset = zone.getRawOffset();
 5            Integer frequency = rawOffset2Frequency.get(rawOffset);
 6            if (frequency == null)
 7                frequency = 0;
 8            rawOffset2Frequency.put(rawOffset, frequency + 1);
 9        }
10        int maxFrequency = -1;
11        int mostFrequent = -1;
12        for (Map.Entry<Integer,Integer> entry : rawOffset2Frequency.entrySet())    {
13            int frequency = entry.getValue();
14            if (frequency > maxFrequency)    {
15                maxFrequency = frequency;
16                mostFrequent = entry.getKey();
17            }
18        }
19        return mostFrequent;
20    }

In lines 2 - 9 we collect the time zones into a map where the key is the time offset and the value is the number of time zones that have exactly this offset.
Lines 10 - 18 search for the maximum value in that map.
The most frequent offset (the key) is then returned in line 19.

We could make this 2 lines shorter, so let's say it is an 18-line solution.
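The post does not say which two lines could go. One possibility, and this is only an assumption about what was meant, is to replace the null check with Map.getOrDefault(), which exists since Java 8. The class name ShorterConventional is made up for this sketch:

```java
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;

// Hypothetical class name; only the method body matters here.
class ShorterConventional {
    static int mostFrequentRawOffset(List<TimeZone> zones) {
        Map<Integer,Integer> rawOffset2Frequency = new Hashtable<>();
        for (TimeZone zone : zones) {
            Integer rawOffset = zone.getRawOffset();
            // getOrDefault() replaces the three lines "get, if null, set to 0"
            Integer frequency = rawOffset2Frequency.getOrDefault(rawOffset, 0);
            rawOffset2Frequency.put(rawOffset, frequency + 1);
        }
        int maxFrequency = -1;
        int mostFrequent = -1;
        for (Map.Entry<Integer,Integer> entry : rawOffset2Frequency.entrySet()) {
            int frequency = entry.getValue();
            if (frequency > maxFrequency) {
                maxFrequency = frequency;
                mostFrequent = entry.getKey();
            }
        }
        return mostFrequent;
    }
}
```

The search for the maximum is unchanged, so the line references for the second loop simply shift up by two.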

Functional Solution

Now here is an already optimized functional solution that does exactly the same thing:

 1    private static int mostFrequentRawOffset(List<TimeZone> zones)    {
 2        return zones
 3            .stream()
 4            .collect(
 5                Collectors.groupingBy(
 6                    TimeZone::getRawOffset,
 7                    Collectors.counting()
 8                )
 9            )
10            .entrySet()
11            .stream()
12            .max(Map.Entry.comparingByValue())
13            .map(Map.Entry::getKey)
14            .orElse(-1);
15    }

This is a 15-line solution. Not by much, but still shorter than the conventional solution above. The difference is that with the functional solution you need to know the behavior of all the functions involved, while with the conventional solution you just need to know the behavior of Map, nothing else.

Because I do not want to explain every single line of this code, here is a variant that may be more readable, because it has intermediate results that expose the return types of the stream functions used:

 1    private static int mostFrequentRawOffset(List<TimeZone> zones)    {
 2        Map<Integer,Long> rawOffset2Frequency = zones
 3            .stream()
 4            .collect(
 5                Collectors.groupingBy(
 6                    TimeZone::getRawOffset,
 7                    Collectors.counting()
 8                )
 9            );
10        Map.Entry<Integer,Long> maxEntry = rawOffset2Frequency
11            .entrySet()
12            .stream()
13            .max(Map.Entry.comparingByValue())
14            .orElse(null);
15        return (maxEntry != null) ? maxEntry.getKey() : -1;
16    }

In line 3 we turn the list into a stream and pass it to the collect() function, which builds a map.
This map is built by the static Collectors.groupingBy(), which does exactly what we do in lines 2 - 9 of the conventional method (yes, we need to memorize all these functions).
Line 6 specifies the function that produces the map's key, which is the method reference TimeZone::getRawOffset; line 7 specifies the collector that produces the map's value, which is the static Collectors.counting() function (yes, we need to memorize ...).
In lines 10 - 13 we fetch the map's entry set and stream it into max(), which searches for the maximum using the comparator returned by the static Map.Entry.comparingByValue() function (yes, we need ...).
If the list is empty, orElse() will produce null in line 14.
Line 15 finally tests for null and returns -1 if so; otherwise it returns the key of the maximum entry found.
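To see what the groupingBy() step produces on its own, here is a small sketch with three hand-picked time zones. The class and method names (GroupingByDemo, countByRawOffset) are invented for this illustration; the time zone IDs are assumed to be present in the JDK's time zone data, which is normally the case:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original application.
class GroupingByDemo {
    // Groups the zones by raw offset and counts the members of each group.
    static Map<Integer,Long> countByRawOffset(List<TimeZone> zones) {
        return zones.stream()
            .collect(Collectors.groupingBy(TimeZone::getRawOffset, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<TimeZone> zones = Arrays.asList(
            TimeZone.getTimeZone("UTC"),            // raw offset 0
            TimeZone.getTimeZone("GMT"),            // raw offset 0
            TimeZone.getTimeZone("Europe/Berlin")); // raw offset 3600000 (one hour)
        // The map has two keys: 0 with count 2, and 3600000 with count 1.
        System.out.println(countByRawOffset(zones));
    }
}
```

Note that counting() yields Long values, which is why the map type in the variant above is Map<Integer,Long> and not Map<Integer,Integer> as in the conventional solution.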

This second functional solution differs from the first one in that it does not map the maximum entry to its key with getKey() but calls orElse(null) in line 14 instead. The first one fetches the key from the optional maximum by mapping it with the Map.Entry::getKey function in line 13. This won't fail if there is no value; in that case the Optional stays empty and orElse(-1) supplies the default.
→ That's the functional way to write an if-else!
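The correspondence between map(...).orElse(...) and an if-else can be put side by side. This is a minimal sketch; the class OptionalIfElse and both method names are made up for the comparison:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical class that isolates the last two steps of the stream pipeline.
class OptionalIfElse {
    // Functional style: map() is skipped for an empty Optional, orElse() supplies the default.
    static int keyOrMinusOne(Optional<Map.Entry<Integer,Long>> maxEntry) {
        return maxEntry.map(Map.Entry::getKey).orElse(-1);
    }

    // Conventional style: the same decision written as an explicit if-else.
    static int keyOrMinusOneConventional(Optional<Map.Entry<Integer,Long>> maxEntry) {
        if (maxEntry.isPresent())
            return maxEntry.get().getKey();
        else
            return -1;
    }
}
```

Both methods return -1 for Optional.empty() and the entry's key otherwise.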

All clear? If not, let it soak in for a day or so; people like us have plenty of time ...

Summary

So we have a conventional 18-liner against a functional 15-liner.
Could you verify that they both do the same thing?
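One way to approach that question is to run both side by side. The sketch below restates the two solutions from this post under the made-up names conventional() and functional() and adds two hypothetical helpers, frequencyOf() and agree(). One caveat it takes into account: if two offsets are tied for the highest count, each solution returns whichever tied entry its map happens to iterate first, and a Hashtable and the map built by groupingBy() need not iterate in the same order. So agree() accepts different offsets as long as they are equally frequent:

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;
import java.util.TimeZone;
import java.util.stream.Collectors;

// Hypothetical harness; only the two solution bodies come from the post.
class CompareSolutions {
    static int conventional(List<TimeZone> zones) {
        Map<Integer,Integer> rawOffset2Frequency = new Hashtable<>();
        for (TimeZone zone : zones) {
            Integer rawOffset = zone.getRawOffset();
            Integer frequency = rawOffset2Frequency.get(rawOffset);
            if (frequency == null)
                frequency = 0;
            rawOffset2Frequency.put(rawOffset, frequency + 1);
        }
        int maxFrequency = -1;
        int mostFrequent = -1;
        for (Map.Entry<Integer,Integer> entry : rawOffset2Frequency.entrySet()) {
            int frequency = entry.getValue();
            if (frequency > maxFrequency) {
                maxFrequency = frequency;
                mostFrequent = entry.getKey();
            }
        }
        return mostFrequent;
    }

    static int functional(List<TimeZone> zones) {
        return zones.stream()
            .collect(Collectors.groupingBy(TimeZone::getRawOffset, Collectors.counting()))
            .entrySet()
            .stream()
            .max(Map.Entry.comparingByValue())
            .map(Map.Entry::getKey)
            .orElse(-1);
    }

    // Counts how many zones have the given raw offset.
    static long frequencyOf(List<TimeZone> zones, int rawOffset) {
        return zones.stream().filter(zone -> zone.getRawOffset() == rawOffset).count();
    }

    // True if both solutions return the same offset, or two offsets that are tied in frequency.
    static boolean agree(List<TimeZone> zones) {
        int a = conventional(zones);
        int b = functional(zones);
        return a == b || frequencyOf(zones, a) == frequencyOf(zones, b);
    }

    public static void main(String[] args) {
        List<TimeZone> zones = new ArrayList<>();
        for (String timeZoneId : TimeZone.getAvailableIDs())
            zones.add(TimeZone.getTimeZone(timeZoneId));
        System.out.println("all zones: " + agree(zones));
        System.out.println("empty list: " + agree(new ArrayList<>()));
    }
}
```

A check like this only covers the inputs it is given; it is a spot test, not a proof of equivalence.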

It is no wonder that no functional language is among the 5 most popular programming languages of the world. The most popular ones (Python, Java, C++, JavaScript/ES6 etc.) are either object-oriented or close to OO. True, popularity is not a measure of quality. But what is quality in source code? Shortness? Readability? Comprehensibility? Maintainability? What counts most?

Functional languages are science, with a long and steep learning curve. Functional programmers won't talk about readability: "Of course you can read it" (hope you also understand it). Is Java still Smalltalk for the masses, or are we going back to being a realm for experts?