Blog Archive

Friday, July 27, 2018

Some Things We Always Do Wrong

It has become an Internet habit: "Five things you must see before you die", or "Ten things that will blow your mind". For software developers, I will contribute "Some things that always go wrong" here and now, hoping to not "blow your mind" :-)

So what do we do wrong every day?

  1. Blowing Up Things
  2. Underestimating Names
  3. Fixing Symptoms Not Causes
  4. Copy and Paste Coding

Maybe I am repeating myself. It's a daily struggle to avoid these things, and to make other people recognize them as major obstacles.

Blowing Up Things

When we implement a new feature, we go inside the source and try to find the point where we can insert our solution. We add some parameters, a new loop in the method body, the class gets some new member fields, and that's it, devil-may-care, coffee machine! So coffee goes in and in and in, until the class has 3000 lines of code, full of long methods, and its state, consisting of 30 member fields, is no longer controllable. Same with packages. Same with modules, libraries, applications, tools. It turns out that the word "implement" is too close to "implant".

Why do we always blow up things from inside? Why don't we build on them, from outside, respecting their responsibilities? Why don't we reuse them, override them, call old methods from new ones, extend classes by creating sub-classes, put new fields into new delegate classes?

By blowing things up from inside, we violate one of the oldest C programming rules: one function should have just one purpose (the single responsibility principle). Instead we should:

  • Refactor until single responsibility is reached (→ "blow down")
  • Implement new features always by new classes, packages, modules
  • Make all new sources comply with single responsibility

What I found useful in real life:

  • No class should have more than 400 lines
  • No method should have more than 1 loop
  • Even the largest method must fit into a 1024 x 768 screen

If you blow up classes to implement several responsibilities, you make redesign difficult or even impossible. Essentially you are breeding the monolith that way. You are doing the opposite of "divide and rule". But breaking problems down into smaller ones is the only way to cope with complexity. And complexity is the primary problem of software development!
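
To make "build on them, from outside" a bit more concrete, here is a minimal sketch. It assumes a hypothetical PriceCalculator class that needs a new discount feature; all class and method names are invented for illustration. The new feature goes into a new delegating class instead of into the old one:

```java
import java.util.List;

// Hypothetical existing class with one responsibility: summing up prices.
class PriceCalculator {
    long totalCents(List<Long> itemPricesInCents) {
        long sum = 0;
        for (long price : itemPricesInCents) {
            sum += price;
        }
        return sum;
    }
}

// The new "discount" feature lives in a new class that delegates to the old one,
// instead of adding parameters, fields and loops to PriceCalculator.
class DiscountedPriceCalculator {
    private final PriceCalculator delegate;
    private final int discountPercent;

    DiscountedPriceCalculator(PriceCalculator delegate, int discountPercent) {
        this.delegate = delegate;
        this.discountPercent = discountPercent;
    }

    long totalCents(List<Long> itemPricesInCents) {
        final long undiscounted = delegate.totalCents(itemPricesInCents);
        return undiscounted * (100 - discountPercent) / 100;
    }
}

class BuildOnTopExample {
    public static void main(String[] args) {
        final List<Long> prices = List.of(1000L, 500L);
        final DiscountedPriceCalculator calculator =
                new DiscountedPriceCalculator(new PriceCalculator(), 10);
        System.out.println(calculator.totalCents(prices)); // 1350
    }
}
```

PriceCalculator stays untouched and keeps its single responsibility. Whether delegation or a sub-class fits better depends on the case; this sketch merely shows one of the options.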

Underestimating Names

Every day we create names for classes, functions, fields, variables, packages, modules, libraries, applications, tools, helpers, utilities, .... Does anybody care about these names? Aren't they just for the compiler or reference docs? I frequently hear the following sentence:

Doesn't matter how it's called if it works

This is a strong sentence, and now I need to find an even stronger one that says the exact opposite .... What about this:

If names don't matter, why do we use names?

So let's number things. Like they do in the army. Or use completely absurd names like the JS community.

We are still living in times of hand-written source code. Maintenance time for hard-to-read source code is high. Source code is communication: you write primarily for humans, only secondarily for machines. Good names are the key to understandable source code. Here is my advice for better naming:

  • Don't use names that are not in common use (you would be the only one who understands them)
  • Don't use abbreviations and acronyms (they make reading and understanding much harder)
  • Avoid too general terms like "service", "process", "system", "action", "data" (they may obscure the real intent, and can be misunderstood in many different ways)

Exceptions confirm the rule:
→ abbreviations that everybody uses are acceptable,
→ general terms that are used for high-level abstractions are also fine.

Words should be as precise as possible, and as short as possible.

Yes, this is a contradiction, and it is an art to achieve a compromise between the two. You must practice this for some time, then people will start to understand you quickly and easily. Your source should look like a well-structured API manual, not like an Enigma-encrypted message.

Fixing Symptoms Not Causes

The classic symptom is the NullPointerException. It is easy and fast to fix: simply put an if-not-null condition in front. But then the problem occurs again in another place, and in another, and .... because this collection was expected to never be null! So we should have fixed the cause instead. Of course the NullPointerException is not the only situation where a symptom may obscure a bug.

It's a little bit like the copy & paste story. Once some occurrences have been fixed, and then someone fixes the real cause, it will not be easy to say whether a symptom fix can be removed or not. We end up with code full of if-not-null conditions. Here is my advice on how to avoid fixing just the symptoms:

  • Write assertions:
    • Assert parameters
    • Assert returned values
  • Keep methods short and classes small

This helps to narrow down mistakes. Problems that are revealed early will not reach regions where the cause cannot be guessed any more. Here is a strategy for finding causes:

  • Make the bug reproducible
  • Set a debugger breakpoint at the place where it happens, then reproduce it
  • Carefully study the stack trace in the halted debugger: the cause is most likely not on top of the stack!

Understanding the surrounding source code, and its user stories, is absolutely necessary to be able to find causes. Take your time for this.
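
The advice to assert parameters and returned values could look like the following minimal sketch. The OwnerLookup class and its data are hypothetical. It uses Objects.requireNonNull and explicit exceptions instead of the Java assert keyword, because assert statements are disabled by default at runtime:

```java
import java.util.Map;
import java.util.Objects;

class OwnerLookup {
    // Fails right where the contract is broken, not somewhere far downstream.
    static String ownerOf(Map<String, String> ownersByCat, String catName) {
        // assert parameters
        Objects.requireNonNull(ownersByCat, "ownersByCat must not be null");
        Objects.requireNonNull(catName, "catName must not be null");

        final String owner = ownersByCat.get(catName);

        // assert returned value (assumed contract: never return null)
        if (owner == null) {
            throw new IllegalStateException("No owner registered for " + catName);
        }
        return owner;
    }

    public static void main(String[] args) {
        final Map<String, String> owners = Map.of("Garfield", "Jon");
        System.out.println(ownerOf(owners, "Garfield"));
        try {
            ownerOf(null, "Garfield");
        }
        catch (NullPointerException e) {
            System.out.println("Caught early: " + e.getMessage());
        }
    }
}
```

Callers can now rely on a non-null result and need no if-not-null condition of their own.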

Copy and Paste Coding

Fortunately this is a well-known problem, but it is also a survival strategy for certain developers. Unfortunately these are the ones who always write the new code!

Here is the copy & paste story:

  • A buggy block of code has been duplicated 10 times
  • The first bugfix will be done in just one place, the 9 duplicates will stay undetected
  • Tests may uncover further duplicates, other developers will fix them, of course in different ways
  • After 10 developers fixed the dups in 10 different ways, it will not be visible any more that this was all the same functionality
  • The management will say "This software aged rapidly, let's hire some students to write a new one!"

To be fair, I have also seen skilled programmers get their feelings of success from copy & paste programming. It is the most widespread programming sin. It degrades source down to code. I can only advise you not to repeat yourself: stick to DRY.
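
A tiny sketch of what DRY means in practice, with hypothetical classes: the conversion formula exists in exactly one place, so a bugfix there reaches every caller, and nobody has to hunt for duplicates.

```java
class Temperature {
    // The single place that knows the conversion formula.
    static double fahrenheitToCelsius(double fahrenheit) {
        return (fahrenheit - 32.0) * 5.0 / 9.0;
    }
}

// Both callers reuse the conversion instead of carrying their own copy of it.
class WeatherReport {
    static String describe(double fahrenheit) {
        return Math.round(Temperature.fahrenheitToCelsius(fahrenheit)) + " degrees Celsius outside";
    }
}

class OvenDisplay {
    static String describe(double fahrenheit) {
        return "Oven at " + Math.round(Temperature.fahrenheitToCelsius(fahrenheit)) + " degrees Celsius";
    }
}

class DryExample {
    public static void main(String[] args) {
        System.out.println(WeatherReport.describe(32.0));
        System.out.println(OvenDisplay.describe(212.0));
    }
}
```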


Conclusion

Even if programmers are replaced by robots some day, the robots will have to be programmed by hand. Even if it becomes possible some day to replace all hand-written code with ready-made design patterns, these patterns will have to be programmed by hand. We can't escape. We (we!) have to be clever and avoid these things that always go wrong.




Sunday, July 22, 2018

The Java to JS Transpiler JSweet

JSweet is a Java to JS (JavaScript) translator. It generates significantly less source code than its GWT counterpart, and that source is readable, as you can try out in its sandbox (generally a 1:1 translation). JSweet reads Java source, not class files, making use of the javac compiler. Essentially it generates TypeScript code, not JS, and uses the TS compiler to generate the final JS output.

JSweet is built in a component-oriented way, which means that you can not only write libraries for it (called "candies"), you can also override the way it generates TypeScript code.

Sounds great! This blog post is about my first steps in trying out JSweet.

The Different Worlds of Java and JS

Programming languages may have different target domains.

  • Java is a general-purpose programming language. The Java world is the computer, its file system, the network, databases, .... Every operating system that provides a JVM can be accessed by Java (at least through JNI).

  • The JS world is the web browser, and a little bit of the network, cookies, and (more recently) local storage. Different browsers still provide different JS interpretations, and although these differences are shrinking, you must be able to work around them, like jQuery does.

Thus it should not be a problem to emulate JS in Java, right? Just implement a BOM (browser object model) and a DOM (document object model), some kind of AJAX, and provide browser-specific overrides, that should do it.

But what happens in JS when you open a file in Java? A browser doesn't have access to the file system of the client computer! And what happens when you open a Java AWT Frame?

GWT simply does not support these parts of the Java runtime library (rt.jar). Does JSweet?

Trying it out

I downloaded the Quickstart example using git. To transpile the contained sources, Maven (mvn) must be installed. These were my commands:

cd whereverYouTryOutThings
git clone https://github.com/cincheo/jsweet-quickstart.git
cd jsweet-quickstart
mvn generate-sources

The mvn generate-sources command runs the transpilation from Java to JS. This generates a file target/js/quickstart/QuickStart.js, transpiled from src/main/java/quickstart/QuickStart.java. That JS file is referenced in webapp/index.html, and if you load this into a web browser, you will see what QuickStart.java did.

Then I installed the Eclipse plugin, imported the quickstart Maven project, and tried to modify src/main/java/quickstart/QuickStart.java.

package quickstart;

import static def.dom.Globals.document;
import java.util.ArrayList;
import java.util.List;

public class QuickStart
{
    public static void main(String[] args) {
        final List<String> list = new ArrayList<>();
        list.add("Hello");
        list.add("World");
        document.getElementById("target").innerHTML = ""+list;
    }
}

Mind the ugly access of the public member field innerHTML. That's the way JS does it, and here it invaded Java. Making member fields public is one of the things that you should avoid in OO source code.

The plugin did not generate JS when saving the Java file, so I again had to use the command line to get my changes into JS:

mvn generate-sources

Later I found out that you must activate the plugin for the Eclipse project:

  • select the project in package explorer
  • open the context menu (right mouse button)
  • choose "Configure" - "Enable JSweet builder"

Now every "Save" of a Java file would update the corresponding JS file. Unfortunately not in the target folder but in the js folder, so either this plugin is not yet ready, or I missed something.

The transpilation result was the following target/js/quickstart/QuickStart.js. This is already ES6 rather than classic JS, as there is a class definition. The transpiler can also generate ES5, which is closer to classic JS.

/* Generated from Java with JSweet 2.2.0-SNAPSHOT - http://www.jsweet.org */
var quickstart;
(function (quickstart) {
    /**
     * This class is used within the webapp/index.html file.
     * @class
     */
    class QuickStart {
        static main(args) {
            let list = ([]);
            /* add */ (list.push("Hello") > 0);
            /* add */ (list.push("World") > 0);
            document.getElementById("target").innerHTML = "" + (a => a ? '[' + a.join(', ') + ']' : 'null')(list);
        }
    }
    quickstart.QuickStart = QuickStart;
    QuickStart["__class"] = "quickstart.QuickStart";
})(quickstart || (quickstart = {}));
quickstart.QuickStart.main(null);

Lots of boilerplate, as usual in generated code. Mind the last line where quickstart.QuickStart.main(null) gets called. Thus the page that loads the script doesn't need to initialize anything.

Here comes the webapp/index.html page that loads the script:

<html>
  <head>
    <meta charset="utf-8" />
  </head>
  <body>
    <p id="target"></p>
    <script type="text/javascript" src="../target/js/quickstart/QuickStart.js"></script>
  </body>
</html>

And here is a screenshot of the result:

Concepts

As you may know, you cannot transpile every part of the Java runtime library with GWT. JSweet has similar restrictions, as it uses a fork of the GWT Java emulation library. But it provides a way to enlarge the usable range of the Java runtime: "candies". There are candies (libraries) that refer to Java, and others that refer to JS.

  • The Java part is called j4ts (→ "Java for TypeScript"). For programming see the JavaDoc for the core browser API. I even saw candies for AWT and Swing.

  • The JS part is much bigger. It is called jsweet candies. JQuery, Angular, React, and many other JS libraries are available for integration. Thus you could now write Java code that uses these powerful JS components. But mind that the API would be JS-oriented, i.e. you are leaving encapsulation and other object-oriented principles behind. I saw lots of static and public (non-final) in the examples.

You will find more conceptual information on the JSweet specification page. If you want to see the source code of JSweet, download it from GitHub; it's all open source. There is also a nice video.

Conclusion

I am a little bit frightened by applying the JS style in Java. It took me a long time to understand and use encapsulation in Java. Should I give this up now and change to the fragile and error-prone JS style?

Nevertheless the possibility of transpiling an AWT application to the web is tempting. I will try this out.




Sunday, July 15, 2018

Java Inner Class Serialization Gotcha

Inner classes were added to Java in version 1.1. Inner classes always occur inside the curly braces of an outer class, and can occur on unlimited nesting levels. Two different kinds exist:

  1. Static inner classes
  2. Non-static inner classes

This blog post is about the difference between these two, and the sometimes fatal consequences of serializing instances of a non-static inner class. I do not cover anonymous and local classes here.

The Difference

Static inner classes (the Java Language Specification calls them "static nested classes") behave like most people would expect from an inner class. They do not require the existence of an outer "parent" object. They are like classes of a sub-package, but referenced through the outer class, and behave like normal classes, except that you can set access modifiers on them (private, protected, public, default), which is quite useful.

Such an object is constructed like the following:

public class StaticInnerClassExample
{
    private static class Cat
    {
    }
    
    public static void main(String[] args) {
        final Cat cat = new StaticInnerClassExample.Cat();
    }
}

So the construction happens by new OuterClass.InnerClass().

Non-static inner classes on the other hand require the existence of an outer "parent" object. Furthermore they also hold an invisible reference to their outer object, generated by the compiler. You can see this hidden reference in the debugger as "this$0".

The construction of a non-static inner object looks a little unusual: you must call new on the parent instance, as in new OuterClass().new InnerClass():

public class NonStaticInnerClassExample
{
    private class Cat
    {
    }
    
    public static void main(String[] args) {
        final Cat cat = new NonStaticInnerClassExample().new Cat();
    }
}

This makes the difference clear: the non-static inner class always needs an outer object to come to life.
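
If you don't want to start the debugger, the hidden reference can also be made visible through reflection. The following is a minimal sketch with hypothetical class names. It looks for compiler-generated (synthetic) fields of the outer type instead of relying on the name "this$0", which is a compiler convention. The inner class deliberately uses a field of the outer object, so that the compiler has a reason to keep the reference:

```java
import java.lang.reflect.Field;

class Outer {
    private final String name;

    Outer(String name) {
        this.name = name;
    }

    class Inner {
        // Using an outer field, so the compiler needs the outer reference here.
        String outerName() {
            return name;
        }
    }

    static class StaticInner {
    }
}

class HiddenReferenceExample {
    // Counts compiler-generated fields whose type is the outer class.
    static int countOuterReferences(Class<?> innerClass, Class<?> outerClass) {
        int count = 0;
        for (Field field : innerClass.getDeclaredFields()) {
            if (field.isSynthetic() && field.getType() == outerClass) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("Inner: " + countOuterReferences(Outer.Inner.class, Outer.class));
        System.out.println("StaticInner: " + countOuterReferences(Outer.StaticInner.class, Outer.class));
    }
}
```

With a javac-compiled class I would expect one such field for Inner and none for StaticInner.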

Java Default Serialization

What can we do with serialization? For example, in case all your classes implement Serializable, you could write the whole object graph of your application to a file before terminating it. On next startup you could load the application from that file. It then would be in exactly the same state as it was when you terminated it!

The following is about Java default serialization. Serialization of an object can be overridden by implementing readObject() and writeObject(), which I will not cover here.

Serialization reads and writes the values of the instance-fields of an object, called "the state of the object". Methods do not get serialized, they belong to the class.

If there is a reference to another object, this object gets serialized too, recursively. In case there is something in the resulting object graph that doesn't implement the interface Serializable, a NotSerializableException will be thrown.

Serialization ignores static fields, and fields tagged as transient (a Java language keyword).

Mind that serializing instances of inner classes is generally discouraged due to compiler-specific mechanisms around it.
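
That default serialization skips transient fields can be checked with a small in-memory round trip. This is a minimal sketch with a hypothetical Session class; the checked exceptions are wrapped just to keep the example short:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    final String user;          // part of the object state: serialized
    transient String password;  // transient: skipped by default serialization

    Session(String user, String password) {
        this.user = user;
        this.password = password;
    }
}

class TransientExample {
    // Writes the object into a byte array and reads a copy back from it.
    static Session roundTrip(Session session) {
        try {
            final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        }
        catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        final Session copy = roundTrip(new Session("garfield", "lasagna"));
        System.out.println(copy.user);      // garfield
        System.out.println(copy.password);  // null
    }
}
```

The copy gets its user back, but the transient password arrives as null, the default value for object references.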

Gotcha

Now that we know about inner classes and serialization we can try out what happens when you serialize an object of a non-static inner class.

import java.io.Serializable;
import java.util.Collection;
import java.util.HashSet;

public class CatPack implements Serializable
{
    public class Cat implements Serializable
    {
        private final String individualName;
        
        private Cat(String individualName) {
            this.individualName = individualName;
        }

        @Override
        public String toString() {
            return individualName+" from "+packName+", having: "+mateNames();
        }
    }
    
    private final String packName;
    private final Collection<Cat> pack = new HashSet<>();
    
    public CatPack(String packName) {
        this.packName = packName;
    }

    public Cat add(String individualName)  {
        final Cat individual = new Cat(individualName);
        pack.add(individual);
        return individual;
    }

    private String mateNames()  {
        return pack.stream()
                .map((Cat cat) -> cat.individualName)
                .reduce((String soFar, String next) -> soFar+", "+next)
                .orElse("");
    }

}

The CatPack class encapsulates its Cat members using a non-static inner class. The Cat.toString() implementation shows that non-static inner classes have access to private fields and methods of the outer class, like packName and mateNames(). The implicit pointer to the outer object provides that (static inner classes don't have one).

So far so good, but what happens when serializing a Cat instance? Will it still be able to enumerate the names of the pack cats afterwards? Here is the corresponding test code:

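import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class InnerClassSerializationGotcha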
{
    public static void main(String[] args) throws Exception {
        final CatPack pack = new CatPack("Garfield Clan");
        pack.add("Garfield");
        final CatPack.Cat individual = pack.add("Catbert");
        
        final String expected = individual.toString();
        
        final ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        final ObjectOutputStream objectOutputStream = new ObjectOutputStream(outputStream);
        objectOutputStream.writeObject(individual);
        objectOutputStream.close();
        
        final InputStream inputStream = new ByteArrayInputStream(outputStream.toByteArray());
        final ObjectInputStream objectInputStream = new ObjectInputStream(inputStream);
        final CatPack.Cat serializedIndividual = (CatPack.Cat) objectInputStream.readObject();
        
        final String result = serializedIndividual.toString();
        if (result.equals(expected) == false)
            throw new Exception("Error: instance of non-static inner class has been serialized without outer object!");
        
        System.err.println(result);
        // yields "Catbert from Garfield Clan, having: Garfield, Catbert"
    }

}

First a CatPack gets constructed. Two members get added, "Garfield" and "Catbert". The add() method returns the created Cat instance, thus we can serialize "Catbert".

We serialize through an ObjectOutputStream based on an in-memory ByteArrayOutputStream, whose bytes are then the source for the ObjectInputStream that de-serializes "Catbert". Now the question arises: what exactly has been sent over the line? Just "Catbert", or the whole pack?

The answer is: the whole CatPack went through the line! The output proves this:

Catbert from Garfield Clan, having: Garfield, Catbert

→ How could the Cat know the names of its mates after serialization when they did not travel with it?

Conclusion

Even when you know about this implicit pointer to the outer object, you will forget about it. Thus I recommend always using static inner classes unless there is a good reason not to.

This doesn't mean that I recommend the static keyword in general; inner classes are just a special case. In any other case try to avoid static, as it has lots of disadvantages.
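
As a cross-check of the gotcha, here is a variation with hypothetical classes where the outer class is deliberately not Serializable. In this sketch the non-static inner cat should then fail to serialize, because its hidden outer reference is part of its state, while the static one goes through on its own:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class Shelter {  // hypothetical outer class, deliberately NOT Serializable
    private final String shelterName;

    Shelter(String shelterName) {
        this.shelterName = shelterName;
    }

    class InnerCat implements Serializable {  // non-static: hidden Shelter reference
        final String name;

        InnerCat(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name + " from " + shelterName;  // uses the outer object
        }
    }

    static class NestedCat implements Serializable {  // static: no hidden reference
        final String name;

        NestedCat(String name) {
            this.name = name;
        }
    }
}

class StaticInnerSerialization {
    // Tries default serialization into memory and reports whether it worked.
    static boolean isSerializable(Object object) {
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(object);
            return true;
        }
        catch (NotSerializableException e) {
            return false;
        }
        catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        final Shelter shelter = new Shelter("Garfield Clan");
        System.out.println(isSerializable(shelter.new InnerCat("Catbert")));
        System.out.println(isSerializable(new Shelter.NestedCat("Catbert")));
    }
}
```

The price of the static variant is that the cat can no longer reach fields of its outer object implicitly; anything it needs must be passed in explicitly.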