Blog Archive

Wednesday, 12 March 2008

Fashion

Clothes make the man. And fashion makes the clothes. In C that was spelled "I'm too lexy for my yacc". But nowadays we implement in Java. I'm too jlexy for my javacc?
Fashion seems to be of little importance compared to the excellent progress we make every day with our great applications. But it strikes back in the end, in code maintenance, which makes up to 70% of software production efforts.

Writing source code that makes computer systems available for human needs has always been affected by fashions. Manipulating source code with the C preprocessor was a special hobby of all experts. Out came things like the MFC or MAPI sources, which sometimes even lacked the semicolon, until they didn't look like C anymore. Java has no preprocessor. Aspect-oriented languages have come along since, but these cannot seriously be compared to preprocessors.

Nevertheless Java is affected by fashions just as C was, because it is practiced by so many people. Which is a good thing! I remember a fashion that put the fields at the end of the class, like some decompilers did. It looked like this:

public class TooClassyForMyFields
{
  public TooClassyForMyFields() {
    ...
  }

  // imagine several screens of big methods here ....

  private int id;
  private String name;
}

The compiler lays this out in the class file as follows (and the Java virtual machine reads it in that order, too):

public class TooClassyForMyFields
{
  private int id;
  private String name;

  public TooClassyForMyFields() {
    ...
  }


  // imagine several screens of big methods here ....

}

But we are not writing source code for the compiler, we write for human beings! So why should we keep that technical order?
That is right, never write source for the compiler. But sometimes the technical order is the same one that human beings would prefer. When I read a class, I am first of all interested in its essence, sometimes called its "state": the member variables that it keeps and hopefully manages to keep consistent. The state models the relations to other classes and gives a good picture of what the class is about, or what its purpose is. So when I read the first lines, I get the impression that the class is about an "id" and a "name". After the fields I would like to get to know the public methods that form a sort of interface to collaborating classes. And, of course, any abstract methods, protected or public. When I need to derive from the class, I might be interested in protected methods and fields. I am not interested in private methods, because I assume that they work, so they can go to the bottom of the source. Fortunately the fields-at-the-bottom fashion was short-lived.
Besides: why is it cool to write fields at the bottom of a class?
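
A minimal sketch of the reading order described above: state first, then the public interface, then what subclasses need, then the private helpers. The class `Customer` and all of its members are hypothetical and serve only as illustration.

```java
// Hypothetical example class; all names are illustrative only.
class Customer {
    // 1. State: tells the reader what the class is about.
    private final int id;
    private String name;

    // 2. Public interface for collaborating classes.
    public Customer(int id, String name) {
        this.id = id;
        this.name = normalize(name);
    }

    public int getId() { return id; }

    public String getName() { return name; }

    public void rename(String newName) { this.name = normalize(newName); }

    // 3. Protected members: of interest when deriving from the class.
    protected String describe() { return id + ": " + name; }

    // 4. Private helpers: assumed to work, so they go to the bottom.
    private static String normalize(String raw) {
        return raw == null ? "" : raw.trim();
    }
}
```

A reader who stops after the first few lines already knows that this class is about an id and a name.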

But I'm not a fashion denier! Here are some of my favourite Java fashions:

  • Documenting source code. A nice class header, mentioning the target of the class. A heading comment for all public static final fields, and all public and protected methods. I do not like protected fields, but when they are not encapsulated, they should be commented.
  • Making long names. Writing name instead of n, and findObjectOrientation() instead of foo().
  • Inserting spaces after keywords: if (something) ... instead of if(something)..., or i += a + b instead of i+=a+b
Not popular? Not cool? Hm, that's just because they might smell of morality. But isn't that a sweet smell?
It seems that fashion always leans a little to the immoral side. So let's discuss it.

For example, there is a fashion I dislike particularly:

if (point.x == 1) builder.makeOne();
if (point.x == 1 && point.y == 2 || point.x * point.y < 5 || ......................./*soon out of sight*/...........................) builder.makeOthers();

Why?

  1. In prehistoric times, tools counted lines of code, which gave a rough impression of how big a project is. OK, we have abandoned them; nowadays we use complexity analyzers instead (but who really does?).
  2. When debugging that source I get no clear visual feedback when stepping over those two statements. The debug pointer stays on the same line after the first step.
  3. When I have a stack trace from some logfile, it contains a line number that holds TWO statements, either of which could have caused the NullPointerException. I do not know whether point was null or builder was null!

Fine, it keeps the source short and well-arranged, but it does not help on maintenance. So I prefer

if (point.x == 1)
  builder.makeOne();

Besides, I found out a cool thing about assertions.

assert x == 0 : "X must be zero!";

When you write it like that

assert x == 0
  : "X must be zero!";

you can set a breakpoint on the second line and check the field values in the debugger when the assertion is about to fail!
Besides, this is my absolute favourite fashion: always write an explanatory message after an assert!

Another thing is the good old "!" symbol from C. This is more than a fashion. It is a matter of course for a lot of people. I do not want to antagonize.

if (!condition.value)

can be written more readably as

if (condition.value == false)

It really is more readable, because it is bigger, just as long names are more verbose. We write for humans, not for the compiler. And our machines nowadays have enough memory to process all that.

What about that:

if (((x instanceof MyClass) && ((b == 2) || (c == 3) || (e == 4))) || (allTrue() && noMatter()))

I mean the many parentheses. Now the discussion gets really hot. There is no doubt that spelling out the compiler's precedence rules with parentheses is nice for beginners who do not know that "&&" (AND) binds stronger than "||" (OR). But let's take a short look at the short version:

if (x instanceof MyClass && (b == 2 || c == 3 || e == 4) || allTrue() && noMatter())

So what, do we like short code suddenly? Ok, I finish now.
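
A small check of which parentheses in such a condition are really redundant. Because "&&" binds stronger than "||", the outer grouping can go, but the parentheses around the OR chain carry meaning. The class and method names below are hypothetical; `isMy` stands in for `x instanceof MyClass`, and `all` and `no` stand in for `allTrue()` and `noMatter()`.

```java
// Hypothetical demo class; the parameters mirror the condition discussed above.
class PrecedenceDemo {
    // Fully parenthesized form.
    static boolean verbose(boolean isMy, int b, int c, int e, boolean all, boolean no) {
        return ((isMy && ((b == 2) || (c == 3) || (e == 4))) || (all && no));
    }

    // Only the parentheses the grammar needs: && binds tighter than ||.
    static boolean compact(boolean isMy, int b, int c, int e, boolean all, boolean no) {
        return isMy && (b == 2 || c == 3 || e == 4) || all && no;
    }

    // Grouping around the OR chain dropped as well: this is a different condition.
    static boolean tooShort(boolean isMy, int b, int c, int e, boolean all, boolean no) {
        return isMy && b == 2 || c == 3 || e == 4 || all && no;
    }

    public static void main(String[] args) {
        // With isMy == false and c == 3 the first two forms agree, the third does not.
        System.out.println(verbose(false, 0, 3, 0, false, false));
        System.out.println(compact(false, 0, 3, 0, false, false));
        System.out.println(tooShort(false, 0, 3, 0, false, false));
    }
}
```

So the short form is only safe as long as every removed pair of parentheses merely repeated what the precedence rules already say.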

We see that everybody has his/her own fashion. Writing source code is not objective. It is a little like a journalist's work. If only it were so easy: writing, getting some fee for it, being yesterday's paper tomorrow, bothering nobody. But not so in code maintenance!

Modern source code control and versioning systems make it possible to format every source file that is checked in. The result is that the author revolts and insists on his/her coding convention, because it is easy to read for him/her, and he/she is working on that code. That is human, and it is not counterproductive (except on maintenance by other people, ahem).
So we would need a system that delivers a source to developer A in A's preferred style, to developer B in B's preferred style, ... and when they check it in, it is formatted to the project conventions again. Imagine checking in a big refactoring with hundreds of files ...
The other thing is that versioning systems are still line-oriented. That means if you look for differences you might find all lines changed, just because someone formatted your beloved source to make it more readable. Readable for whom?

Fashion is nice if many people find common sense in it for writing homogeneous source code. Let's hope that fashions will go a nice way. The dominance of C* is over. The pain about the lack of a preprocessor is forgotten. Good times are rising. Give me hope, Joanna.


_____________________________________________________________________________________

Friday, 7 March 2008

Building Blocks

When we were children, we had wooden pieces or stones of some kind, and we modelled the world in our nursery. We built houses, towers, bridges, streets. Our brothers and sisters may have been with us; together with our parents we had some kind of family. Our family lived in some village or town, which in turn was part of a district or country, under the flag of a state. A lot of nations populate this planet we call Earth.
So it is no wonder that human thinking is primarily hierarchical. Our way to solve problems is to decompose them into smaller ones that we can solve separately. "Divide et impera", the old Romans taught us. Remember hierarchical databases, which suffered from redundancy; yet some kinds of registries and filesystems still follow that plan.

What is not so easy to understand is that hierarchy is not enough. We wanted to build a bridge from our building blocks, but it collapsed all the time. Our father was not able to calculate weights, widths and angles, and neither was the mayor of our town or the president of our state. But our neighbour was, because he was a stress analyst who had studied mathematics. So our efforts to solve the problem by hierarchical requests failed, but relations brought the solution. He came with his calculator, and our bridge stood firm - until mother came in ...
It would be a poor life without relations. Relational databases took over. We try to avoid redundancy where we can; OnceAndOnlyOnce is the principle we are after. We change from WinWord to DocBook or DITA when we write bigger documents and want to reuse pieces of them in other articles. We make them available for relations.

So what about the Java programmer? We define fields and methods in classes. Methods encapsulate the fields. Classes encapsulate both. Interfaces abstract classes. But what encapsulates classes and interfaces? Packages? What are packages exactly? Mostly they are a quite arbitrary grouping of related classes. Good software designers access packages through interfaces only, but in practice this is hard to sustain. When you print out the package dependency graph of your project, do you see a clear hierarchy?
From the view of UML, packages are deployment units. Administrators are expected to build a customized application by using different packages. Java supports this by mapping a package to a directory in a hierarchical filesystem. Moreover, Java provides package visibility for fields, methods and classes, the so-called default access (when there is no private, protected or public keyword).
But we see that there is no good hierarchical item above classes. There is a JSR about "Superpackages" on the way, google for "Strawman Proposal for JSR 294 Superpackages". This might become part of the Java programming language, and then you can define package dependencies in separate specification files. Will it help?
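
What "accessing a package through interfaces only" can look like with nothing but default access is sketched below. All names (`Greeter`, `PoliteGreeter`, `Greeters`) are hypothetical. In a real project the three types would share one package, and only the interface and the factory would be declared public; here everything is left without `public` so that the sketch fits into a single file.

```java
// The contract that code outside the (hypothetical) package would program against.
interface Greeter {
    String greet(String name);
}

// Default access: the implementation stays invisible outside its package.
class PoliteGreeter implements Greeter {
    @Override
    public String greet(String name) {
        return "Good day, " + name;
    }
}

// The single entry point; callers never name the implementation class.
final class Greeters {
    private Greeters() { }

    static Greeter create() {
        return new PoliteGreeter();
    }
}
```

Nothing in the language forces other packages to go through `Greeters`, though: as soon as someone declares `PoliteGreeter` public, the package boundary is open again.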

Again we expect help from hierarchies. What about relations? Applications consist of both relational and hierarchical constructs.

Hierarchical:

  • The "Is-A" hierarchy: classes extend other classes, interfaces can extend even several other interfaces.
  • The "Has-A" hierarchy in concrete and abstract level:
    • classes contain fields that are references to other classes,
    • both classes and interfaces can contain inner classes and interfaces (a way to simulate packages).

Relational:

  • Classes work together. They construct local variables of other types on the fly, sometimes even by dynamic class-loading when some concrete service implementation is not known at programming time. Remember platform-specific device drivers.
  • Utility classes work together with all levels of hierarchies. Mostly they are static method collections which store no state to instance fields, and thus are quite robust and universal.
  • Cross-cutting concerns are things like access control, logging, loading of language-specific texts, debugging etc. These concerns can appear everywhere. Imagine that you need to check access rights in every method, because you sell your application by features.
  • Usage of external resources like databases, operating systems, network, printers.
  • You try to speed up long-running work by using threads that run concurrently and must be synchronized.
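
The dynamic class-loading mentioned in the first item can be sketched with plain JDK reflection. The helper class `ServiceLoading` and its `load` method are hypothetical; in a real application the class name would come from configuration rather than from a string literal.

```java
// Hypothetical helper; only the reflection calls themselves are JDK API.
class ServiceLoading {
    // Loads an implementation by name and checks it against the expected interface.
    static <T> T load(String className, Class<T> serviceType) throws ReflectiveOperationException {
        Class<?> raw = Class.forName(className);
        // Throws ClassCastException if the class does not implement serviceType.
        Class<? extends T> impl = raw.asSubclass(serviceType);
        // Requires an accessible no-argument constructor.
        return impl.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // The concrete type is chosen at runtime; the caller only knows the interface.
        java.util.List<?> list = load("java.util.LinkedList", java.util.List.class);
        System.out.println(list.getClass().getName());
    }
}
```

The caller depends only on the interface, so this relation between classes does not show up in any compile-time hierarchy.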

So what else do we need? Is this not enough to model the world?
The problem arises when maintenance takes over. The application is sold and starts to live. Versioning takes place. Parts of it have to be exchanged. The GUI is reworked every month. New database adapters have to be written; some databases turn out to be exotic and do not fit our access interface. Bugs are reported and prove that some concepts were not the best.

Software maintenance claims up to seventy percent of the software production effort. That is a lot. Shouldn't we think about a maintainable implementation of our product? But: are object-oriented languages not optimized for maintenance? Or: have we hired bad designers and programmers?
Requirements change over time, so even the best designers and programmers might have failed. Let's rather look closer at what the language lacks. A project consists of thousands of classes, at least one million lines of code for an average project, with the unclear package concept on top. That is a lot. And complexity goes further, into the depths. Nothing enforces that statements which have to run in a certain order are put into a method of their own. A programmer changes the order and causes a bug. Nobody commented on why a certain statement has to be here, someone removed it because it did not seem logical in that context: bug.
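
One way to protect such an order, sketched with a hypothetical `ReportJob` class: put the order-dependent steps into a single method, say why each one sits where it does, and keep the steps themselves private. The step names and the string log are stand-ins for real work.

```java
// Hypothetical example; the steps only record that they ran.
class ReportJob {
    private final StringBuilder log = new StringBuilder();

    // The only public entry point: callers cannot reorder or skip a step.
    public String run() {
        open();   // must come first: the later steps need the resource
        write();  // must run between open() and close()
        close();  // must come last: releases the resource
        return log.toString();
    }

    private void open() { log.append("open;"); }

    private void write() { log.append("write;"); }

    private void close() { log.append("close;"); }
}
```

The comments do the part the language cannot: they tell the next maintainer why the order matters.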

Complexity is a monster at the gate of chaos. We can't make money with chaotic software. We need well-specified and repeatably testable building blocks from which we can build the application. It turns out that we need something above classes (and above packages, which proved to be unclear). Besides, we need good coding conventions, design and programming principles, and regular training and communication for developers to overcome the problems in the depths.

Assuming that components would ease our fight against complexity, we will find opinions about software components on the web.
A component ...

Catalysis

  1. is a coherent package of software that can be independently developed and delivered as a unit, and that defines interfaces by which it can be composed with other components to provide and use services.
  2. There is a very sharp distinction between the external interfaces of a component and its internal design and implementation.

UML

  1. has a Specification
  2. has an implementation
  3. conforms to a standard
  4. can be packaged
  5. can be deployed

There are component frameworks around for Java, the most notable are (open source):
  • OSGi (Eclipse plug-ins are built on this) - interface driven, implementations are loadable and exchangeable at runtime, mostly programmed by static factories using OSGi utilities.
  • Spring - XML and interface driven, concrete implementation classnames and values for fields and parameters are written in XML documents.

We see this is evolving. We have to wait a little.
Meanwhile we could think about what will be our requirements for those upcoming components. I made my list of expectations, and I am quite sure that we will need some more terms than just 'component' for that, or split that word into several others.
Now look, I want to divide my application into the following aspects:
  1. Architecture: when some functionality is in a certain "three-tier-architecture" layer (client/server/persistence), which might mean it runs on a different machine in the network. Classical architectural components are
    • user interface logic
    • business logic
    • persistence logic

  2. Environment: when different environments require specialized behaviours, e.g. for different GUI environments you might need the AWT, Swing, SWT, or even web browser component for your client. Environments include operating systems, database products, legacy applications, rendering devices (printers, screens, PDAs, ...), integrated scientific expert modules, and so on. This is more than just writing adapters.

  3. Separation of concerns: (1) software-technical knowledge and (2) professional business-expert knowledge should be implemented in separate components.

  4. Abstractions (frameworks): a bundle of related classes offers reusable functionality, accessible by override or delegation mechanisms.

  5. Runtime binding (dynamic binding): when the criteria for loading a certain class(-graph) are not known at development time and must be deferred to runtime (e.g. you might not know on which GUI environment the application will run). The class graph created by a static factory is a component.

  6. Concurrency: one (type of) thread is one component.

  7. Global availability: when a piece of logic will be needed by a lot of other components everywhere across all architectural and environmental borders. I want to attach logging, access control and other cross-cutting concerns by adding a component. Can we call these "Commons"?
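
Item 7 can be approximated today with a JDK dynamic proxy, which attaches a concern such as logging from the outside without touching the business class. This is only a sketch under assumptions: `OrderService`, `SimpleOrderService` and `CallLogging` are hypothetical names, and a `StringBuilder` stands in for a real logging facility.

```java
// Hypothetical business interface and implementation.
interface OrderService {
    int priceOf(String item);
}

class SimpleOrderService implements OrderService {
    @Override
    public int priceOf(String item) {
        return item.length() * 10; // stand-in business rule
    }
}

// The cross-cutting concern lives in its own class.
class CallLogging {
    // Wraps an OrderService so that every call is recorded in the given log.
    static OrderService wrap(OrderService target, StringBuilder log) {
        java.lang.reflect.InvocationHandler handler = (proxy, method, args) -> {
            log.append(method.getName()).append('\n');
            try {
                return method.invoke(target, args);
            } catch (java.lang.reflect.InvocationTargetException ex) {
                throw ex.getCause(); // rethrow the target's own exception
            }
        };
        return (OrderService) java.lang.reflect.Proxy.newProxyInstance(
                OrderService.class.getClassLoader(),
                new Class<?>[] { OrderService.class },
                handler);
    }
}
```

A caller would write `OrderService service = CallLogging.wrap(new SimpleOrderService(), log);` and use `service` like any other implementation. This works per interface only, which is exactly why a real "Commons" component would need more than this.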

That's all. Oh my God.
Complexity is a monster at the gate of chaos. What is a component? Where are the times we made it easy with building blocks?



_____________________________________________________________________________________