
Thursday, May 30, 2024

Thinking About a Future Development Environment

In this blog post (my last!) I'd like to fantasize about a future development environment. I have been a convinced Java programmer for 27 years (with a past in PASCAL, C, C++ and others), but I believe that Java is slowly coming to an end. It has become too complex, not because of recent features like generics or the functional extensions, but because of accumulated technical legacy, reflection, runtime-retained annotations, and finally libraries like Spring that turn Java into a runtime-vulnerable scripting language and expose dependency and deployment problems. Some responsibility has to move from the developer (dev) to other roles like devops and operators (ops).



Not Covered Here

  • I will not think about domain-specific languages (DSL), as of course you can construct an easy-to-use programming language for every special purpose (domain), but it will be hard to understand for people unfamiliar with that domain. I don't think that a general purpose language of the future will need to provide embedded DSLs (like Scala does), because the resulting source-code would leave any common sense behind.

  • I am not going to argue about memory management and object reference counting, these topics have been analysed and implemented for decades. A garbage-collector is a must-have nowadays, don't let programmers do that.

  • I am not going to discuss the coolest keyword abbreviations like "fun", "func", "def" or "proc", they must be pronounced "function" or "procedure", nothing else! Only long names will motivate programmers to do the same, and this is sorely needed. "Write less, do more" is somewhat different. Whatever you write should have quality, i.e. be easily understandable!

Hypes

Some talk about the amount of source-code needed to express something. A programming language is considered to be good when the Quicksort algorithm is implementable in very few lines, see Scala. I believe this is not important, comprehensibility will be the main concern of the general programming language of the future. We don't need something that serves the machine, we need something that serves humans.

People like to use luxury tools that have all imaginable features. Multiple inheritance (a class can inherit from more than one class) is a sometimes desirable feature, but it opens up ambiguities about the inherited fields and methods, and thus makes the source-code hard to understand. I would call multiple inheritance feature-craziness. You aren't gonna need it.
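Java forbids multiple class inheritance, but its interface default methods show the ambiguity in miniature. The following is a minimal sketch with hypothetical Printer and Scanner interfaces (invented for illustration, not taken from any library). The compiler accepts MultiFunctionDevice only because the class resolves the conflicting describe() methods explicitly.

```java
public class DiamondDemo {

    // Two hypothetical, unrelated interfaces that happen to share a method name.
    interface Printer {
        default String describe() { return "prints documents"; }
    }

    interface Scanner {
        default String describe() { return "scans documents"; }
    }

    // Without the override below, javac rejects this class, because it
    // would inherit two unrelated default implementations of describe().
    static class MultiFunctionDevice implements Printer, Scanner {
        @Override
        public String describe() {
            // The reader has to follow this manual resolution to know what happens.
            return Printer.super.describe() + " and " + Scanner.super.describe();
        }
    }

    public static void main(String[] args) {
        System.out.println(new MultiFunctionDevice().describe());
    }
}
```

Even in this tiny case the reader must look up three places to learn what describe() returns, which is the comprehensibility cost meant above.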

Python dropped the tokens that mark block boundaries (like curly braces or BEGIN and END keywords); it uses indentation only. The result is that Python source-code has just indentation, while other languages have both. Which is more readable? You may argue that less text can be read more quickly, but I argue that having both tokens and indentation makes it safer to read. Having start- and end-markers is an old telecommunication principle. Less is not always more. The same applies to the ES6 / JavaScript semicolon.

People talk about the null pointer fault, and that Kotlin fixed it. This is not quite true: Kotlin programmers still have to explicitly guard against dereferencing a reference that may be null. Kotlin just provides an inconspicuous syntax for doing so, at the price of readability.

There are a lot more hypes around. Don't let the media fool you. Judge by actions, not by words.

Mistakes Made by Language Designers

  1. Try to please programmers.
    A programming language should please those that do not write source-code, and restrict the freedom of programmers, so that they can not "code" solutions down to the point of illegibility. Programming languages are not for "coding", this is a very negative legacy term that should not be used any more.

  2. Skip compile-time checks (favor interpreter- or script-languages with runtime-checks only).
    You can't keep track of many thousands of lines of source-code by hand; you need a compiler to automate this. The alternative to compiler safety is covering 100% of the code with unit-tests. But this is a kind of code-duplication, and it consumes about 1/3 of the implementation effort. Nothing against unit-testing, but it should not become an obligation just to make up for a missing compiler.

  3. Let programmers express things in many ways.
    E.g. Perl has a trailing "unless" keyword that can be used instead of a leading "if". Source should be unambiguous and precise, not confusing. We expect the condition to be before the statement, not afterwards. The programming language of the future will allow only one way to express something. Freedom is the last thing I'd want in source-code that should represent common sense.

  4. Promote dialects and cryptography.
    Scala lets you define symbols as method names. This enables developers to create their personal dialect that is hardly readable for others. A programming language must protect itself against abuse. Source-code is common property and must be easily readable for everybody. Any kind of pre-processor (like cpp in C) would undermine that, too.

  5. Leave out protection mechanisms.
    Access modifiers like private, protected and public are there to let programmers protect their source-code against (intended or unintended) abuse. In our times of heavy code reuse they should be available everywhere, and be a medium of expression for every programmer.

  6. Use abbreviations as keywords.
    Having "fun" as keyword motivates programmers to do the same with field- and method-names.

  7. Target experts, not the masses.
    Specialist circles mostly are closed societies with limited lifetime.

Software Production Problems

Programming units are (thinking in Java, from small to big):

  1. Lambdas (closures, anonymous functions, methods without name)

  2. Methods (functions)
    contain conditions, loops, assignments, and calls to other methods

  3. Classes (templates for objects)
    contain fields, i.e. class-related constants and variables (possibly also relating to other classes), and methods working on them

  4. Packages (Java tradition that relates to access modifiers of classes, fields and methods)
    contain classes that are related to each other somehow

  5. Modules (since Java 9, JPMS)
    contain packages and their access rules.
    Some say that also methods and classes, and even closures, are "modules", because they encapsulate complex internals by access modifiers, and communicate with the outer world via interface contracts (lambdas get translated into classes by the Java compiler)

  6. Applications
    are special cases of modules, containing other modules and source code that binds them together to something that humans could use

The more programming units we have, the more complex software production will be.

Complexity

It takes many thousands of lines of source-code to create a useful computer application, and hundreds of dependencies that must be present at runtime. Complexity is the beast we are fighting with, and it is controllable only by means of tools that we can trust.

The object-oriented idea recommended to encapsulate fields and methods into classes, and that classes should not expose variable fields at all, and just those methods that are really needed outside. This helped to reduce complexity, because then a class-user needs no knowledge of class-internal mechanisms.

Maintaining source-code means reading, understanding and remembering its meaning. If we can't understand the source-code of an application, we won't be able to control what it does. The "Magical Number Seven, Plus or Minus Two" rule comes from psychology and shows us our human limits: we can not hold more than 7 ± 2 elements of a container in mind. If we can not remember the parts, we can not remember the whole thing. Thus there should be no more than 7 statements in a method, 7 methods in a class, 7 classes in a package, 7 packages in a module. It sounds unfeasible, but it actually connects to our humanity.

Thinking this way, some questions arise: Can there be more than 7 modules (software-components) in an application? Do we need more levels than classes and modules to build an application? Or shouldn't we rather think of a new kind of atomic OO-class that also encapsulates the purposes of packages and modules, and shouldn't we also automate the component assembly, so that we can put thousands of components into an application? If a class were a software-component, we'd have fewer levels of programming units where human understanding is necessary. (By the way, that was the goal of the Law of Demeter class: to be replaceable without dependency violation risks, an interesting experiment.)

Dependencies through Code Reuse

By integrating a library that e.g. implements an HTML to PDF conversion for you, you create a dependency on a provider that may give up bug-fixing and feature-request support at some point in time, or change the interface contracts of the library completely due to refactoring. The free publicly available code-sharing repositories (like Maven Central for Java, or CPAN for Perl) work quite well, but this could change. Remember the iText library for PDF that went commercial, or the Jakarta initiative that renamed packages. Worst case: when the Java class specification gives up backward-compatibility, the whole open-source world has to be recompiled!

So how do we build our castles from sand in that dependency hell?

  1. Imports
    These are source statements on top of each class. We create packages. A class inside a package doesn't need to explicitly import classes in the same package, but it needs to do that for classes in other packages. The hierarchical shape of packages does not influence their availability, that means classes of a package "a.b" can also be imported by a sub-package "a.b.c" (which is not quite intuitive).

  2. Compile Dependencies
    Not part of the Java language. We have Maven pom.xml files that contain the artifact coordinates for all libraries needed. Maven gives the compiler the CLASSPATH to enable type checking against all public external classes. It can also run unit-tests, and build the application (library, module) from the resulting classes, as .jar or .war file (deployment unit).

  3. Dependency Injection (DI)
    Currently done by annotations inside the source-code. As soon as the application starts up, the application-context is built from classes annotated as "beans". Traditionally these are data-access-objects (DAO, persistence layer) or services (business logic classes). Generally speaking, an implementation for an interface is searched for and loaded by the DI container (e.g. Spring). The Java compiler does not check whether an implementation is present for an interface. The Maven application-builder does not know about dependency injection; hopefully all needed dependencies have been added by a developer. If that went wrong (deployment), the application may crash at startup time (singleton beans), or as soon as some runtime-bound prototype bean gets allocated and does not find its dependencies. Currently more time goes into fixing the application-context (bean container) than into fixing compile errors.
    It must be mentioned that injection is also possible without Spring, through the (now simplified) Java service-provider mechanism (the ServiceLoader class).

  4. Module Access Rules
    Finally modules were added to the Java environment. They restrict reflection, access to classes from other modules, and also names of packages. A module-info.java file declares requirements (dependencies) on other modules and exports of the module's own packages. Only exported packages can be accessed by other modules. Parts of this are checked by the compiler, parts by the Java virtual machine. The module-info.java syntax reads like a separate small language, although module declarations are defined in the Java Language Specification.

These are our dependency definition levels in the Java world (without mentioning OSGi). Imports and compile dependencies are already covered (automated, checked) by what is called "Maven Build", but I wonder how dependency-injection will fit together with module access-rules.

Java packages are obsolete, but can not be removed because they relate to access modifiers in source-code. Modules are the better packages, but do not relate to a source-code folder structure (like the standard Maven directory layout), and provide no deployment tool either. Dependency injection is completely up to the developer, the compiler does not care if an implementation for an interface is present or not.
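The point that the compiler does not care about missing implementations can be shown with the ServiceLoader mentioned above. This is a minimal sketch, assuming it is run from the class path (not as a named module) and that no provider is registered; PdfConverter is a hypothetical interface invented for illustration.

```java
import java.util.Optional;
import java.util.ServiceLoader;

public class MissingProviderDemo {

    /** Hypothetical service interface; no implementation is registered anywhere. */
    public interface PdfConverter {
        byte[] convert(String html);
    }

    /** Asks the ServiceLoader for the first registered implementation, if any. */
    static Optional<PdfConverter> findConverter() {
        return ServiceLoader.load(PdfConverter.class).findFirst();
    }

    public static void main(String[] args) {
        // This compiles without complaint, although nobody implements PdfConverter.
        Optional<PdfConverter> converter = findConverter();
        if (converter.isPresent()) {
            System.out.println("Found " + converter.get().getClass().getName());
        } else {
            System.out.println("No PdfConverter implementation available at runtime");
        }
    }
}
```

Whether an implementation exists is decided by what happens to be deployed, so the gap shows up only when the application runs.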

Runtime Binding and Configuration

The phases every software production goes through are (all kinds of test ignored)

  1. implementation (programming)
  2. assembly (build)
  3. deployment (installation)
  4. configuration (customization)

The Java compiler checks the source-code and its calls to libraries used by the compile-unit (normally a Maven dependency-tree). Libraries do not expose source-code, so the compiler does not dive deeper to check their correctness and completeness. It happens that libraries load their dependencies dynamically via reflection, which means they may have hardcoded class names of other libraries within them, including package names, sometimes even method names. So, if everything is fine at compile-time, it doesn't mean that the application will start up and work. And even when the unit- and integration-tests were fine, it doesn't mean that the application will work at a customer's installation: deployment and configuration may still bring it down.
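A minimal sketch of such a hidden dependency: the class name below is a hypothetical example of what a library might hard-code, and it deliberately does not exist. The compiler sees only a string, so the problem surfaces at runtime.

```java
public class HiddenDependencyDemo {

    // Hypothetical class name a library might hard-code; it is not on the class path.
    static final String RENDERER_CLASS = "org.example.pdf.Renderer";

    /** Returns true if the named class can be loaded at runtime. */
    static boolean isAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both lines compile equally well; only the runtime knows the difference.
        System.out.println("java.lang.String available: " + isAvailable("java.lang.String"));
        System.out.println(RENDERER_CLASS + " available: " + isAvailable(RENDERER_CLASS));
    }
}
```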

In the Java world, modules are what is sometimes called a "software component". These are deployment units (JAR files), replaceable at a customer's installation. This implies that installations can be different per customer, which opens up a support problem: bugs may occur at one customer while all others work perfectly. If you think that over, you may come to the conclusion that the version number of an application needs to be calculated from the version numbers of all contained components. Currently developers set that version number in the hope that no one will manipulate any installation, and thus violate dependencies. Theory and practice diverge here.
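One way such a calculated version could look is a fingerprint over all component versions. This is only a sketch under my own assumptions: the component names, the "name=version" canonical form and the 12-character SHA-256 prefix are invented for illustration, not an established scheme.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class InstallationFingerprint {

    /** Derives a short, order-independent fingerprint from component name/version pairs. */
    static String fingerprint(Map<String, String> componentVersions) {
        // Sort by component name so the result does not depend on insertion order.
        StringBuilder canonical = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(componentVersions).entrySet()) {
            canonical.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            // Keep the first 6 bytes as 12 hex characters (an arbitrary choice).
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 6; i++) {
                hex.append(String.format("%02x", hash[i]));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> shipped = new LinkedHashMap<>();
        shipped.put("core", "2.1.0");
        shipped.put("pdf-export", "1.4.2");

        Map<String, String> patchedAtCustomer = new LinkedHashMap<>(shipped);
        patchedAtCustomer.put("pdf-export", "1.4.3"); // one JAR replaced on site

        System.out.println("shipped:  " + fingerprint(shipped));
        System.out.println("customer: " + fingerprint(patchedAtCustomer));
    }
}
```

A support engineer could compare the two fingerprints and see immediately that the installation no longer matches what was shipped.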

Since the times of safe statically bound applications written in C, the trend towards flexible dynamically bound applications has become stronger and stronger. Java once was called "Smalltalk for the masses" by James Gosling. Smalltalk is dynamically bound, while Java is a statically typed language, checked by the compiler. In Java, reflection is just an option for programmers, but it is used heavily by libraries like Spring that implement dependency-injection, practicing "inversion of control" (IOC). That means it is not the developer who decides which class is used to instantiate an interface implementation; instead the dependency-injection container does this at runtime by configuration. Caution: configuration can be different per customer!
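The principle can be sketched without any container. The following is not how Spring works internally, just a hand-made illustration; the Greeter interface, the "greeter.class" property key and the misspelled class name are all hypothetical.

```java
import java.util.Properties;

public class ConfiguredBindingDemo {

    /** Hypothetical interface whose implementation is chosen by configuration. */
    public interface Greeter {
        String greet(String name);
    }

    public static class PoliteGreeter implements Greeter {
        @Override
        public String greet(String name) { return "Good day, " + name; }
    }

    // Hypothetical configuration key naming the implementation class.
    static final String KEY = "greeter.class";

    /** Instantiates whatever class the configuration names; nothing is checked at compile-time. */
    static Greeter bind(Properties config) throws ReflectiveOperationException {
        Class<?> implementation = Class.forName(config.getProperty(KEY));
        return (Greeter) implementation.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        Properties customerA = new Properties();
        customerA.setProperty(KEY, PoliteGreeter.class.getName());
        System.out.println(bind(customerA).greet("Ada"));

        Properties customerB = new Properties();
        customerB.setProperty(KEY, "com.example.MisspelledGreeter"); // typo made on site
        try {
            bind(customerB);
        } catch (ClassNotFoundException e) {
            System.out.println("Customer B fails only at runtime: " + e.getMessage());
        }
    }
}
```

Both customers run the same compiled application, yet only one of the installations works.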

Configuration always was a world parallel to development. Although developers have to implement all configuration options, the customer-specific configuration values mostly are set by some consultant. There were times when "configuration manager" was a job designation. Mature widely used software always is highly configurable.

Configurators mostly work with name-value pairs (properties) provided by developers. Until now there is no other standardized configuration language or environment in the Java world. Maybe creating such was one of the original ideas behind Spring, but it is not reality, Spring XML is obsolete and was moved to developer source bound annotations, and such annotations are not changeable at runtime (like it was with XML).

Can a New Programming Language Solve these Problems?

Let's summarize goals:

  • Encapsulation of complexity (programming units with access modifiers)
  • Deployment safety (configuration / dependency injection without risks)
  • Runtime flexibility (replaceable software components)

There were experiments with aspect-oriented programming between 2006 and 2010, but nothing usable resulted from them except AspectJ, which I would not call an aspect-oriented programming language, because it always works on top of some other programming language. A real aspect-oriented programming language would not need things like pointcuts (referring to the underlying language); it would let you implement algorithms as well as define the related aspects they can work together with. Some bind tool would then build an application from a repository of aspects.

What people do with Java annotations currently is quite similar to aspects. They let annotations refer to classes that implement a certain logic or algorithm. These annotations are then processed at runtime and dynamically bind together the application. (Mind that this is compile-safe only when class-references are used instead of fully-qualified class-name strings.)
Originally annotations were intended for cross-cutting concerns, but you can regard everything as a cross-cutting concern, and you can regard a cross-cutting concern as an aspect.
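The difference between class-references and class-name strings in annotations can be sketched as follows. All names here (Validator, ValidatedBy, ValidatedByName and the misspelled class name) are hypothetical and invented for illustration; the resolve method stands in for whatever framework processes the annotations at runtime.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class AnnotationBindingDemo {

    public interface Validator {
        boolean isValid(String value);
    }

    public static class NotEmptyValidator implements Validator {
        @Override
        public boolean isValid(String value) { return value != null && !value.isEmpty(); }
    }

    /** Refers to the logic by class-reference: the compiler checks that the class exists. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ValidatedBy {
        Class<? extends Validator> value();
    }

    /** Refers to the logic by class-name string: the compiler checks nothing. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface ValidatedByName {
        String value();
    }

    @ValidatedBy(NotEmptyValidator.class)
    static class SafeForm { }

    @ValidatedByName("com.example.NotEmtpyValidator") // the typo compiles fine
    static class FragileForm { }

    /** Plays the role of a framework: reads the annotation reflectively and binds the logic. */
    static Validator resolve(Class<?> annotated) throws ReflectiveOperationException {
        ValidatedBy byClass = annotated.getAnnotation(ValidatedBy.class);
        if (byClass != null) {
            return byClass.value().getDeclaredConstructor().newInstance();
        }
        ValidatedByName byName = annotated.getAnnotation(ValidatedByName.class);
        if (byName == null) {
            throw new IllegalArgumentException("No validator annotation on " + annotated.getName());
        }
        return (Validator) Class.forName(byName.value()).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println("SafeForm accepts \"x\": " + resolve(SafeForm.class).isValid("x"));
        try {
            resolve(FragileForm.class);
        } catch (ClassNotFoundException e) {
            System.out.println("FragileForm fails at runtime: " + e.getMessage());
        }
    }
}
```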

Unfortunately the annotation programming-style makes source-code really hard to understand. You need to know the semantics of each annotation to be able to understand what the source-code does. This is like the operator-overloading of C++ or Scala. Mostly there is no directly implemented control-flow you can follow. All runtime-retained annotations are evaluated through reflection, possibly bypassing compiler control.

This is the fate of Java: it will die due to uncontrollable runtime-binding and the entanglement of conflicting dependency definition levels. The Java world uses different means of expression for implementation, build, deployment. If you want to separate implementation from configuration completely, configuration needs to have its own programming language, and has to be done via module selection, overriding deployment. Thus modules would need to be very small, and lots of business logic responsibility would shift from implementation to deployment. In such a world it might be impossible to separate deployment from configuration.

Dependency injection always should work through Java interfaces. If a deployer or configurator wants to replace an interface implementation provided by an application, that replacement must fulfill the interface. But Java interfaces do not specify completely how to implement an interface. If there are more methods than one in an interface, there are no means to express the order in which the methods have to be called. A future programming language somehow must fix this problem.
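A minimal sketch of the call-order problem, using a hypothetical Channel interface invented for illustration: the intended order is open(), send(), close(), but the interface can not say so, and a wrong order is detected at runtime at best.

```java
public class CallOrderDemo {

    /** Hypothetical interface: open() must precede send(), but no signature expresses that. */
    public interface Channel {
        void open();
        void send(String message);
        void close();
    }

    /** An implementation can only enforce the order with a runtime check. */
    public static class InMemoryChannel implements Channel {
        private boolean isOpen;
        private final StringBuilder sent = new StringBuilder();

        @Override
        public void open() { isOpen = true; }

        @Override
        public void send(String message) {
            if (!isOpen) {
                throw new IllegalStateException("send() called before open()");
            }
            sent.append(message);
        }

        @Override
        public void close() { isOpen = false; }

        public String sent() { return sent.toString(); }
    }

    public static void main(String[] args) {
        Channel channel = new InMemoryChannel();
        try {
            channel.send("hello"); // wrong order, yet it compiles
        } catch (IllegalStateException e) {
            System.out.println("Detected only at runtime: " + e.getMessage());
        }
        channel.open();
        channel.send("hello");
        channel.close();
    }
}
```

A replacement implementation provided by a deployer could just as well skip the check, and then the wrong order would go unnoticed.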

The answer (I would give to the title of this chapter) is:

  • No, a new programming language alone can not solve these problems. It needs a whole development environment, including runtime libraries and tools, similar to what the Java Development Kit (JDK) is. A programming language is just a part of such an environment.

Conclusion

If a new development environment

  • can untangle the access-modifier / package / module problem that Java currently has

  • reduces the dependency-hell to just one DRY import statement and provides corresponding build- and deployment-tools that also check service implementation (dependency injection) availability

  • is platform (operating-system) independent

  • has a runtime library that provides user-interfaces that run on
    1. desktops
    2. web-browsers
    3. mobile devices

it will be adopted and used immediately by the world, like Java was.

Divide et impera. Divide and rule.
This concept would inspire one programming language for algorithms (dev), another one for assembly / build (devops), another one for deployment and configuration (ops). While currently the key-player for all bug- or feature-requests is the developer (dev), lots of responsibility then will move to other software production roles. The resulting challenge would be to spread business knowledge across all participating roles without loss of productivity.

When the responsibility for deployment and configuration moves away from the developer, a programming language of the future can be simpler than Java and leave out runtime-retained annotations. If the Java trend to bind classes together at runtime was not just a fashion, then aspect-oriented programming, together with a good bind-tool (beyond the compiler), may become standard.

Finally the term "Software Component" should be defined and specified exactly. This term has been going around for decades, but it is still unclear what people mean when they talk about it. And most likely the term "Software Container" will then need its own differentiation and specification, too.