Blog Archive

Friday, January 22, 2021

Why Java Static on Methods Is Bad

In this article I'd like to argue about the Java static keyword. static is a modifier that binds a field or method to its class instead of to its object instances. Using statics has become an unpleasant obsession, because implementing with globals is much easier than following the object-oriented idea of small, comprehensible and controllable scopes. Java is OO, and we should not let ourselves be thrown back to structured programming as practiced in old languages like C.

Mind that the static keyword on nested classes has completely different semantics. There it's not bad, it's actually good: it makes instances of the nested class independent of the outer class instance, because they don't hold a (hidden) reference to the outer object. The garbage collector can easily collect such instances, whereas it can never collect instances of non-static inner classes as long as the outer instance is still in use.
Also, the static keyword on import statements is not the subject of this article.
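A minimal sketch of the difference (names made up for illustration):

public class Outer
{
    private int state;

    public static class StaticNested
    {
        // no hidden reference to an Outer instance, collectable independently of it
    }

    public class Inner
    {
        public int readState()    {
            return state;    // works only through the hidden Outer.this reference
        }
    }
}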

I won't discuss static fields at length. Anybody who has used Java for some years will tell you that you should not use static non-final fields; always make them final. Read these quotes from Stack Overflow:

Quotes

  • You cannot override static methods. They don't participate in runtime polymorphism. Interfaces and abstract classes only define non-static methods. Static methods thus don't fit very well into inheritance.
  • Statics have a lifetime that matches the entire runtime of the program. The memory consumed by all those static variables can never be garbage collected.
  • In object-oriented programming, each object has its own state, represented by non-static instance variables. Static variables, on the other hand, represent state across instances, which can be much more difficult to unit-test. Statics can't be easily mocked.
  • Since all static fields live in just one space, all threads wishing to use them must go through a slow synchronization control that you have to implement on each of them.
  • Static fields wouldn't be serialized.
  • The tighter the scope of something, the easier it is to reason about. We're good at thinking about small things, but it's hard to reason about the state of a system made from millions of lines of code. Statics tend to produce spaghetti code and don't easily allow refactoring or testing.
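As an aside, the one acceptable kind of static field is a final one, i.e. a constant, because it is immutable (a made-up example):

public class PrintConstants
{
    public static final String OPENING_BRACKET = "[";    // immutable, safe to share
    public static final String CLOSING_BRACKET = "]";
}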

Override Example

The following example shows that code in static methods cannot be overridden.

public class PrintUtil
{
    public static void print(String text)    {
        final String printText = "["+text+"]";
        System.out.println(printText);
    }
}

public class CustomPrintUtil extends PrintUtil
{
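    // same signature as in PrintUtil, but static methods are hidden, not overridden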
    public static void print(String text)    {
        final String printText = ">"+text+"<";
        System.err.println(printText);
    }
}

public class Main
{
    public static void main(String[] arguments)    {
    	PrintUtil printUtil = new CustomPrintUtil();
    	printUtil.print("Hello World");
    }
}

Output of Main.main() is, on System.out:

[Hello World]

Expected was, on System.err:

>Hello World<

The reason is that main() assigned the new object to a variable declared as PrintUtil. Static calls are resolved at compile time against the declared type, not the runtime type. Had the variable been declared as CustomPrintUtil, it would have worked as expected, but - who can control that?
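The object-oriented fix is to remove the static modifier, so that the call is dispatched on the runtime type; a sketch of that variant:

public class PrintUtil
{
    public void print(String text)    {
        System.out.println("["+text+"]");
    }
}

public class CustomPrintUtil extends PrintUtil
{
    @Override
    public void print(String text)    {
        System.err.println(">"+text+"<");
    }
}

Now printUtil.print("Hello World") in Main.main() goes through dynamic dispatch and prints >Hello World< on System.err, no matter which of the two types the variable was declared as.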

Constructor Class Example

Important to understand: constructor calls are static references, although there is no static keyword anywhere!

This example shows that it is impossible to replace the class Line with IndentedLine inside the MultilineText class, because there is a static class reference, i.e. a constructor call with the new operator, that has not been put into an overridable factory method.

Here is the element class Line:

public class Line
{
    private final String text;
	
    public Line(String text)    {
    	this.text = text;
    }
    
    public void output(PrintStream printStream)	{
    	printStream.println(text);
    }
}

It lives inside MultilineText:

public class MultilineText
{
    private final List<Line> lines = new ArrayList<>();
	
    public void addLine(String line)    {
    	lines.add(new Line(line));
    }
    
    public Iterable<Line> lines()	{
    	return Collections.unmodifiableList(lines);
    }
}

There is no static modifier anywhere, but the new Line() constructor call in addLine() must be seen as a static class reference; basically it is the same as a static LineFactory.create() call:

public final class LineFactory
{
    public static Line create(String text) {
        return new Line(text);
    }
}
(This static factory is useless and is here just to illustrate the static nature of constructors.)
....
    public void addLine(String line)    {
    	lines.add(LineFactory.create(line));
    }
....

Here comes the IndentedLine class that we would like to have inside the MultilineText class:

public class IndentedLine extends Line
{
    public IndentedLine(String text)    {
    	super(text);
    }
    
    @Override
    public void output(PrintStream printStream)	{
    	printStream.print("\t");
    	super.output(printStream);
    }
}

The following procedure tries to replace Line inside MultilineText with an anonymous class override:

        MultilineText multilineText = new MultilineText()    {
            @Override
            public void addLine(String line) {
                lines.add(new IndentedLine(line));  // <- compile error ...
                // ... because lines is private!
            }
        };
        multilineText.addLine("Hello World!");
        multilineText.addLine("Goodbye World!");
        
        for (Line line : multilineText.lines())
            line.output(System.out);

The override of the addLine() method causes a compile error because the private lines list is not accessible to sub-classes. If it were protected, sub-classes could access it, but also corrupt it.


How to fix that? Break encapsulation and make the lines list protected? I know a better solution:

public class MultilineText
{
    private final List<Line> lines = new ArrayList<>();
	
    public void addLine(String line)    {
        lines.add(createLine(line));
    }
    
    protected Line createLine(String line)    {
        return new Line(line);
    }
    
    public Iterable<Line> lines()	{
        return Collections.unmodifiableList(lines);
    }
}

We encapsulated the Line constructor into an overridable factory method, protected Line createLine(). Now we can easily override the factory method and produce another type of Line:

        MultilineText multilineText = new MultilineText()    {
            @Override
            public Line createLine(String line) {
                return new IndentedLine(line);
            }
        };
        multilineText.addLine("Hello World!");
        multilineText.addLine("Goodbye World!");
        
        for (Line line : multilineText.lines())
            line.output(System.out);

And we get it indented:

    Hello World!
    Goodbye World!

Mind what a powerful technique factory methods are: we can override and customize not just a single class but a whole class graph by embedding further anonymous overrides!
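Here is a sketch of such a nested customization, reusing the classes from above; it overrides the factory method and, inside it, anonymously overrides output() of IndentedLine once more:

        MultilineText multilineText = new MultilineText()    {
            @Override
            protected Line createLine(String line) {
                return new IndentedLine(line)    {
                    @Override
                    public void output(PrintStream printStream)    {
                        printStream.print(">> ");    // customize the customization
                        super.output(printStream);
                    }
                };
            }
        };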

Conclusion

It is really hard to give a convincing proof that static methods, or constructor calls outside of a factory method, are a bad practice. They are so frequent that everybody uses them.

Historically seen, using static is a technical throwback of 30 years. Code inside static methods is not overridable, and it is only reusable when all mutable state is passed in as parameters, in other words, when it is a "pure function".

The Spring framework tried to solve the problem of statics and globals. You should not use the Spring context as a global static space; rather you should have one context per class graph (component, module). Then apply dependency injection wherever you are tempted to use the static keyword.
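A minimal dependency-injection sketch without any framework (the class ReportService is hypothetical, PrintUtil is meant as the non-static variant from above); the dependency arrives through the constructor instead of through a static call:

public class ReportService
{
    private final PrintUtil printUtil;    // injected, thus replaceable in tests

    public ReportService(PrintUtil printUtil)    {
        this.printUtil = printUtil;
    }

    public void report(String message)    {
        printUtil.print(message);    // instance call, overridable
    }
}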

The object-oriented vision is that we solve problems using object graphs that wrap every new operator or static call into a protected, overridable (non-final) factory method. Such graphs are fully customizable and thus reusable frameworks. One such graph can represent a component (or Java 9 JPMS module), and components can be bound together at runtime over their interfaces using service loading or dependency injection. JPMS modules support service declarations.




Tuesday, January 19, 2021

Thoughts About the Future of Programming and Java

In this Blog I will write about the software development world and where it may be going. In my 30 years of experience I have had several programming languages under my fingers. My work evolved from self-written leap-year functions in C to reusing Java libraries for anything and everything. Will it stay like this, or are there new programming concepts and languages ahead?

Reusable Building Blocks

First let's see where we are coming from. Java builds upon decades of language experience and offers the following capsules for our precious source code:

  1. Methods receive parameters and can return a result built from these parameters and from calls to other methods. Since Java 8, methods can be passed around like functions. Pure functions do not use global variables and thus have no side-effects, provided all called functions are pure too.

    public int add(int a, int b) {
      return a + b;
    }
    

    A function is a reusable atom; you cannot extend it in the sense of inheritance.

  2. Classes are templates for dynamically constructed objects that can be used through their methods. Classes are the context for fields (data) and methods (functions) that work on these fields. A class can contain inheritable inner classes down to any depth. It can import other classes it depends on.

    package my.arithmetics;
    
    public class Addition
    {
      private final int a;
    
      public Addition(int a) {
        this.a = a;
      }
    
      public int calculate(int b) {
        return a + b;
      }
    }
    

    You can extend a class and override methods to customize the class to a new behavior; thus it promotes code reuse.

  3. Modules (available since Java 9) are sets of classes inside package folders, with one or more packages being exported, possibly depending on other modules. Modules are compilation units, but modules that offer services are also separate deployment units.

    module my.arithmetic
    {
        exports my.arithmetic;
        
        requires my.logging;
    }
    

    Modules are more like atoms. They do not allow packages of the same name in other modules (the "split package" constraint). Although you can extend a single exported class from a module, you can't extend (customize) a class graph unless it implements the factory-method pattern consistently (which is rarely the case) and all wrapped graph classes are overridable and exported.

Once applications consisted of fields and methods. The object-oriented idea bound these two together into classes. Now modules are containers of classes. Are there more layers to come? Wouldn't it be possible to have just one construct that covers all these levels, recursively, so that we won't need hypermodules any more?

To get a better feeling for possible future constructs, let's have a look at programming paradigms.

Programming Paradigms

Programming paradigms (like domain-specific languages) reflect real-world conditions and determine the way we solve problems. Here are some paradigms that impressed me:

  1. Object-oriented:
    Classes are customizable compilation units, access modifiers reduce complexity, applications are graphs of dynamically built objects; OO languages deliver big, well-maintainable applications when strongly typed.

  2. Functional, Function-level:
    No variables, just constants; functions, unlike procedures, return values and can be passed around like objects; functional languages have proved to deliver the most failsafe implementations.

  3. Generic:
    Algorithms do not specify the types on which they operate, so that they can be reused for various types of data; the archetype of a framework (see the sketch after this list).

  4. Aspect-oriented:
    Global pieces of logic (crosscutting concerns, advices) to be executed at different places (join-points, defined by pointcuts) of an already existing application, like access control, logging or transaction management; source code would get redundant and cluttered if join-points implemented these aspects by themselves.

  5. Adaptive:
    Following the "Law of Demeter", implementation units don't talk to strangers, just to their immediate friends; a way of complexity reduction through strong encapsulation, has now merged into aspect-oriented programming.

  6. Automata-based:
    Every computer application is a state-determined automaton, thus it should be implemented in terms of states, events, transitions and actions.

  7. Parallel programming:
    Modern computers with several processors can be used efficiently only when the software enables that.

  8. Literate:
    Donald Knuth's convincing idea of having source code embedded into documentation, not the other way round.
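As an illustration of the generic paradigm (item 3), a minimal Java sketch of a type-agnostic algorithm; the class name Maximum is made up:

    import java.util.List;

    public final class Maximum
    {
        public static <T extends Comparable<T>> T of(List<T> values)    {
            T result = values.get(0);
            for (T value : values)
                if (value.compareTo(result) > 0)
                    result = value;    // works for any Comparable element type
            return result;
        }
    }

The same code serves List<Integer>, List<String> and any other Comparable element type.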

Design Patterns

One way to solve programming problems was called "Design Patterns", published in the Nineties for C++ by the "Gang of Four". They still play a role because they do not depend on a specific language, although often being object-oriented. Somehow they relate to programming paradigms, especially to generic programming. It's not possible to provide superclasses or frameworks for design patterns.

Design patterns didn't get very popular because they require advanced programming skills, and developers love to create their own solutions and patterns. Moreover some of them are so close to each other that it is hard to understand the differences, compare Builder and Abstract Factory. Nevertheless, studying design patterns broadens the horizon.

Now let's have a look at the current Tower of Babel.


The Tower of Java

Java was designed to be an object-oriented language, but by now it supports a lot of paradigms and also some patterns:

  • functional extensions provide lambdas,
  • streams provide parallelism and dataflows,
  • annotations can be used to carry aspects,
  • generics allow classes and methods to abstract the types they work with,
  • access modifiers and the new modules provide "adaptive" complexity reduction,
  • Proxy allows implementing interfaces generically,
  • the Memento pattern is coming as Java 14 record (immutable data-class),
  • and we should do literate programming by extensively writing JavaDoc.
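A compact sketch touching three of these features, lambdas, parallel streams and the record data-class (all names made up):

    import java.util.List;

    public class FeatureSketch
    {
        record Point(int x, int y) {}    // immutable data-class, previewed in Java 14

        public static void main(String[] arguments)    {
            List<Point> points = List.of(new Point(1, 2), new Point(3, 4));
            int sum = points.parallelStream()                // parallelism
                    .mapToInt(p -> p.x() + p.y())            // lambda
                    .sum();
            System.out.println(sum);    // prints 10
        }
    }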

Last but not least, the biggest feature: Java apps can be run on any platform that provides a Java virtual machine.

So, what's the point, why do I bother about the future of programming and Java?

Java was a really simple language back in 1998 when the JRE had 20 MB. Although lacking most of the above features, Java source code was well readable and robust, much better than C and C++, and thus maintenance-friendly. Nowadays, due to the many features added, Java has become quite a complex language, and the paradigms flowing in have made it an expert realm. Let me give examples.


There are Java libraries that use reflective techniques where you can write code that seems to make no sense to the average programmer's eye, like Mockito unit test code:

Iterator<String> iterator = Mockito.mock(Iterator.class);
Mockito.when(iterator.next()).thenReturn("Mockito");

This code, technically, cannot be understood just with knowledge about object-oriented languages. You need knowledge about the Mockito library and Java proxying techniques; in other words, you possibly will have to read complex documentation to be able to maintain such code. Of course you can say "just read the words and believe them" (intuitive programming?), but developers are not paid to be believers, and they do not read code, they scan it with the language syntax in mind.

What we see here is called "stubbing". It has nothing to do with DLL stubs. In the context of the mockist style of test-driven development, "stubbing" means "teaching behavior". The static Mockito.when() method teaches the Mockito-generated Iterator to return "Mockito" on the first call of next().


Spring, also a Java library, and especially Spring Boot, are on another level. The latter not just uses inversion of control (IOC), it completely takes over the program flow and puts it into annotations. In other words, you cannot understand a Spring Boot application without knowing the semantics of Spring annotations. Spring Boot is almost irresistible, because it leads Java towards cloud computing.

What was that about "open source"? We are allowed to read it. But can we also understand it?

@Configuration
public class OutputStdoutConfiguration
{
    @Bean
    @ConditionalOnProperty(name = "output", havingValue = "stdout", matchIfMissing = true)
    public PrintStream outputStream()   {
        return System.out;
    }
}

@Configuration
public class OutputStderrConfiguration
{
    @Bean
    @ConditionalOnProperty(name = "output", havingValue = "stderr")
    public PrintStream outputStream()   {
        return System.err;
    }
}

@Service
public class OutputServiceImpl implements OutputService
{
    @Inject
    @Qualifier("outputStream")
    private PrintStream stream;
    
    @Override
    public void println(String line) {
        stream.println(line);
    }
}

What this Spring source is for: you can configure either stderr or stdout as output stream through an application-[profileName].properties file containing either output=stdout or output=stderr, or neither. Logic packed into annotations and conventions, spiced with magic-string programming; 50% of the lines are annotations.

Spring started as an IOC container, which is a concept opposite to the object-oriented idea, but useful for customer-specific configuration. Today Spring is a big developer movement providing functionality for nearly everything, focusing on metaprogramming, whereby IOC's dependency injection (DI) is just the base technique for integrating that functionality. I call Spring "Java for superheroes" :-)

How DI works: the Java compiler checks the correct usage of Java interfaces but doesn't require an implementation behind them. When a class uses another class through its interface, DI can fulfill that interface with one of its implementations at runtime (loose coupling) via reflection. This is used for runtime-determined configuration. The application will be compile-clean due to the interfaces, but it won't function unless deployment puts the configured interface implementation somehow onto the CLASSPATH.
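The JDK's own ServiceLoader (with this fluent API since Java 9) draws the same borderline without Spring; a sketch reusing the OutputService interface from above:

import java.util.ServiceLoader;

public class OutputMain
{
    public static void main(String[] arguments)    {
        // compiled against the interface only; an implementation is looked up
        // at runtime via a META-INF/services entry on the CLASSPATH
        OutputService service = ServiceLoader.load(OutputService.class)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("no OutputService deployed"));
        service.println("Hello loose coupling");
    }
}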

Unfortunately Spring is not a deployment tool. As script languages are runtime-determined only, I could say that Spring turns Java into a script language: compilation always succeeds, startup often fails. Smalltalk, one of the great models for Java, died of that disease.

Global variables: one of the oldest problems of the programming world, causing annoying, hard-to-find errors. Programmers love them, because they don't need to define a parameter when it is global. Object-oriented languages reduce globals to class scope. Nevertheless, in Java, globals can be defined through public static fields and methods inside classes.
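What such a Java global looks like (a made-up example); any code anywhere can read and write it, and nothing coordinates concurrent access:

public final class Globals
{
    public static int requestCount;    // mutable global state, shared by all threads
}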

A Spring application context is a container for singleton objects, and an instance factory for "prototype" objects. Although Spring contexts are not singletons by themselves, I have never seen an application that uses several contexts. The Spring context is always used as a static singleton "programmer's heaven", with all the consequences of globality coming back.

But we can't blame Spring. It's not the tool, it's us that are the fools. We would need fool-proof tools.


Oracle promotes a "polyglot" language environment called GraalVM, replacing the Java VM and supporting the following languages:

  • Java, Kotlin, Scala (JVM languages)
  • JavaScript (not yet ES6)
  • Python
  • Ruby
  • R
  • WebAssembly (machine code that can run in modern web-browsers instead of JS, compiled from C, C++, Rust, Go, ...)
  • LLVM Bitcode (machine code, compiled from many programming languages, can be converted to many processor instruction-sets and even WebAssembly)

Let's see what overwhelming freedom causes. Most likely every developer will want to distinguish himself through his own programming language. What is a dream for developers can be a nightmare for software maintenance. Be sure that this won't reduce the complexity of your project.


All these things make me call information technology the Tower of Babel. The Java Tower is already crumbling, because a tower with too much freedom will come down one day; that's what Murphy taught us.

Compilation versus Deployment

The compiler is the tool to tackle complexity. It can check thousands of functions, classes and modules for whether they fit to each other. But it won't go beyond interface boundaries, because it is impossible to check what only deployment will decide. In times of component-based software development we need this runtime-determined borderline.

Speaking in JPMS modules, this line is drawn through services; speaking in Spring, the line is drawn wherever Spring beans are used. Once again we have this ambiguity that makes up the cracks in the Java Tower.

Library Dependencies

It's not the language, it's the libraries that make us great. We do not implement quicksort and search-replace by hand any more. The many open-source libraries are one of the reasons why the Java Tower grew so high.

The consequence is a big hierarchy of library dependencies that tends to get out of control due to their maintenance versions. It is the proverbial DLL hell; LINUX has its own variant of the same problem:

  1. External library X version 1 contains class MyX, but in X version 2 it was renamed to OurX.
  2. You use external libraries A and B; A uses X version 1, but B uses X version 2.
  3. You end up having two different X versions on your CLASSPATH, and which one gets loaded at runtime is not predictable.

The deployment tool Maven does its best to keep Java dependencies under control, but some problems cannot be solved. This is the reason why Java introduced modules. The Java 9 runtime library (JRE) has been refactored into modules; Java libraries around the world are following slowly, where maintainers are still present. Applications around the world wait for tools that make their modular life simpler. Modules are a wonderful stabilizing support for the Java Tower, but they come a little late.

In the future, Java dependencies will be defined in the following places, most likely redundantly and contradictorily:

  • Import statements in classes
  • META-INF/MANIFEST.MF
  • Spring XML application contexts (although deprecated, they are still everywhere)
  • OSGI plugin bundle descriptions
  • Maven project-object-models
  • JPMS module descriptors

I can't say whether this is the head or the basement of the tower, but it is crumbling anyway.

Platform Is Not Platform

Ten years ago the word "platform" designated operating systems, i.e. WINDOWS, MAC and several UNIX systems like LINUX. Nowadays smartphones have taken their place in our lives. None of them supports Java applications. Although Android allows programming in the Java language, its class file format and virtual machine are different from original Java. You may be able to compile your Java business logic for Android, but you will have to rewrite any user interface based on AWT, Swing or SWT.

This is the reason why many web pages complain about the broken Java promise to be platform-independent: the meaning of "platform" has changed since that promise.

Conclusion

Providing several programming paradigms in a language leads to different ways in which developers solve problems. Staying purely object-oriented leads to the introduction of domain-specific languages that can solve problems more elegantly. Mastering several programming languages at the same time requires advanced skills because of the different language grammars, therefore I would prefer just one language representing several paradigms. But this question always seems to come down to a few lines of code less. Of course shortness is a criterion, fewer lines of code mean fewer bugs, but a little boilerplate doesn't hurt anybody and may even increase readability.

I don't believe in something like "intuitive programming". A programming language must provide a precise and restrictive syntax to avoid all those human mistakes that happen during development, along with an unambiguous common sense of how problem solutions should look. There should be just one way to do it; freedom is inappropriate when it comes to languages. What counts is:

  • readability (more words, less symbols)
  • separation of concerns (decomposition facilities like inheritance and aspects)
  • error proofing (more immutability)
  • complexity reduction (encapsulation, access control)
  • portability into other languages (simplicity)

Although the Java Tower is crumbling, I don't see anything better at the moment. Kotlin provides a few lines of code less and a trendy syntax. Only the aspect-oriented idea bears innovative power; for the time being it is a language that sits on top of another language. Would it be possible to compose an application from aspects only?




Friday, January 8, 2021

A Subtle Problem with AWK Pipe Statements

When using my video scripts I came across a subtle problem with AWK pipe statements. If you forget to close a pipe command, the next one may deliver wrong results under certain circumstances (which is what makes this problem subtle).

This is about GNU Awk 5.0.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.2.0) on Ubuntu 20.04.1 with LINUX 5.4.0-59.

AWK is an ancestor of Perl and a great tool for quick data interpretation, better than Perl because simpler. But, like most script languages, it has its peculiarities that may cause undetected bugs. In my case a video length was reported to be too short to contain a given timestamp, which was not true, so I had to check the responsible AWK script.

AWK Pipe Statement

Example:

file = "a.txt"
sizeCommand = "stat --printf='%s' " file
sizeCommand | getline size
print "Size of " file " is " size

This code fetches the size of the file a.txt through an external command that is piped into getline to read the first line of its output.

The GNU documentation puts a close() immediately after the pipe statement. So the correct form would be:

....
sizeCommand | getline size
close(sizeCommand)
....

→ In case you forget to close(), you may experience strange results!

Problem Reproduction

Here is a reproduction of what I encountered. You need two text files:

  1. a.txt containing the single character 'a' (size = 1), and
  2. ab.txt containing 'ab' (size = 2).

Then put the following AWK script pipe-statement-problem.awk into the same directory and make it executable:

#!/usr/bin/awk -f

BEGIN {
  files[0] = "a.txt"
  files[1] = "ab.txt"
  files[2] = "a.txt"
  
  sizes[0] = 1
  sizes[1] = 2
  sizes[2] = 1
  
  for (i in files) {
     externalCommand = "stat --printf='%s' "  files[i]
     externalCommand | getline size
     print "size of " files[i] " = " size
     if (size != sizes[i])
       print "ERROR in size of " files[i] ", should be " sizes[i]
  }
}

The shebang #!/usr/bin/awk -f in the first line tells the UNIX shell to use /usr/bin/awk for execution.

As no data are processed by this script, everything happens in the BEGIN rule that is executed on script start.

The script builds two arrays, one for file names and one for the expected sizes of these files.

The for-loop opens all files in the array, which are a.txt, ab.txt, and again a.txt, and fetches their sizes. An ERROR message is printed if the size doesn't match what was expected.
This script doesn't make any practical sense, but it is something like a unit test for the pipe-statement.

Output is ('$' is the UNIX command prompt):

$ pipe-statement-problem.awk
size of a.txt = 1
size of ab.txt = 2
size of a.txt = 2
ERROR in size of a.txt, should be 1

The error happens when the pipe for file a.txt is executed once again. For some reason AWK then delivers the size of file ab.txt, which is the file that preceded this pipe statement.

Fix: Close Any Pipe!

The bug can be fixed by inserting a close() immediately after the pipe statement (see the comment in the script below):

#!/usr/bin/awk -f

BEGIN {
  files[0] = "a.txt"
  files[1] = "ab.txt"
  files[2] = "a.txt"
  
  sizes[0] = 1
  sizes[1] = 2
  sizes[2] = 1
  
  for (i in files) {
     externalCommand = "stat --printf='%s' "  files[i]
     externalCommand | getline size
     close(externalCommand)   # MUST close any of such statements!
     print "size of " files[i] " = " size
     if (size != sizes[i])
       print "ERROR in size of " files[i] ", should be " sizes[i]
  }
}

Running the fixed script you see:

$ pipe-statement-problem.awk
size of a.txt = 1
size of ab.txt = 2
size of a.txt = 1

This is the right output. Now the size of file a.txt has been read correctly.

Conclusion

When I got to know the AWK pipe statement, I didn't even know that you can (or must) close it. The resulting problems may stay undetected for a long time, because there is no warning and no error message; simply the result of getline is wrong. I didn't find out why the result is always that of the preceding pipe, nor why it happens only when repeating a pipe that was already executed once.




Tuesday, January 5, 2021

Crossfade Transition Between Two Videos with ffmpeg

You may have noticed that video effect where one scene slowly fades out while the next fades in, looking like they grow into each other. The open-source tool ffmpeg can generate such crossfade transitions (I used version 4.2.4 for LINUX). Creating such a transition is very slow: because it works on pixel level, ffmpeg needs to demux and mux everything. Mind that a new filter called xfade is soon to come.

In this Blog I will present a UNIX shell script that joins two videos using a crossfade transition, and I will try to explain the filter_complex language a little. ffmpeg is not a graphical video editor; you need to cope with complex command lines.

Transition Nature

A crossfade transition makes the result video shorter than the sum of both video parts, because the fade-out of the first video will be overlapped with the fade-in of the second video. So, if both videos last 5 seconds, and the transition was set to 2 seconds, the result video will be 5 + 5 - 2 = 8 seconds, not 10 seconds.

Shell Script

Below is the complete source code of my shell script joining two videos using a crossfade transition.

# default definitions

fadeSeconds=2    # number of seconds the transition will last
outputVideo=output.MP4    # file path of the result video

syntax() {
    echo "SYNTAX: $0 firstVideo.MP4 secondVideo.MP4 [outputVideoFilename [fadeSeconds]]" >&2
    echo "    Concatenates firstVideo.MP4 and secondVideo.MP4" >&2
    echo "    with a crossfade-transition of $fadeSeconds seconds to file $outputVideo" >&2
    exit 1
}

# evaluate command arguments

[ -z "$1" -o ! -f "$1" -o -z "$2" -o ! -f "$2" ] && {
    echo "ERROR: make sure both input files exist: first=$1, second=$2" >&2
    syntax
}
inputVideo1=$1
inputVideo2=$2

[ -n "$3" ] && outputVideo=$3

[ -n "$4" ] && {    # test for number
    case $4 in
    \.*|*[!0-9\.]*|*\.*\.*)    # leading dot, non-digit characters, or more than one dot
        echo "ERROR: not a number of seconds: $4" >&2
        syntax
        ;;
    *)    # is number with optional dot
        fadeSeconds=$4
        ;;
    esac
}

# analyze input videos

streamProperty()    {    # $1 = property to display, $2 = stream identifier, $3 = video file
    ffprobe -v error -select_streams $2 -show_entries stream=$1 -of default=noprint_wrappers=1:nokey=1 $3
}

pixelFormat1=`streamProperty pix_fmt v:0 \$inputVideo1`
duration1=`streamProperty duration v:0 \$inputVideo1`
width1=`streamProperty width v:0 \$inputVideo1`
height1=`streamProperty height v:0 \$inputVideo1`
widthXHeight1=${width1}x${height1}

pixelFormat2=`streamProperty pix_fmt v:0 \$inputVideo2`
duration2=`streamProperty duration v:0 \$inputVideo2`
width2=`streamProperty width v:0 \$inputVideo2`
height2=`streamProperty height v:0 \$inputVideo2`
widthXHeight2=${width2}x${height2}

# exit if fade time is bigger than one of the video durations
validateFadeSeconds()    {
    echo "$1" | awk '{ if ($1 > '$fadeSeconds') print "true"; else print "false" }'
}
[ `validateFadeSeconds \$duration1` != "true" -o `validateFadeSeconds \$duration2` != "true" ] && {
    echo "Both video durations ($duration1, $duration2) must be bigger than fade time ($fadeSeconds)!" >&2
    exit 2
}

# exit if videos are of different format
[ "$pixelFormat1" != "$pixelFormat2" -o "$widthXHeight1" != "$widthXHeight2" ] && {
    echo "Videos can not be combined, pixelformats: $pixelFormat1 - $pixelFormat2, dimensions: $widthXHeight1 - $widthXHeight2" >&2
    exit 3
}
echo "widthXHeight = $widthXHeight1, pixelFormat = $pixelFormat1"

# calculate the time when fade-out of first video starts
startFadeOut=`echo \$duration1 | awk '{ print $1 - '\$fadeSeconds' }'`
echo "fadeSeconds = $fadeSeconds, duration1 = $duration1, startFadeOut = $startFadeOut, duration2 = $duration2"

# join videos
ffmpeg -v error -y \
    -i $inputVideo1 -i $inputVideo2 \
    -filter_complex "\
        [0:v] fade=t=out:st=$startFadeOut:d=$fadeSeconds:alpha=1 [video1];\
        [1:v] fade=t=in: st=0:            d=$fadeSeconds:alpha=1, setpts=PTS-STARTPTS+$startFadeOut/TB [video2];\
        [video1][video2] overlay [resultVideo];\
        [0:a][1:a] acrossfade=d=$fadeSeconds:overlap=1 [resultAudio]" \
    -map "[resultVideo]" \
    -map "[resultAudio]" \
    $outputVideo || exit $?
    
echo "Created file $outputVideo"

I won't explain the whole script, because most parts are just about argument checking and avoiding usage errors, like all software needs to have. Rather I want to focus on the ffmpeg -filter_complex command that creates the transition, and on how to read the contained filtergraph. That final part of the script is repeated below, with line numbers.

ffmpeg Command

Here is the part that I want to document. The lines of the command are concatenated through the trailing backslash ("\"), which is the newline escape for the UNIX shell. The line breaks are necessary to keep ffmpeg commands readable, as are the blanks inside the filter_complex filtergraph. When executed, the whole command is expanded into one single line, and the $variables are substituted into it.

75  ffmpeg -v error -y \
76      -i $inputVideo1 -i $inputVideo2 \
77      -filter_complex "\
78          [0:v] fade=t=out:st=$startFadeOut:d=$fadeSeconds:alpha=1 [video1];\
79          [1:v] fade=t=in: st=0:            d=$fadeSeconds:alpha=1, setpts=PTS-STARTPTS+$startFadeOut/TB [video2];\
80          [video1][video2] overlay [resultVideo];\
81          [0:a][1:a] acrossfade=d=$fadeSeconds:overlap=1 [resultAudio]" \
82      -map "[resultVideo]" \
83      -map "[resultAudio]" \
84      $outputVideo || exit $?

Line 75 starts ffmpeg with some common options: -v error reduces the logging output to errors, -y makes ffmpeg overwrite any existing output file without interactive confirmation.

Line 76: the -i options list the two input video files.

Line 77 opens a filtergraph specification with the -filter_complex option. The graph is enclosed in double quotes because it may contain shell meta-characters. Nevertheless, $variables are still substituted here by the shell.

Line 78 references the 1st video-stream inside $inputVideo1 as [0:v]. Read [0:v] as "stream from first (0) input file, of type video (v)"; this is called a stream specifier. The stream is pushed into a filter called "fade"; after the "=" its parameters follow:
The "t" parameter is the fade's "type", in this case a fade-out.
The "st" parameter gives the "starttime" when to apply the filter.
The "d" parameter gives the "duration" of the fade-out.
The "alpha" parameter "1" tells the video to fade just the transparency (= alpha channel).
Finally the filtered stream is named [video1], which is an arbitrary internal label.

Line 79 references the 1st video-stream inside $inputVideo2 as [1:v]. It does the same as line 78, but sets the fade-type to "in", starting at the beginning ("0"). After the "," the "setpts" filter takes over. It shifts the timestamps of all frames relative to the beginning of the fade-out of the first video.
Inside this calculation, "PTS" is the presentation timestamp of each frame, "STARTPTS" is the presentation timestamp of the video's first frame, and "TB" is the time base that converts seconds into timestamp units.
The resulting stream finally is named [video2].

Line 80 takes the streams [video1] and [video2] and combines them using the "overlay" filter. The result is called [resultVideo]. The video part is ready now for mapping into the output file, but audio is missing.

Line 81 references the 1st audio-stream inside $inputVideo1 as [0:a] and the one inside $inputVideo2 as [1:a]. It joins them using the "acrossfade" filter ("a" for audio).
The duration of the crossfade is given in the "d" parameter.
The "overlap" (or "o") parameter says that the streams should overlap. This is not really necessary, because overlap is the default, but defaults change sometimes.
The audio result is labeled [resultAudio].

Line 82 and
Line 83 map the streams labeled [resultVideo] and [resultAudio] into the output file, in that order, meaning video will be the first stream in output and audio the second.

Line 84 names the output file. If the whole ffmpeg command fails, the script exits with the exit code of ffmpeg (due to the "||" operator, whose right side is executed only when the preceding command fails).

Conclusion

ffmpeg bears all the hallmarks of hackware, but is surprisingly comprehensive and flexible. What is missing is a use-case-oriented documentation. The filtergraph specifications are really hard to read, thus errors inside them are difficult to find. It took me two days to get into this again, two months after my last Blogs about ffmpeg, and to find out how to generate crossfade transitions including audio.

There are lots of forum entries about simple video manipulations, but for transitions I was more or less left alone with the tool documentation. Also I noticed a kind of "garbage symptom" that you often find in CSS forum entries too: developers deliver lots of code in their examples that is actually not needed, but you must understand that garbage to find out whether it is meaningless or not. In the case of ffmpeg this really takes time.

I won't use crossfade transitions for my private video production, because that conversion takes too much time. I turned to ffmpeg because I wanted to see video cut results quickly. For any other case, OpenShot is a sufficient graphical video editor.




Monday, January 4, 2021

Accelerate LINUX Boot by Disabling Unneeded Services

My Ubuntu 20.04 LINUX takes about 70 seconds to boot on my laptop with four 1.70 GHz cores and 8 GB memory. This is much slower than WINDOWS, so I wondered whether I could make it faster. On the web I found some useful commands. (I summarize them here in case I need them again - that's what Blogs are for !-)

Finding out how much time every service startup took:

systemd-analyze blame

Parts of the output:

34.526s postgresql@9.3-main.service
34.086s snapd.service
33.170s postgresql@12-main.service
33.078s postgresql@9.5-main.service
33.024s postgresql@10-main.service
24.547s docker.service
....

The PostgreSQL database needed 34 seconds startup time, Snap 34, Docker 25. Docker and Snap are deployment tools that I don't use; PostgreSQL I use only rarely, for testing JPA functionality. Although these startups are done in the background, why do them on every boot?

Displaying all services and their status:

service --status-all

This gives you a list of all service names, either running [ + ] or stopped [ - ]. These names you can use to manage the services:

 ....
 [ + ]  docker
 ....
 [ + ]  postgresql
 ....

Surprisingly, Snap was not in this list... looks like it is not so easy to get rid of. You can list the applications that depend on Snap using this command:

snap list

Chromium, Gimp and Ksnip may be in this list, so better not uninstall it!

Here are the commands to stop and disable a service (example for docker):

sudo systemctl stop docker
sudo systemctl disable docker

Other disable-candidates would be cups (a print service, in case you don't use printers) and brltty ("Braille teletype", console support for blind people), but I couldn't find their startup times in the output of systemd-analyze blame, so it may not matter.

If I ever need PostgreSQL again, I can launch these commands:

sudo systemctl enable postgresql
sudo systemctl start postgresql

Query the status of a specific service (example for snapd):

service snapd status