fritzthecat-blog: August 2016

Sonntag, 21. August 2016

Monads in Java

All the Internet agrees: it's quite hard to explain what a monad is. From reading all their attempts I finally could understand - that it's hard to explain :-)

This Blog is my attempt to explain what a monad is. I am a software developer, not a mathematician. I apply some mechanism, and if it is useful, I keep it. Monads, in context of multi-processor computers, are a useful mechanism.

You can find good examples for monad usages in the Java Lambda Tutorial. For understanding lambdas, read also my latest Blog.

What Is a Monad?

Expressed in object-oriented terms, a monad is the decoupling of object and method (as "function").
You do not call someObject.someMethod() directly any more, you tell the monad to do that.
This is beyond object-oriented thinking.

Why Monads?

Monads can

help to avoid method-calls on objects that are null, and
provide implicit loops, redistributable onto multiple processors.

Moreover they enable us to program in a Fluent Interface style. Other authors call this "pipeline", or "programmed semicolons".

Null Object

Ever heard about the "Billion Dollar Bug"? In old times I saw "segmentation violation" when the application crashed, nowadays in Java it is NullPointerException, and it doesn't crash any more. I am talking about calling a method on an object that is null, meaning it does not exist. Robot stepping into the void.

What's wrong when that happens? Is it developers fault? Developers write source code, hopefully checked by a compiler. Is it data fault? Data are known at runtime only, and thus can not be checked by any compiler. The source code needs to avoid data faults. So it is developer fault anyway, although there are discussions whether this really is a Billion-Dollar Bug.

The Java Optional monad gives us relief. This is a new class in java.util. Kind of null-object pattern.

Collections

Modern hardware architecture with multiple CPUs makes it necessary to rewrite our software to effectively use the hardware. The essential thing is to perform loops in parallel threads. Thus we need a new kind of loops where we do not write for, for-in, while, do-while statements explicitly any more. Only a wrapped, implicit loop can be redistributed onto multiple threads automatically.

Instead of writing

while (iterator.hasNext())
  processElement(iterator.next())

we would need a "programmable" loop-method like

collection.forEach(element -> processElement(element))

The Java Stream monad gives us relief. This is a new interface in java.util.stream, supported by Java's collection-framework.

How Do Monads Work?

A monad is a wrapper over a given object. That given object also could be a collection of objects.

Wrapping 0-1 Object

In Java, there is always a factory for monad objects. This is called the monad's "unit":

Person person = ....; // could be null
Optional<Person> optionalPerson = Optional.ofNullable(person);

Optional is a class that holds a reference to a wrapped object or null. Optional.ofNullable() and Optional.of() are static factories. After the object was wrapped into its monad, operations on it should be done through that monad. This is then called the monad's "bind":

Optional<Address> optionalAddress = optionalPerson.map(person -> person.getAddress());

The return of map() would be a new monad, and you can call further functions on it. Here we get out an Optional wrapping the person's address, or Optional.EMPTY when one of person or address has been null. Mind that this is not a good strategy when address is a required field, the application should detect such errors.

A monad could provide different bindings. Stereotype bindings, seen on lots of monads, are:

map(function)
flatMap(function)
calls a function on the wrapped object and returns a new monad built from its return,

filter(predicate)
returns an empty monad when the wrapped object does not conform to the condition given by predicate,

get()
returns the wrapped value, or throws an exception when null,

So let's navigate through Person down to city without NullPointerException (see also source code on bottom):

Person testPerson = ...; // may be null!
String city = Optional.ofNullable(testPerson)
    .map(person -> person.getAddress())
    .map(address -> address.getCity())
    .orElse("No city!");

The orElse() method returns the wrapped object in case it is not empty, else it returns the given default value "No city!".

The same without monad would look like this:

Person testPerson = ...; // may be null!
final String DEFAULT_CITY = "No city";
String city = (testPerson == null) ? DEFAULT_CITY :
    (testPerson.getAddress() == null) ? DEFAULT_CITY :
    (testPerson.getAddress().getCity() == null) ? DEFAULT_CITY :
    testPerson.getAddress().getCity();

The monad variant wins with 10 points advance (of 100:-)

`map()` versus `flatMap()`

While map() produces a new monad to return, flatMap() requires the given function to do this. Further flatMap() does NOT allow null to be returned from the given function. Here are their implementations from JDK class Optional:

    public<U> Optional<U> map(Function<? super T, ? extends U> mapper) {
        Objects.requireNonNull(mapper);
        if ( ! isPresent() )
            return empty();
        else {
            return Optional.ofNullable(mapper.apply(value));
        }
    }

    public<U> Optional<U> flatMap(Function<? super T, Optional<U>> mapper) {
        Objects.requireNonNull(mapper);
        if ( ! isPresent() )
            return empty();
        else {
            return Objects.requireNonNull(mapper.apply(value));
        }
    }

A little code-duplication, but the first and last lines are different:-)

Wrapping 0-n Objects

Collections can be processed in parallel by wrapping them into monads. The interface Collection provides factories as "default methods", directly inside the interface (this is a new Java 8 feature), to create a Stream monad.

Collection<Person> persons = getPersons(); // must not return null!
Stream personStream = persons.stream();

The stream() method is the factory used here. The resulting monad provides lots of bindings you need for list processing.

String [] cities = personStream
    .filter(person -> person.getName().startsWith("J"))
    .sorted((person1, person2) -> person1.getName().compareTo(person2.getName()))
    .map(person -> person.getAddress().getCity())
    .toArray(length -> new String[length]);

This creates a stream of Person objects from a collection, then filters out all persons whose name doesn't start with "J", then sorts it by the person's name, then converts it to a String stream of city names, and finally produces an array of city-names.
Mind that this is not safe against getAddress() nulls, and there could be getCity() nulls in the resulting array!

The Stream monad has two types of operations:

intermediate, produces a stream again, suitable for method-chaining
terminal, closes the streaming, no method can be chained any more.

Here are some intermediates:

map(function) flatMap(function)
like described above

filter(predicate)
returns a monad-stream containing only those elements that conform to given predicate (condition)

sorted(comparator)
returns a monad-stream containing the elements in a sort order produced by given comparator

distinct()
returns a stream that does not contain duplicates any more

And here are some terminals:

reduce(initial, accumulator)
computes a result object from given initial value and what the accumulator returns from being called on each element

forEach(consumer)
loops the stream with given element-consumer

count()
returns the number of elements in the stream

toArray()
returns an array of the objects in the stream, optionally typed when you give an IntFunction lambda

Parallel Processing?

This is what it is all about. Quite easy to do now after wrapping loops:

getPersons().parallelStream()
    ....
// or:
personStream.parallel()
    ....

When you want to process multi-threaded, you need to make sure that the mapper-functions do not have side-effects.

Example Source

Click here to see an example class you can copy & paste for exploring monads.

public class MonadsExample
{
    public static void main(String[] args) {
        Person testPerson = null;
        //Person testPerson = new Person("Joe", new Address("Tokio"));
        
        String city = Optional.ofNullable(testPerson)
            .map(person -> person.getAddress())
            .map(address -> address.getCity())
            .orElseGet(() -> "No city!");
        
        System.err.println("city = "+city);
        
        
        Collection<Person> persons = getPersons();
        String [] cities = persons.stream()
            .filter(person -> person.getName().startsWith("J"))
            .sorted((person1, person2) -> person1.getName().compareTo(person2.getName()))
            .map(person -> person.getAddress().getCity())
            .toArray(length -> new String[length]);
        
        System.err.println("cities = "+Arrays.toString(cities));
    }
    
    
    private static Collection<Person> getPersons()    {
        final Collection<Person> names = new ArrayList<>();
        names.add(new Person("Tim", new Address("Teamtown")));
        names.add(new Person("Jack", new Address("Backtown")));
        names.add(new Person("Joe", new Address("Downtown")));
        return names;
    }
    
    
    private static class Person
    {
        private final String name;
        private final Address address;
        
        public Person(String name, Address address) {
            this.name = name;
            this.address = address;
        }
        
        public String getName() {
            return name;
        }
        public Address getAddress() {
            return address;
        }
    }
    
    private static class Address
    {
        private final String city;
        
        public Address(String city) {
            this.city = city;
        }
        
        public String getCity() {
            return city;
        }
    }
    
}

The output of it is

city = No city!
cities = [Backtown, Downtown]

Summary

So do we have to implement Java applications now in monadic style?

We should just rewrite those loops in our applications to Stream that really are performance bottlenecks. We can use lambdas wherever they are supported, but keep in mind that such source code is not reusable. The Optional (null-object-pattern) surely also makes sense in many cases.

Wouldn't it be better to change to some real functional programming language?

If there would be one that was programmer-friendly enough, yes. But I know none. Functional source code tends to be unreadable. Given the fact that a big part of the programming world works in languages like Python (which is a dynamically typed script language), I would say we can stay on Java another decade :-)

Freitag, 19. August 2016

Lambdas or Closures in Java

The heart of our computers are the "central processing units" (CPU). These grew faster and faster in the passed decades, but the limit seems to be reached. Nowaday's computers have multiple processors that work in parallel to improve performance. But what about all the existing software? Is it able to distribute the work among multiple processors?

Unfortunately no. Traditionally written applications are not able to efficiently use multiple processors. Their loops, which are mostly responsible for slow performance, run in a single thread. Work can be distributed among several CPUs only when the application explicitly runs multiple threads, optimally as many threads as there are CPUs.

But how can we run a loop simultaneously in multiple threads?
By using the new functional-programming features in Java 8, called "lambda" and "monad".

This Blog is about lambdas, next one will be about monads. But I won't write about parallel programming in either of them. There is also a very good introduction into lambdas on the Java tutorial page.

What Is a Lambda?

The most understandable translation I found is "anonymous function". That means, it is a function without a name. But, optionally, with parameters. And a context.

Originally, Java lambdas should have been called "closures", but finally the vendors decided for the more abstract term.

The term "continuation" is also very close, but this is more general. A continuation is something that will be executed later, be it a named class-method or a closure.

Nevertheless this explanation does not make sense for Java, because ....

We Don't Have Functions in OO

Functions do not exist in an object-oriented language. Just methods. Methods are bound to classes, and they work on objects which are instantiations of classes. A method would be a function if you could pass it around as parameter to other methods or constructors.

Can we work around this?
Yes, and that's the way how functions were implemented in Java 8:

create an interface,
put the method signature into it,
make some class implement the interface,
instantiate the class

That way you can pass the method around, packed into the object of the class implementing the interface.
Here is an example of a pre-Java-8 closure (or lambda):

        Collections.sort(
            personList(),
            new Comparator<Person>() {
                public int compare(Person person1, Person person2) {
                    return person1.getName().compareTo(person2.getName());
                }
            }
        );

The method Collections.sort() is a static implementation that sorts a List given as 1st parameter, using the Comparator object given as 2nd parameter.

Let's say the personList() method produces a list of Person objects (find source code on bottom).

The anonymous Comparator implementation makes it possible to pass the compare() method to Collections.sort() like a function.

In JavaScript it is different. Here no classes exist, and you can give a function call any context you wish. You can pass functions as parameters simply by using their name, without parentheses.

But don't believe that JS is a functional language. It is not. JS (before EcmaScript 6) does not have immutable fields (constants), and immutability is one of the most important features of a functional language. It is the admission-price for multi-threaded programming, where only functions without side-effects are safe, preferably working on immutable data, always producing new data instead of modifying the existing.

Moreover, JS is a single-threaded language, there is no synchronized keyword!

A Function with No Name?

This is completely new in Java 8, although there have been "anonymous classes" (see Comparator above) since 1.1. Here is an example for a lambda that does exactly the same as the code above:

        Collections.sort(
            personList(),
            (person1, person2) -> person1.getName().compareTo(person2.getName())
        );

The lambda is the 2nd parameter, it is the Comparator. Lambdas are mostly described by the operator "->" (new in Java 8). In Java 8, the interface Comparator has been annotated as @FunctionalInterface. That way the Java compiler accepts the lambda as Comparator implementation, without declaring any class.

(person1, person2) -> person1.getName().compareTo(person2.getName())

As you can see, this has no name, but optional parameters. The person1.getName().compareTo(person2.getName()) lambda-body will be executed any time the sort() method compares two objects while sorting the collection.

The return value of a lambda is implicitly what the single statement produces, in this case a boolean. But there are also other shapes of lambdas, having several statements. Then you need an explicit return statement.

Different Shapes

All of the following examples do the same thing. This time I will use the Stream monad, although I did not yet explain that.

In short, the collection of persons is turned into a "stream". This hides the loop over all contained persons. It is necessary to hide the loop to optionally process it multi-threaded. You can call different methods on streams, one of them is map(), and this produces a new stream with members returned by the lambda. Another method would be parallel(), which enables multi-threading. (Be sure to have no side-effects in your lambdas when using that.) You can turn a stream into an array by calling toArray(), this would also end any parallel processing.

That means, the lambda inside the map() call is executed for each person in the Person-stream, and the resulting String-stream then contains the names of these persons (the name was returned by the lambda).

        Stream<String> names;
        
        names = personList()
            .stream()
            .map(Person::getName);  // class-bound function pointer
    
        names = personList()
            .stream()
            .map(person->person.getName());   // classic lambda
        
        names = personList()
            .stream()
            .map((Person person) -> person.getName());  // typed parameter
    
        names = personList()
            .stream()
            .map(person -> {    // more than one statement could be in braces
                return person.getName();
            });
        
        names = personList()
            .stream()
            .map(new Function<Person, String>() {   // old-style closure
                public String apply(Person person) {
                    return person.getName();
                }
            });

Person::getName

A direct reference to the getName method in class Person, done by the new Java 8 operator "::". That method will be called in context of the Person-instance given to the lambda as parameter.

person->person.getName()

Minimal shape, often seen in examples.
Without spaces, this is a misleading reminiscence of the C-language member-selection operator.

(Person person) -> person.getName()

Optionally you can give types to parameters. When not present, the compiler will deduce them.

person -> { .... }

When the lambda has more than one statement, you must enclose them into curly braces. Should it have a return, this must be explicitly given as statement.

new Function() { .... }

This is the old way how it also could be done in Java prior to 8, should it have the Function interface.

() -> ....

            // lambda with no parameters

            () -> System.out.println("Hello World")

If a lambda has no parameters, you must use "()" instead.

Context

What is available to a Java lambda, which fields and methods of its context can it see? Class member fields? Local outside variables? Parameters of the enclosing method?

All of them are available. Local variables and parameters of the enclosing method not even need to be final any more.

But the lambda body can not assign a new value to an outer local variable or an enclosing method parameter. These are implicitly final now. You will get a compile-error when trying such. It only can assign a new value to one of its own parameters.

But the lambda-body is allowed to assign new values to member fields of the enclosing class. Which is a contradiction to the functional principle of absent side-effects. That's life with Java :-)

Functional Interfaces

With Java 8 you always can create your own functional interfaces. Be sure that it is annotated by @FunctionalInterface, and only one method signature is contained (SAM = single abstract method), all others must be either default or static implementations, or overrides of Object methods like equals(). That single abstract method is the lambda's missing name!

Java 8 provides several functional interfaces for working with streamed lists and other monads, the most important being ....

Example Source Code

Click here to expand the example source code.

/**
 * Different ways to write lambdas.
 */
public class LambdaShapes
{
    public static void main(String[] args) {
        new LambdaShapes();
    }
    
    LambdaShapes() {
        Collections.sort(
            personList(),
            new Comparator<Person>() {
                public int compare(Person person1, Person person2) {
                    return person1.getName().compareTo(person2.getName());
                }
            }
        );
            
        Collections.sort(
            personList(),
            (person1, person2) -> person1.getName().compareTo(person2.getName())
        );
        
        Stream<String> names;
        
        names = personList()
            .stream()
            .map(Person::getName);  // class-bound function pointer
    
        names = personList()
            .stream()
            .map(person->person.getName());   // classic lambda
        
        names = personList()
            .stream()
            .map((Person person) -> person.getName());  // typed parameter
    
        names = personList()
            .stream()
            .map(person -> {    // more than one statement could be in braces
                return person.getName();
            });
        
        names = personList()
            .stream()
            .map(new Function<Person, String>() {   // old-style closure
                public String apply(Person person) {
                    return person.getName();
                }
            });
    }
    
    private List<Person> personList()    {
        final List<Person> persons = new ArrayList<>();
        persons.add(new Person("Tim"));
        persons.add(new Person("Jack"));
        persons.add(new Person("Joshua"));
        return persons;
    }
    

    private static class Person
    {
        private final String name;
        
        public Person(String name) {
            this.name = name;
        }
        
        public String getName() {
            return name;
        }
    }
    
}

Freitag, 12. August 2016

Remove Browser Ads by User CSS

On some Internet sites the advertisement weight is really heavy. You even have problems to find the content between the many colorful and image-rich ads, often even turning on sound when you hover them by chance.

So what to do about this? There are browser-add-ons, with names like AdBlock or AdBlockPlus, and they may be quite useful, but we consumers do not know what they actually do. I installed one of them, but the ads were still there. Maybe they just block the ads of Y-a-h-o-o, not those of G-o-o-g-l-e ?

So I decided to try this on my own. Just for those pages I use nearly every day, like dictionaries etc. With a little knowledge of CSS it should be possible to set any advertisement invisible. I've to take care to not remove something unintentionally, because these rules will be applied to ANY page!

But how can I merge my own CSS into a web page I navigate to with my browser? And where do I have to write that CSS?

Finding the Browser's User Style Sheet

The w3c CSS specification states that there is a user style sheet that takes a certain precedence in CSS "Cascading". So browser vendors should have implemented that opportunity, and I intend to use it for removing ads.

But it seems that browser vendors do not like to support that. There is a quite outdated web page that describes how to achieve user style sheets for different browsers:

https://en.wikibooks.org/wiki/Cascading_Style_Sheets/User_Style_Sheets

Here is my update (hope it will hold some time):

for Chrome, use the "Stylish" add-on
For Opera, use the "Custom Style Script" add-on.

For Firefox, you can provide a file named userContent.css at the profile-location in filesystem. Open a new browser tab and enter

about:profiles

in its address line. Following screenshot shows the resulting page.

The "Root Directory" is the profile location.

Open a native file manager, navigate to the given directory, create a sub-directory chrome, and in it a file called userContent.css. With some text editor like gedit

gedit $HOME/.mozilla/firefox/wk8nmy3w.default/chrome/userContent.css

add a simple test-rule to see if this file really works:

body {
    background-color: yellow;
}

Reloading a simple page (that declares no color on its own) showed no effect. But when I terminated the browser and launched it again, the page was yellow. Now I could start to write CSS ad-blockers.

Just restart Firefox after each modification of userContent.css, and any ad you rule out should be gone.

Ruling Out Ads by CSS

You can get informed about CSS rules on the Internet, or you could read my Blog articles about that. Especially you may need to know the attribute selectors.

Here are some of my ad-removal rules that relieved really a lot.

*[id ^= 'google_ads'],
#download-offer,
div.adsense-aside,
div[id ^= 'ulp-'],
div[id ^= 'adzerk'],
div[id ^= 'ad_ifrcontainer'],
div[id ^= 'div-gpt-ad-'],
a[href *= 'www.advertising.'],
a[href *= 'pubads.g.doubleclick.net'],
div#adv-banner,
div#topMenu + div#topBranding,
div#main > div#mainLeaderboard,
.clc-jobs-multi,
.hero-box.docs-announcement,
.article-adsponsor
{
    display: none !important;
}

All these rules concatenated by comma would set the element they describe to invisible. Mind that the last rule must not have a trailing comma.

How did I find these rules? Here is a step-by-step guide:

when no debugger is installed in the browser of your choice, find out how to install and do it now
open the debugger by F12, or by right-mouse-click popup-menu on the browser window, choosing the "Inspect" item
in top left corner of the debugger window you have the element-picker, click on it to get into picking-state
move the mouse over the advertisement and click on it, the debugger will select the according HTML element in its tree view
most likely you will need to search upwards in the element tree to the topmost element of the advertisement
hovering it will select its shape on the browser window, so you can check whether you are on the right level
when you found the element you want to remove, study its attributes, primarily id and class
when one of them expresses advertisement, write a rule about it into the user style sheet
hide any element by the directive display: none !important; inside the rule

Following screenshot shows the Firefox context menu (right mouse click):

So essentially you will spend some time to write rules about your favourite Internet pages, and then be relieved of the advertisements awaiting you there every day.

Example

Here is an example to illustrate the guide above.

I right-clicked the ad, chose popup menu item "Inspect", then the debugger showed me the details of the element.

How can I make this ad disappear? I would suggest following rule:

a[href *= 'pubads.g.doubleclick.net']
{
 display: none !important;
}

This would set any hyperlink (<a>, anchor) invisible that references an URL containing 'pubads.g.doubleclick.net'. You can see this reference in the href attribute of the <a> element. As the <img> element is contained in the link, also the image would disappear. Maybe the container would stay, but it would be empty.

When you found a good rule, maybe all other ads on the page also will have disappeared!

Hopefully I did not displease anybody by showing our browser user opportunities, and hopefully anyone will be able to see the web in his/her own way, at any time in future!

So, here's my current userContent.css:

.advertisement,
.SuperBannerGrosz,
#top-ads,
#middle-ads,
#bottom-ads,
#download-offer,
#nuggadLayer,
#cookieChoiceInfo,
#cookieprompt,
#privacypolicy,
div.IL_BASE,
div.adsense-aside,
div.adContainer,
div.top-ad-container,
*[id ^= 'google_ad'],
div[id ^= 'ulp-'],
div[id ^= 'adslot_'],
div[id ^= 'adzerk'],
div[id ^= 'ad_ifrcontainer'],
div[id ^= 'div-gpt-ad-'],
div[id ^= 'inArticlePlayer_'],
div[id ^= 'inArticleVideo_'],
div[id ^= 'adzone'],
a[href *= 'pubads.g.doubleclick.net'],
a[href *= 'www.advertising.'],
a[href *= 'facebook.com'],
a[href *= 'twitter.com'],
a[href *= 'plus.google.com'],
a[href *= 'linkedin.com'],
a[href *= 'reddit.com'],
a[href *= 'stumbleupon.com'],
a[href *= 'ycombinator.com'],
a[href *= '/ad1.adfarm1.adition.com/'],
a[href *= 'trends.revcontent.com/'],
a[href *= '/www.dpbolvw.net/'],
a#facebook_link,
#social-icons-desktop,
#ad_under_article,
div#snppopup-welcome,
div.cc_banner.cc_container.cc_container--open,
div#adv-banner,
div.video-ads,
div.f_sponsorbox > div,
div#AdSense,
div.adDisplay,
div.WerberahmenOben,
div#socialshareprivacy,
div#l-social-float,
div.specific-social,
div#topMenu + div#topBranding,
div#main > div#mainLeaderboard,
div.insert_post_ads,
div.lrshare_interfacebox,
div.lr_logininterface_container,
div#BannerWrap > div#StickyBanner,
div.pyv-afc-ads-container,
img[src ^= '/_storage/asset/'],
script[src *= 'intermarkets-d.openx.net'] ~ iframe[id ^= 'ox_'],
script[src *= 'adform.net'] ~ div[id ^= 'vidazoo_'],
.clc-jobs-multi,
.hero-box.docs-announcement,
.article-adsponsor,
.social-media-icons,
.a2a_kit,
.j-tos-update-banner.tos-update-banner,
.section.m-bottom-large,
.fixedAdv1,
.OUTBRAIN,
.ad_con,
.ad_con_desktop,
.ad_marker,
.eigenwerbung-wrapper,
iframe.rp-creative-iframe-wrapper,
iframe[id ^= 'iframe_creativeIframe'],
iframe[id ^= 'ad_creative_iframe_'],
iframe[src *= '.adfarm'],
iframe[src *= '/googleads.g.doubleclick.net/'],
iframe[src *= '/ad.doubleclick.net/'],
iframe[src *= '/pubads.g.doubleclick.net'],
iframe[src *= '/ads.w55c.net/'],
iframe[src *= '/eu.bidswitch.net/'],
iframe[src *= '/AdServer/'],
iframe[src *= '/MetaAdServer/'],
iframe[src *= '/mam.ad-balancer.'],
iframe[src *= '/servedby.flashtalking.com/'],
iframe[src *= '/streaming.grm-pro.com/'],
iframe[src *= '/streaming.ad-balancer.'],
iframe[src *= '/cdn.adspirit.de/'],
html[class *= 'adform-']
{
 display: none !important;
}

Donnerstag, 11. August 2016

Optimize Java Code Using Inner Classes

Java is an object-oriented language providing everything for code-reusage except multiple inheritance in implementation-classes. Nevertheless even skilled programmers sometimes have problems to avoid code duplications. One of the cases I saw very often is caused by the fact that it is impossible to return more than one value from a Java method (like we can pass several parameters to it).

Example

Imagine an application that needs to find the lowest index of several alternative strings in a text. For example, we need to find one of "select", "from", "where" in a query text, and we need to find the one that is leftmost. We want to know

the string that matched, and
its index.

To be able to do this more than once (to find all of "select", "from", "where" sequentially), we pass a search-start-index as parameter.

Here is an implementation that I called SearchMultipleTerms1.

public class SearchMultipleTerms1
{
    private String text;
    private String [] searchTerms;
    
    public SearchMultipleTerms1(String text, String [] searchTerms) {
        this.text = text;
        this.searchTerms = searchTerms;
    }
    
    public int getLowestIndex(int searchStartIndex) {
        final String matchingTerm = getMatchingTerm(searchStartIndex);
        return text.indexOf(matchingTerm, searchStartIndex);
    }
    
    public String getMatchingTerm(int searchStartIndex) {
        return indexOf(searchStartIndex);
    }
    
    private String indexOf(int startIndex) {
        int leftMostIndex = Integer.MAX_VALUE;
        String leftMostToken = null;
        for (String searchTerm :  searchTerms)    {
            final int i = text.indexOf(searchTerm, startIndex);
            if (i >= 0 && i < leftMostIndex)    {
                leftMostIndex = i;
                leftMostToken = searchTerm;
            }
        }
        return leftMostToken;
    }

}

The constructor receives the text to search and the search-terms. It provides the search through the public methods getLowestIndex() and getMatchingTerm(). Mind that the indexOf() implementation can not break the loop at the first match, because it needs to find the match with the lowest index, thus it needs to try all search-terms.

Now this looks quite straightforward, but it has a small code-smell (that I use here to demonstrate how useful inner classes are:-). The implementation of getLowestIndex() first searches the matching term, and then, again, estimates its index in text, although the preceding search already did this. This is not only a code-duplication, it is also a performance-leak, because text.indexOf() is called twice.

Reason for that problem is that the indexOf(startIndex) method can return just one value, not both index and term.

Refactoring

To fix this I create an inner class that holds both index and the matched term. I use a static inner class, because I do not need access to instance-fields of the outer object.

    private static class SearchTermAndIndex
    {
        public final int index;
        public final String searchTerm;
        
        public SearchTermAndIndex(int index, String searchTerm) {
            this.index = index;
            this.searchTerm = searchTerm;
        }
    }

Mind that this class does not violate encapsulation principles because the member fields are public. The final modifier makes them immutable, thus nothing can change them after construction, and we don't need to provide public getter methods.

Now the indexOf() method must be re-written to return an instance of that class.

    private SearchTermAndIndex indexOf(int startIndex) {
        int leftMostIndex = Integer.MAX_VALUE;
        String leftMostToken = null;
        for (String searchTerm :  searchTerms)    {
            final int i = text.indexOf(searchTerm, startIndex);
            if (i >= 0 && i < leftMostIndex)    {
                leftMostIndex = i;
                leftMostToken = searchTerm;
            }
        }
        return new SearchTermAndIndex(
                (leftMostToken == null) ? -1 : leftMostIndex,
                leftMostToken);
    }

Of course also the callers of indexOf() must be re-written to use the inner class.

Here is the complete new implementation, called SearchMultipleTerms2.

public class SearchMultipleTerms2
{
    private String text;
    private String [] searchTerms;
    
    public SearchMultipleTerms2(String text, String [] searchTerms) {
        this.text = text;
        this.searchTerms = searchTerms;
    }
    
    public int getLowestIndex(int searchStartIndex) {
        return indexOf(searchStartIndex).index;
    }
    
    public String getMatchingTerm(int searchStartIndex) {
        return indexOf(searchStartIndex).searchTerm;
    }
    
    
    private static class SearchTermAndIndex
    {
        public final int index;
        public final String searchTerm;
        
        public SearchTermAndIndex(int index, String searchTerm) {
            this.index = index;
            this.searchTerm = searchTerm;
        }
    }
    
    
    private SearchTermAndIndex indexOf(int startIndex) {
        int leftMostIndex = Integer.MAX_VALUE;
        String leftMostToken = null;
        for (String searchTerm :  searchTerms)    {
            final int i = text.indexOf(searchTerm, startIndex);
            if (i >= 0 && i < leftMostIndex)    {
                leftMostIndex = i;
                leftMostToken = searchTerm;
            }
        }
        return new SearchTermAndIndex(
                (leftMostIndex == Integer.MAX_VALUE) ? -1 : leftMostIndex,
                leftMostToken);
    }

}

Summary

Don't hesitate to create inner classes, that's the way it should be in an object-oriented language. Make them static by default, unless you need the fields of the outer object.

fritzthecat-blog

Blog-Archiv

Sonntag, 21. August 2016

Monads in Java

What Is a Monad?

Why Monads?

Null Object

Collections

How Do Monads Work?

Wrapping 0-1 Object

map() versus flatMap()

Wrapping 0-n Objects

Parallel Processing?

Example Source

Summary

Freitag, 19. August 2016

Lambdas or Closures in Java

What Is a Lambda?

We Don't Have Functions in OO

A Function with No Name?

Different Shapes

Person::getName

person->person.getName()

(Person person) -> person.getName()

person -> { .... }

new Function() { .... }

() -> ....

Context

Functional Interfaces

Example Source Code

Freitag, 12. August 2016

Remove Browser Ads by User CSS

Finding the Browser's User Style Sheet

Ruling Out Ads by CSS

Example

Donnerstag, 11. August 2016

Optimize Java Code Using Inner Classes

Example

Refactoring

Summary

Über mich

`map()` versus `flatMap()`