Tuesday, September 15, 2009

What would Linq look like in Java (2)

After thinking a bit more about my recent attempt at bringing Linq to Java and looking deeper into the JGA library, one thing has started to bug me: the Queryable<T> interface only lets you adapt operations by overriding specific methods. Although this is powerful, it rules out the main use case for extending the Linq mechanism: an overload for a specific type T of Iterable<T>. In .NET this is possible through extension methods, a feature that does not exist in Java. You would simply provide a new method for your specific Iterable<YourType>.

Basically, extension methods are static methods that get glued to a type by some compiler magic and look like instance methods. In Java you can call static methods without a qualifying type by importing them statically. They then look and feel like global functions in C++. Since I have seen how the Hamcrest matchers use this feature in JUnit 4.x to create a somewhat fluent syntax, I’m wondering whether it could be used to create a Linq-like syntax:

import static xox.foo.Queryable.*; // defines select, …

UnaryFunction<Float, Integer> xform = …;
UnaryFunction<Integer, Boolean> lessThan5 = …;

Iterable<Integer> i2 = 
    from(collection, select(xform, where(lessThan5)));

With this design, a contributor would be able to provide their own type-specific overloaded implementation of an operation. It is really interesting to think about a possible implementation for this. How can the iterators be chained correctly when the static methods are evaluated in exactly the opposite order to the one in which the iterator chain is expected to run?
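The ordering problem can be made visible with a small sketch. Java evaluates the arguments of a call before the call itself, so the innermost static method runs first. The class EvalOrderDemo and its string-based methods are hypothetical stand-ins that only record the call order; they are not part of the Queryable design.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical demo: records the order in which nested static calls run. */
final class EvalOrderDemo {
    static final List<String> calls = new ArrayList<>();

    static String where(String predicate) {
        calls.add("where");
        return "where(" + predicate + ")";
    }

    static String select(String function, String next) {
        calls.add("select");
        return "select(" + function + ") -> " + next;
    }

    static String from(String source, String chain) {
        calls.add("from");
        return source + " -> " + chain;
    }

    /** Runs the nested expression and returns the recorded call order. */
    static List<String> run() {
        calls.clear();
        from("collection", select("xform", where("lessThan5")));
        return new ArrayList<>(calls);
    }
}
```

Calling EvalOrderDemo.run() records where, then select, then from, while the data is supposed to flow from the collection through select and then through where.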

I first experimented with passing a LinkedList<Iterable<T>> through the method chain, but it turned out that I had to know the specific Iterable type to link the iterators together. So I switched to providing a simple base class that implements Iterable<T>. It lets its subclasses access the next and previous Iterable.

abstract class LinkedIterable<TIn, TOut> implements Iterable<TOut> {

    LinkedIterable<?, TIn> getPrev() { /* … omitted */ }
    LinkedIterable<TOut, ?> getNext() { /* … omitted */ }
    public abstract Iterator<TOut> iterator();

}

A typical implementation would use it like this:

class SelectIterable<TIn, TOut> extends LinkedIterable<TIn, TOut> {

    private final UnaryFunction<TIn, TOut> select;

    public SelectIterable(LinkedIterable<TOut, ?> next,
                          UnaryFunction<TIn, TOut> select) {
        super(next);
        this.select = select;
    }

    public Iterator<TOut> iterator() {
        return new SelectIterator<TIn, TOut>(super.getPrev().iterator(),
                                             select);
    }
}

This linked-list-like construct gets passed through the method chain. Each method takes one as input, creates a specific new implementation, adds it to the front and returns the new head of the list.

public static <TIn> LinkedIterable<TIn, TIn>
    where(UnaryFunction<TIn, Boolean> sel,
          LinkedIterable<TIn, ?> dec) {

    return new WhereIterable<TIn>(dec, sel);
}
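The iterator behind WhereIterable is not shown here. One way it might work is a lazy filter that looks ahead for the next matching element. The following is only a sketch under assumptions: the name WhereIterator is hypothetical, and java.util.function.Predicate stands in for UnaryFunction<T, Boolean>.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical sketch: lazily skips upstream elements that fail the predicate. */
final class WhereIterator<T> implements Iterator<T> {
    private final Iterator<T> upstream;
    private final Predicate<T> predicate; // stand-in for UnaryFunction<T, Boolean>
    private T pending;                    // next matching element, valid only if ready
    private boolean ready;

    WhereIterator(Iterator<T> upstream, Predicate<T> predicate) {
        this.upstream = upstream;
        this.predicate = predicate;
    }

    @Override
    public boolean hasNext() {
        // Advance until a match is buffered or the upstream is exhausted.
        while (!ready && upstream.hasNext()) {
            T candidate = upstream.next();
            if (predicate.test(candidate)) {
                pending = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = pending;
        pending = null;
        return result;
    }
}
```

The lookahead is needed because hasNext() cannot answer without consuming upstream elements until one passes the predicate.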

The last method in the chain to be called is the from function. It simply has to add the input range to the LinkedIterable<T> chain, run to the tail of the list and return it. As a LinkedIterable<T> is an Iterable<T>, this works seamlessly.
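To make the linking and the role of from concrete, here is a minimal sketch under several assumptions. java.util.function.Function stands in for UnaryFunction, the previous element is typed as a plain Iterable<TIn> so that the source collection can be attached directly, and only select is implemented to keep it short. The setPrev method, the unchecked cast in from and the QueryDemo class are illustrative choices, not necessarily what a real implementation would do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

/** Chain node that knows its upstream (prev) and downstream (next) neighbour. */
abstract class LinkedIterable<TIn, TOut> implements Iterable<TOut> {
    private Iterable<TIn> prev;                 // assumption: plain Iterable upstream
    private final LinkedIterable<TOut, ?> next; // null for the tail

    LinkedIterable(LinkedIterable<TOut, ?> next) {
        this.next = next;
        if (next != null) next.setPrev(this);   // new node becomes the head
    }
    void setPrev(Iterable<TIn> prev) { this.prev = prev; }
    Iterable<TIn> getPrev() { return prev; }
    LinkedIterable<TOut, ?> getNext() { return next; }
}

final class SelectIterable<TIn, TOut> extends LinkedIterable<TIn, TOut> {
    private final Function<TIn, TOut> select;

    SelectIterable(LinkedIterable<TOut, ?> next, Function<TIn, TOut> select) {
        super(next);
        this.select = select;
    }

    @Override
    public Iterator<TOut> iterator() {
        final Iterator<TIn> in = getPrev().iterator();
        return new Iterator<TOut>() {
            @Override public boolean hasNext() { return in.hasNext(); }
            @Override public TOut next() { return select.apply(in.next()); }
        };
    }
}

final class Queryable {
    private Queryable() {}

    static <A, B> LinkedIterable<A, B> select(Function<A, B> fn) {
        return new SelectIterable<A, B>(null, fn);
    }
    static <A, B> LinkedIterable<A, B> select(Function<A, B> fn, LinkedIterable<B, ?> next) {
        return new SelectIterable<A, B>(next, fn);
    }

    /** Attaches the source to the head, walks to the tail and returns it. */
    @SuppressWarnings("unchecked") // the node type does not record the tail's element type
    static <T, R> Iterable<R> from(Iterable<T> source, LinkedIterable<T, ?> head) {
        head.setPrev(source);
        LinkedIterable<?, ?> tail = head;
        while (tail.getNext() != null) tail = tail.getNext();
        return (Iterable<R>) tail;
    }
}

final class QueryDemo {
    static List<String> run() {
        List<Float> source = Arrays.asList(1.5f, 7.2f);
        Function<Float, Integer> truncate = f -> f.intValue();
        Function<Integer, String> label = i -> "#" + i;
        Iterable<String> labels =
            Queryable.from(source, Queryable.select(truncate, Queryable.select(label)));
        List<String> out = new ArrayList<>();
        for (String s : labels) out.add(s);
        return out;
    }
}
```

In this sketch the result type of from is inferred from the assignment and not checked by the compiler, and a chain can only be attached to one source because setPrev mutates the head.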

Now I just have to think about whether this syntax reads better than my former approach. Especially for longer statements, all the methods and their parentheses might become unwieldy:

Grouping managersBySalary =
    from(employes, where(isManager, select(toManager, groupBy(salary))));
