Eureka moment on design patterns for functional programming

Understanding design patterns for object-oriented programming made my life easier as a Java programmer. So I have been looking for a comparable book for functional programming ever since my sojourn into this age-old paradigm. It looks as though I'm not the only one looking too. But the thing is, I think I've just had a revelation of sort.

There is one and only one guideline to designing functional architectures -- Keep it simple. Simple as in keeping your functions having a single purpose only. Simple as in work with the data directly and don't conjure unnecessary intermediaries. Simple, as elaborated by Rich Hickey in his talk, Simple Made Easy. Much of this is conveyed in Bloch's Effective Java:

item \#5 \– Avoid creating unnecessary objects
item \#13 \– minimize accessibility of classes and members

, for examples.

As Majewski said in a stackoverflow reply (Update 2013 – question no longer available on stackoverflow),

The patterns movement started because object-oriented programming 
was often turning into spaghetti objects ... Functional programming 
is a restricted style of programming, so it didn't need to grow a set
of restricted conventions to limit the chaos.

As such, there is no design pattern book for functional programming. I didn't get that earlier this year. But something clicked recently. During the past few months, I've been doing some consulting and open source projects solving algorithmic problems with Clojure.

One of the problems in a project that I was faced with this week is calculating the occurrence of each distinctive element within a list of elements. Say we have a list, coll = ("orange", "bottle", "coke", "bottle"). The output would be something like [("orange", "bottle", "coke") (1 2 1)]

This is my first solution.

(defn eval-decompose
  [coll]
  (let [super-d  (distinct coll)
        freqs    (loop [ps  []
                        d   super-d]
                   (if (seq d)
                     (let [c  (count (filter (partial = (first d)) coll))]
                       (recur (conj ps c) (rest d)))
                     ps))]
    (map #(vector %1 %2) super-d freqs)))

The specs are not exactly as I described but the concept remains. What I did is to use tail calls (it's supposed to be fast, isn't it?) to aggregate each counter to produce a vector of counts. Then I map each pair of fragment with its corresponding count to generate a final output collection. Sounds overly complicated, doesn't it?

This is the first warning of a bad functional design. For a collection of 30,000 items, this function took 11 minutes to compute on my notebook. This looks like a good place to exploit the parallel nature of this problem.

Specifically, the counting of each fragment is independent of other fragments. Thus, there's no need for the program to wait for one fragment to finish to process the next. I simplified the program to remove this inherent assumption of procedural processing. Here is the gist of the refactored code where each function only does one job. Since the processing are modularised, I can parallelize the algorithm easily with the use of pmap instead of map on the last line as shown below.

(defn match-count
" Given key, k, returns number of occurrences of k in collection, coll.
"
  [k coll]
  (let [match?  (fn [i x]
                  (if (= k x)   ;; closure on k
                    (inc i)
                    i))]
    (reduce match? 0 coll)))

(defn calc-counts
" Returns a list of counts for the occurrences of each key of keys, ks,
  within the collection, coll.
"
  [ks coll]
  (pmap #(match-count % coll) ks))

I've split the first function into 3 functions (2 shown here). As Hickey said in his talk, simplifying can often produce more, not less, functions. Yet, the program is not only easier to read and runs in less than a minute. An order of magnitude faster! There are still lots for me to learn. I want to find more challenging projects to push my own limits. But rather than solving arbitrary problems, I prefer to tackle real-world challenges. So if you know of anyone that can benefit from collaborating with a functional developer to build robust and scalable software, please pass along my contact.

Follow up: Kevin Lynagh showed me three better ways of doing this in a follow-up post – Algorithmimc ownage. Humbled.

Software design, trading development process, and Ikea

I spent a few months between 2010 and 2011 not on the market, not on coming up with new strategies, but on developing a software framework for my trading system. For my trading strategy development, I make use of a continuous systems development life cycle process from the engineering realm. Starting from planning, to analysis, to design, to implementation, to maintenance, and finally, back to planning, and so on for each and every new concept that I have. (see Sir James Dyson's guest column on Wired) Imagine that if you need to write a new strategy file for each and every idea and for each of its design iteration. Pretty soon you'll have many strategy files and a lot of boilerplate codes. What if you found a bug or figured out some enhancements to a particular component of your strategy? Then you'll have to sift through all those files and make the changes to all those relevant codes to make the update across the board. Hopefully, this isn't the approach you're taking. Yet, a few trading API that I have used before inherently encourage or even limit you (EasyLanguage and MQL4) to this archaic procedural programming paradigm. Luckily, Java is an object-oriented programming language. However, it is only as object-oriented as you make it to be. JForex, the trading API that I use, actually runs on the edge of breaking this object-orientedness as I have recently lamented. The bad news for the beginning developer is that practically every published strategy which I have seen or guilty of posting myself are illustrations of what not to do with software design. In that there's none in it whatsoever. Think of common published strategies like the Ikea mini-model showrooms. Everything is crammed into a tiny space but it is a simple way to convey the gist of a room design (read: the trading algorithm). However, they are not built for practical use. Real strategies are like suites in an apartment building. There is an underlying architectural commonality for easy management and maintenance while enabling diversity in the strategies. My recently completed software framework dramatically made my life easier in the tasks of implementation and maintenance. Software maintenance, in particular, is most often needlessly and overly complicated in trading strategy development because of the lack of a good design. Because at the end of the day, most programmers spend the majority of their time debugging, maintaining, and extending their code. A true object-oriented design decouples all its components so that you can use a divide and conquer approach with no repetition of work. Do you have a development process in place? I'd like to hear how you tackle this problem.

More trouble with JForex IIndicators

JFUtil offers [an elegant way of requesting indicators from the JForex IIndicators interface][]. But the new design for JForex IIndicators through JFUtil is only fixing one side of the problem. There is still the question of what to do with the multiple return types from the calculations. In JForex, indicators like RSI returns a one-dimensional array. Other indicators like Stochastic Oscillator returns a two-dimensional array. And some other ones return three or more dimensional arrays. Java doesn't allow overloading a method with multiple return types, as the program wouldn't know what to expect. So how can we get rid of coding mess like this in our JForex strategy? [java] Stochastic stochBean = IndicatorBeanFactory.getStochastic(); Object[] objs = Indicating.calculateMultiDimension(instrument, Period.ONE_MIN, stochBean, 1); double[][] sto = new double[2][]; sto[0] = (double[])objs[0]; // %K values sto[1] = (double[])objs[1]; // %D values [/java] The default calculation method in JForex returns a generic Object array. It's up to the programmer to know what to expect from the call and cast the Object into something useful. Obviously, this is a recipe for programming headaches and runtime errors. Having said this, one of the biggest benefits of JForex is that it uses a standard programming language like Java. So if there's something that you don't like about the API, you can probably change it (e.g. Facade pattern). This is the purpose of JFUtil to some extent, to simplify the JForex API behaviour. In any case, I'm sure other Java programmers have faced this problem before and have come up with good solutions for it. A search on stackoverflow.com doesn't yield a quick fix solution at first glimpse. My guess is that this require leveraging the knowledge about the program structure. We know in advance what dimension of the calculation result we can expect based on the indicator itself, perhaps I can use something like a Command pattern to choose a calculation sub-routine and then return a Map object with named values? I have yet to try implementing this. I am open to design suggestions to encapsulate multiple return values through a single interface. So that any indicator bean can use the same calculation interface with an easy to use output. In the mean time, getting multi-dimensional indicators results through JFUtil 2.1.2 isn't pretty.

[an elegant way of requesting indicators from the JForex IIndicators interface]: http://www.quantisan.com/conjuring-beans-to-simplify-jforex-iindicators/

Conjuring beans to simplify JForex IIndicators

The JForex API is not perfect. Like any application programming interface (API), some parts of the JForex API is better designed than others. One of the most gripping problems with JForex is its use of technical analysis (TA) indicators. Here's an example of using an exponential moving average, one of the most simplest indicators. ema(instrument, Period.TEN_SECS, OfferSide.BID, IIndicators.AppliedPrice.MEDIAN_PRICE, 14, 0); And the corresponding javadoc explaining the parameters, Parameters: instrument instrument of the bar period period of the bar side bid or ask side of the bar appliedPrice type of input data timePeriod time period value shift number of candle back in time staring from current bar. 0 - current bar (currently generated from ticks), 1 - previous bar (last formed bar), 2 - current bar minus 2 bars and so on It's not intuitive but it's usable. Recall that in my JForex programming tutorial, I suggested using the JForex API javadoc to find out information about the API. However, for the case of IIndicators, if you take a look at its javadoc, you will only be led to more confusion by the hundred or so mysterious indicator methods with cryptic parameter names. In JForex API, if you want to calculate an indicator, you need to look through a long list of abbreviated method names in the javadoc. Many of which are not easy to decipher. Secondly, you need to figure out the generic parameters and set them correctly, which is different for almost every indicator. Lastly, as there is no standard interface for indicators, you need to hardcode these into your strategy with little flexibility. In contrast, here's the same call using JFUtil 2.1.0 to get an EMA value. It has notably more lines of code. It is designed deliberately so using an object oriented approach by encapsulating the indicator parameters into a bean object and abstracting the calculation into a single generic function call. [java] // get an EMA indicator value by building an indicator bean MovingAverage maBean = IndicatorBeanFactory.getMovingAverage(); // then sets its parameters with obvious method names maBean.setAppliedPrice(IIndicators.AppliedPrice.MEDIAN_PRICE) .setMAType(IIndicators.MaType.EMA) .setWidth(14); // all of these are optional parameters // feed the bean into a generic calculation method to get the result double ema = Indicating.calculate(instrument, Period.ONE_MIN, maBean); [/java] With JFUtil, to calculate an indicator, you create an indicator bean with the intuitive bean name that corresponds with the full name of an indicator. For example, a moving average will be MovingAverage. Then you set the indicator-specific parameters using clearly defined methods with useful javadoc descriptions. One method for one parameter. Lastly, you feed this indicator bean into a generic calculation function to get your value. It took some thinking to abstract and generalize such a rigidly structured API. I am quite pleased with this new design as the current JFUtil opens up a lot of design flexibility. Such as dynamic indicator selection with genetic algorithm and interchangeable indicators use for runtime adaptive algorithms.