Numerical calculation performance matters if you're calculating gigantic
datasets or have limited processing capability. I fall into the latter
category, as I intend to run my trading system on a free cloud server in
the future. There is plenty of research pushing the boundary of
high-performance numerical computing in Java (see refs. 1 and 2). Most
of the latest and greatest is beyond the scope of my quantitative work;
what I seek is performance with simplicity and efficiency. That is how I
became aware of the Colt Project by CERN, and its multi-threaded branch,
Parallel Colt. Colt is a set of open-source Java libraries for
scientific and technical computing problems characterized by large data
sizes that demand high performance and a low memory footprint. Just what
I needed. Out of curiosity, I tested its implementation of double arrays
and compared its performance to the standard Java libraries.
Method
A random sample of one million decimal values (e.g. prices) is generated
and stored in four Java data structures. Each data structure holds the
same one million values in its own memory space (a setup sketch follows
the list).
- Primitive double array
- ArrayList of Double objects
- Colt
- Parallel Colt
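To make the setup concrete, here is a minimal sketch of how the four
structures might be populated. Colt's primitive list class is
cern.colt.list.DoubleArrayList; the Parallel Colt package name shown in
the comment, the seed, and the value range are my assumptions, not the
exact benchmark harness.

    import java.util.ArrayList;
    import java.util.Random;
    import cern.colt.list.DoubleArrayList;        // Colt
    // Parallel Colt's equivalent lives in a different package:
    // import cern.colt.list.tdouble.DoubleArrayList;

    public class Setup {
        static final int N = 1000000;

        public static void main(String[] args) {
            Random rng = new Random(42);          // one of the three seeds
            double[] primitive = new double[N];
            ArrayList<Double> boxed = new ArrayList<Double>(N);
            DoubleArrayList colt = new DoubleArrayList(N);
            for (int i = 0; i < N; i++) {
                double price = 100.0 * rng.nextDouble();
                primitive[i] = price;     // no boxing
                boxed.add(price);         // autoboxes to Double
                colt.add(price);          // stays primitive internally
            }
        }
    }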
Three tests were run; a minimal sketch of all three follows the list.
- A simple diff function: a traversal of the million data points,
  calculating array[i] - array[i - shift] for all i from shift to
  1,000,000, where shift is a random integer from 0 to 50. This type
  of calculation is commonplace in technical analysis indicators.
- A sort function, sorting the million data points in ascending order.
  This comes in handy for some statistical analysis. For the primitive
  array I am using java.util.Arrays.sort(); otherwise I am using the
  sort() method provided by the other classes.
- A binary search function on the sorted arrays. For the primitive
array, I am using java.util.Arrays.binarySearch().
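Here is a minimal sketch of the three tests against the primitive
array. The timing harness is my own simplification, and since the
number of lookups in the search test isn't stated, I assume one binary
search per element (a single lookup would finish in microseconds, not
hundreds of milliseconds).

    import java.util.Arrays;
    import java.util.Random;

    public class Benchmarks {
        public static void main(String[] args) {
            final int n = 1000000;
            Random rng = new Random(1);           // one of the three seeds
            double[] a = new double[n];
            for (int i = 0; i < n; i++) a[i] = 100.0 * rng.nextDouble();

            // Test 1: diff. a[i] - a[i - shift] for i = shift .. n-1.
            int shift = rng.nextInt(51);          // random shift in [0, 50]
            double[] diff = new double[n];
            long t0 = System.nanoTime();
            for (int i = shift; i < n; i++) {
                diff[i] = a[i] - a[i - shift];
            }
            System.out.printf("diff:   %.1f ms%n", (System.nanoTime() - t0) / 1e6);

            // Test 2: sort in ascending order.
            t0 = System.nanoTime();
            Arrays.sort(a);
            System.out.printf("sort:   %.1f ms%n", (System.nanoTime() - t0) / 1e6);

            // Test 3: binary search on the sorted array, one lookup per element.
            t0 = System.nanoTime();
            for (int i = 0; i < n; i++) {
                Arrays.binarySearch(a, a[rng.nextInt(n)]);
            }
            System.out.printf("search: %.1f ms%n", (System.nanoTime() - t0) / 1e6);
        }
    }

The Colt version is nearly identical: as far as I can tell,
DoubleArrayList inherits sort() and binarySearch(double) from its
abstract base class, so list.sort() and list.binarySearch(key) stand in
for the Arrays calls.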
Each test is run three times with different random seeds and the results
are averaged. The system runs Java SE build 1.6.0_22-b04 and Linux
kernel 2.6.35-22 on an Intel Core 2 Duo 6300 @ 1.86 GHz.
Results
All times are in milliseconds, averaged over the three runs.

    Test      Primitive    ArrayList    Colt     Parallel Colt
    diff          9.3           27       14            15
    sort        354.7        1,316      270.3         273
    search      559.7      1,561.7     575.7        551.3
Discussion
As this isn't a rigorous benchmark, I wouldn't read too much into the
exact numbers. The take-home message is that Colt's performance is at
least within the same order of magnitude as a primitive array, whereas a
standard ArrayList is an order of magnitude worse. However, where Colt
really shines is in its methods for handling matrices. It offers plenty
of convenient functions for matrix calculations while delivering good
performance. To work with matrices in Java, you either use third-party
libraries such as Colt or manage a bunch of nested for loops, which
isn't pretty.
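For instance, a matrix product that would otherwise take three nested
loops is a one-liner with Colt. This is a minimal sketch assuming Colt's
cern.colt.matrix API (DoubleFactory2D, Algebra, viewDice()); consult the
javadoc for your version.

    import cern.colt.matrix.DoubleFactory2D;
    import cern.colt.matrix.DoubleMatrix2D;
    import cern.colt.matrix.linalg.Algebra;

    public class MatrixDemo {
        public static void main(String[] args) {
            // Dense matrices filled with uniform random values.
            DoubleMatrix2D a = DoubleFactory2D.dense.random(3, 4);
            DoubleMatrix2D b = DoubleFactory2D.dense.random(4, 2);

            // Matrix product in one call instead of three nested for loops.
            DoubleMatrix2D c = Algebra.DEFAULT.mult(a, b);   // 3 x 2 result

            // Views are cheap: viewDice() transposes without copying.
            System.out.println(c);
            System.out.println(c.viewDice());
        }
    }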
References
[1] Moreira et al., "Java Programming for High-Performance Numerical
Computing," IBM Systems Journal, vol. 39, no. 1, pp. 21-56, 2000.
[2] Wendykier and Nagy, "Parallel Colt: A High-Performance Java Library
for Scientific Computing and Image Processing," ACM Transactions on
Mathematical Software, vol. 37, no. 3, September 2010.