It should be a given that autoboxing carries a penalty, and that data structure choice (when it comes to lists) is important. But, how much does it really matter? I wrote a quick test to find out. Skim for the highlighted TLDRs if you don’t want to read details.
I used an ArrayList
All tests were run from Intellij Idea, Sun Java 1.0.6_26-b03, running on 32 bit Ubuntu 11.10. My desktop CPU is an Intel 2.5Ghz quad core, although the tests only involved a single core.
The TIntArrayList (the Trove equivalent of ArrayList for ints) provides the methods setQuick/getQuick, which allow you to set/get indexed values of the backing array; these only work if you know the overall size of the array ahead of time, but avoid the overhead of the array size check implicit in the add/get methods of both TIntArrayList and ArrayList. I included them in the test; we’ll see how they perform shortly.
Each of the tests was constructed as follows:
public double[] test1(int iterations, int[] randoms, boolean seek) {
final Mean addTimes = new Mean();
final Mean seqTimes = new Mean();
final StopWatch s = new StopWatch();
for (int test = 0; test < 10; test++) {
s.start();
final TIntArrayList tal = new TIntArrayList(iterations);
for (int i = 0; i < iterations; i++) {
tal.add(i);
}
s.stop();
addTimes.increment(s.getTime());
s.reset();
if (seek) {
s.start();
for (int i : randoms) {
int j = tal.get(i);
}
s.stop();
seqTimes.increment(s.getTime());
s.reset();
}
}
return new double[]{addTimes.getResult(), seqTimes.getResult()};
}
Each test was run 10 times, and the mean of the results returned (the Mean class is provided by commons-math). I included the memory allocation time in each measurement so as not to unfairly penalize the linked list (although at the end, I’ll also show you times without array allocation time for comparison.) You will see why I needed the seek boolean next.
Results #1
Here’s the first set of results.
You can see that the LinkedList.get and TIntLinkedList.get timings (the blue lines at mid-left) quickly get out of control; average time for 125,000 random access lookups (and then un-boxing of each value) using java.util.LinkedList was 20.4 seconds! TIntLinkedList, Trove’s library, fared better; the equivalent average was 7.1 seconds. Like most programmers I’m impatient, so I capped the LinkedList tests at 125,000 iterations.
You can also see that the right of the chart is dominated by linked list insertions (both Trove’s and java.util’s.) Recall that I am including the allocation of the array based alternatives in the timings, so clearly linked list insertion involves extra work. What exactly are they doing? Let’s look at the source. Here’s what java.util.LinkedList does when you call add(Integer i):
public void add(E e) {
checkForComodification();
lastReturned = header;
addBefore(e, next);
nextIndex++;
expectedModCount++;
}
final void checkForComodification() {
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
}
}
private Entry<E> addBefore(E e, Entry<E> entry) {
Entry<E> newEntry = new Entry<E>(e, entry, entry.previous);
newEntry.previous.next = newEntry;
newEntry.next.previous = newEntry;
size++;
modCount++;
return newEntry;
}
All needed, clearly, to support the data structure. However, it’s a lot of overhead (both in terms of memory and CPU) if you’re operating on a large dataset.
(If you’re curious about when LinkedLists actually do make sense to use, see this useful StackOverflow discussion.)
Results #2
Let’s now ignore linked lists, and focus in on performance differences among the array based alternatives.
Here the effects of using Integer objects as opposed to primitives are clear. The chart is dominated by the timings of a) allocation and buildout of the initial array backing the ArrayList (purple), followed by b) the random access lookup operations on the ArrayList (blue). Allocation and insertion of 32,000,000 ints into an ArrayList
@Test
public void testArrayList() {
final int size = 32000000;
final Mean time1 = new Mean();
final Mean time2 = new Mean();
final Mean time3 = new Mean();
final Mean time4 = new Mean();
final StopWatch s1 = new StopWatch();
for (int i = 0; i < 10; i++) {
s1.start();
final ArrayList<Integer> ar = new ArrayList<Integer>(size);
s1.stop();
time1.increment(s1.getTime());
s1.reset();
s1.start();
for (int j = 0; j < size; j++) {
ar.add(j);
}
s1.stop();
time2.increment(s1.getTime());
s1.reset();
s1.start();
final int[] ir = new int[size];
s1.stop();
time3.increment(s1.getTime());
s1.reset();
s1.start();
for (int j = 0; j < size; j++) {
ir[j] = j;
}
s1.stop();
time4.increment(s1.getTime());
s1.reset();
}
System.out.println(time1.getResult());
System.out.println(time2.getResult());
System.out.println(time3.getResult());
System.out.println(time4.getResult());
}
Results:
- ArrayList allocation: 167.29ms
- Arraylist build: 11610.4ms
- int[] allocation: 587.0ms
- int[] build: 123.8ms
These are interesting results.
First of all, note that allocation of the ArrayList
However, the more pronounced contrast is between the actual build out of the two arrays: building a primitive array is approximately 100x as fast as the object equivalent! This is more obviously understandable. The Integer objects are not contiguous in memory, and so for each, the JVM has to find and allocate a new location, and then write the Integer value to it; by contrast, the primitive array already has allocated the memory space for the value. Integer objects are also larger than primitives, so more memory is being written to. Finally, the value of each Integer has to be converted (Autoboxed) from int.
Just how much bigger are Integers than ints? According to Java engineer John Rose, in this highly informative blog post:
In the case of a Java Integer on a 32-bit Hotspot JVM, the 32-bit payload (a Integer.value field) is accompanied by a 96 additional bits, a mark, a klass, and a word of alignment padding, for a total of 128 bits. Moreover, if there are (say) six references to this integer in the world (threads plus heap), those references also occupy 192 bits, for a total of 320 bits. On a 64-bit machine, everything is twice as big, at least at present: 256 bits in the object (which now includes 96 bits of padding), and 384 bits elsewhere.
And, the Java guide has this to say about Autoboxing:
The performance of the resulting list is likely to be poor, as it boxes or unboxes on every get or set operation. It is plenty fast enough for occasional use, but it would be folly to use it in a performance critical inner loop.
So when should you use autoboxing and unboxing? Use them only when there is an “impedance mismatch” between reference types and primitives, for example, when you have to put numerical values into a collection. It is not appropriate to use autoboxing and unboxing for scientific computing, or other performance-sensitive numerical code. An Integer is not a substitute for an int; autoboxing and unboxing blur the distinction between primitive types and reference types, but they do not eliminate it.
- Autoboxing is expensive
- ArrayLists of Integers are relatively efficient to initally allocate, but relatively expensive to assign values to, compared to primitive alternatives
- Integer objects are significantly larger than int primitives
Results #3
Alright, let’s eliminate the ArrayLists and Integers, and focus only on primitive ints.
It’s worth pointing out, since it’s not obvious, that the green (tintarraylist.getQuick) line almost exactly overlays the light blue (int[] get) line.
Here we are looking at Trove vs int[]; note that I have again ramped up the iterations, now that we’re dealing with more performant data structures. The big surprise is that Trove carries almost no overhead over primitive int arrays, especially when using the setQuick and getQuick methods. Allocating and building a 64,000,000 int array using Trove took 509ms; doing the same with an int array took 500.8ms. On the other hand, randomly retrieving 64,000,000 ints using Trove took 70.8ms, whereas doing the same with the int array took 69ms.
If that level of performance difference matters to you, then it makes sense to use primitives. For the vast majority of developers, on the other hand, Trove’s added convenience (automatic reallocation if you lose track of the array size) make it worthwhile. Here’s a code sample illustrating this:
final TIntArrayList arr = new TIntArrayList(10);
for (int i = 0; i < 10; i++) {
arr.setQuick(i, i);
}
// note that the first add will cause reallocation of the backing array
for (int i = 0; i < 10; i++) {
arr.add(i);
}
Results #4
Finally, I promised to show the effects of array allocation. If your array already exists and you can reuse it, how much time will that save?
@Test
public void testAllocationPenalty() {
final int size = 64000000;
final Mean time1 = new Mean();
final Mean time2 = new Mean();
final Mean time3 = new Mean();
final Mean time4 = new Mean();
final StopWatch s1 = new StopWatch();
for (int i = 0; i < 10; i++) {
s1.start();
TIntArrayList tarr1 = new TIntArrayList(size);
for (int j = 0; j < size; j++) {
tarr1.setQuick(j, j);
}
s1.stop();
time1.increment(s1.getTime());
s1.reset();
TIntArrayList tarr2 = new TIntArrayList(size);
s1.start();
for (int j = 0; j < size; j++) {
tarr1.setQuick(j, j);
}
s1.stop();
time2.increment(s1.getTime());
s1.reset();
s1.start();
int[] parr1 = new int[size];
for (int j = 0; j < size; j++) {
parr1[j] = j;
}
s1.stop();
time3.increment(s1.getTime());
s1.reset();
int[] parr2 = new int[size];
s1.start();
for (int j = 0; j < size; j++) {
parr2[j] = j;
}
s1.stop();
time4.increment(s1.getTime());
s1.reset();
}
System.out.println(time1.getResult());
System.out.println(time2.getResult());
System.out.println(time3.getResult());
System.out.println(time4.getResult());
}
Results:
- TIntArrayList allocation & build: 597.0ms
- TIntArrayList build only: 441.4ms
- int[] allocation & build: 601.8ms
- int[] build only: 303.2ms
So, clearly, if you want maximum performance when dealing with large numeric datasets, you should reuse existing primitive array structures. However, the overhead of keeping track of which indices belong to which generation obviously adds significant complexity to the code, and feasibly may end up costing as much time as you save.
I hope you found some useful insights in this writeup. If you’d like to use Trove, you can grab it from Maven repos using:
<dependency>
<groupId>net.sf.trove4j</groupId>
<artifactId>trove4j</artifactId>
<version>3.0.2</version>
</dependency>
And, if you’d like to play around with my test code, you can download it here.


