I was reading this excellent post on Google Collections by Sune Simonsen and decided to refactor some code of mine. I have an immutable results object used in a multi-threaded application, and I wanted to feel the difference using the Google ImmutableList. My existing class provides immutability in much the same way as the Google classes: static methods instantiate the object, an array backs a list collection, and defensive copies protect the immutability of the object. Moving to Google Collections is easy and makes my implementation simpler whilst reducing test overhead.
Steps taken to refactor
There was actually very little to do.
Step 1, add a maven dependency
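<dependency>
    <groupId>com.google.collections</groupId>
    <artifactId>google-collections</artifactId>
    <version>1.0-rc3</version>
    <type>jar</type>
</dependency>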
Step 2, replace the array with an ImmutableList
private static final LineResult[] EMPTY_LINERESULT_ARRAY = new LineResult[0];
private LineResult[] lineResults = EMPTY_LINERESULT_ARRAY;
could become:
private List<LineResult> lineResults = ImmutableList.of();
In the end I used ImmutableList to clearly express my intent. The usage is internal to the class, so there is no advantage in using the List interface.
private ImmutableList<LineResult> lineResults = ImmutableList.of();
Step 3, the constructor and static factory methods are simplified
We are no longer using an array to back the list, so the protective copy can be moved from the constructor into the static method. This reduces the window of vulnerability you are exposed to before capturing the data held in the list. This is a direct result of the Google class providing a static factory method. N.B. It's a shame that the standard collections don't provide static factory methods.
Prior to the changes the protective copy was done in construction like this.
this.lineResults = lineResults.toArray(EMPTY_LINERESULT_ARRAY);
The result is a much simpler constructor.
private ResultsFileSearch(File fileFound, ImmutableList<LineResult> lineResults) {
    this(fileFound);
    this.lineResults = lineResults;
}
My newInstance method makes use of the ImmutableList.copyOf static factory method, resulting in a very clean conversion of a List into an ImmutableList. Note that the generic type is inferred.
public static ResultsFileSearch newInstance(File fileFound, List<LineResult> lineResults) {
    return new ResultsFileSearch(fileFound, ImmutableList.copyOf(lineResults));
}
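For readers without the google-collections jar to hand, here is a minimal sketch of the same copy-in-the-factory idea using only JDK classes. It is not the code from my class: CopyInFactoryDemo is a hypothetical stand-in for ResultsFileSearch, String stands in for LineResult, and Collections.unmodifiableList(new ArrayList<>(...)) is used as a rough substitute for ImmutableList.copyOf.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for ResultsFileSearch; String replaces LineResult.
final class CopyInFactoryDemo {
    private final List<String> lineResults;

    // The constructor trusts its argument: only the factory below calls it.
    private CopyInFactoryDemo(List<String> lineResults) {
        this.lineResults = lineResults;
    }

    // The protective copy is taken in the factory, before construction.
    static CopyInFactoryDemo newInstance(List<String> lineResults) {
        List<String> copy = Collections.unmodifiableList(new ArrayList<>(lineResults));
        return new CopyInFactoryDemo(copy);
    }

    // Safe to expose: the list is read-only and nobody else holds its backing copy.
    List<String> getLineResults() {
        return lineResults;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        source.add("line 1");
        CopyInFactoryDemo results = newInstance(source);

        source.add("line 2"); // the caller mutates its own list afterwards
        System.out.println(results.getLineResults().size()); // prints 1
    }
}
```

Because the constructor is private, every instance passes through the factory, so the copy cannot be skipped by accident.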
Step 4, the getLineResults method no longer has to deal with the array or conversion.
The original array backed object had to deal with returning either an empty list or conversion to an unmodifiable list. There are two things I did not like about this when I wrote the original.
- The code has to check the array resulting in some conditional logic. This increases the test overhead slightly.
- The use of Collections.unmodifiableList(Arrays.asList(lineResults)); is very ugly and wraps the array twice on every call.
This was the original; note the subtle bug when returning an empty list.
public List<LineResult> getLineResults() {
    if ((this.lineResults.length == 0)) {
        return Collections.emptyList();
    } else {
        return Collections.unmodifiableList(Arrays.asList(lineResults));
    }
}
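One detail worth spelling out about the array-backed approach: Arrays.asList does not copy, so the list handed out is a read-only view over the same array. The sketch below illustrates that with JDK classes only. ArrayViewDemo and asReadOnlyView are hypothetical names, and String stands in for LineResult.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ArrayViewDemo {
    // Wraps the array twice without copying it, in the style of the original getter.
    static List<String> asReadOnlyView(String[] backing) {
        return Collections.unmodifiableList(Arrays.asList(backing));
    }

    public static void main(String[] args) {
        String[] backing = {"first", "second"};
        List<String> view = asReadOnlyView(backing);

        backing[0] = "changed";          // write to the array, not to the list
        System.out.println(view.get(0)); // prints "changed": the view shares the array
    }
}
```

This is why the array version needs a defensive copy on the way in: the view is only as immutable as the array behind it.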
I hope you agree the version using the Google code is much simpler. As we are now using the immutable implementation, we don't have to worry about exposing the reference.
public List<LineResult> getLineResults() {
    return this.lineResults;
}
Conclusions
Using Google Collections makes my implementation much cleaner. Several complexities are handed over to the Google Collections classes, and testing my class is simpler. None of the method signatures change, which has two consequences. On the plus side, the refactor is contained within this class. On the negative side, any code modifying the returned list will still raise an UnsupportedOperationException, which is no different from my original version.
Construction of my class might be slightly slower and the memory overhead slightly bigger now that we are not using the array. Counter to this, the getLineResults method is quicker as it now only returns a reference. In my program this matters more, as architecturally it uses bounded queues to hold these objects for processing by threads. So although it's using a little extra memory, that memory is constrained. The performance gain is worth it.
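To illustrate the UnsupportedOperationException point from the caller's side, here is a small sketch using JDK classes only. CallerSideDemo and sortedCopy are hypothetical, and the getLineResults here is a stand-in that returns strings rather than LineResult objects. It assumes a caller that wants to reorder the results and therefore takes its own mutable copy first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class CallerSideDemo {
    // Stand-in for getLineResults(): hands back a read-only list.
    static List<String> getLineResults() {
        return Collections.unmodifiableList(Arrays.asList("b", "a"));
    }

    // A caller that needs to sort takes its own mutable copy first.
    static List<String> sortedCopy(List<String> readOnly) {
        List<String> copy = new ArrayList<>(readOnly);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> results = getLineResults();
        try {
            results.add("c"); // modifying the returned list is refused
        } catch (UnsupportedOperationException e) {
            System.out.println("results are read-only");
        }
        System.out.println(sortedCopy(results)); // prints [a, b]
    }
}
```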
I recommend using the com.google.common.collect package, but adhere to the caveats, i.e. don't expose the types in an API, etc.