SF/JavaOne, Day 3, Java 1.6 Collections

I got to go see the BirdsOfAFeather talk with Josh Bloch concerning the new collection in Java1.5 (which you can read about: here) and the upcoming collections they have planned for the future (which you can read about: here). 
I've long been a fan of the java collections and i've found that
they've normally held a good balance between simplicity and
power.  It's actually a case of "Goldilocks and the Three Bears"
for me.  stl is too complex and painful to use, .Net is too
simplistic and limited in power, whereas java get's it just about
right.  It's not perfect, but it's flexible enough to handle
almost anything you'd ever need. I always found the deep use of
abstractions to be enormously helpful for writing your own special
collections while only writing 1/10th or evern 1/100th of the code
necessary to implement the full interface.  It's also not
cluttered with a gazillion interfaces like i've seen in other packages
which isn't especially helpful in a language like java which doesn't
have unions.

However, it's starting to seem like the new collections are starting to
buckle under their own weight in recent java releases.  With the java.util.concurrent
package it seems like the number of collections grew enormously with
new axes of concurrent access and blocking access being threaded
through (no pun intended) every type of collection and interface. 
In my experience, having a collection surrounded by a reader/writer lock
usually fit the bill just fine.  Of course, i only ever had to
handle a couple of hundred threads at a time, and it's quite possible
that to be able to scale to thousands better and deeper integration of
lock-free data structures are really needed.  I also question the
need for skip lists to be introduced into the core collection
implementation methods.  I think it's important to provide core
implementations that offer linear, logarythmic and
amortized-constant/constant time performance with the associated
benefits and drawbacks.  But i don't see why skip-lists
are needed in the core collections, but i suppose the fact that Bill
Pugh is on the JSR166 expert group might have something to do with it.

However, i shouldnt' knock it until i've tried it.  I'll have to
dig up my massively overthreaded web app that i built at one point and
see if these new collections can make that code much simpler, safer,
easier to understand, and possibly faster.  If it achieves any of
those benfits without any significant drawbacks, then i guess it will
have been worth it.

Comments

  • Anonymous
    June 28, 2005
    The comment has been removed

  • Anonymous
    June 28, 2005
    Skip lists are logarithmic too, but have better average complexity & cache locality than red-black trees, the traditional logarithmic collection base. That's what I've been able to gather.

    .NET could definitely be improved with better collections.

  • Anonymous
    July 01, 2005
    Have you checked PowerCollections [1]? How does it fit in the power-simplicity balance?
    http://wintellect.com/powercollections/

  • Anonymous
    July 01, 2005
    Brad: I have lots of problems with it. Primarily, it seems to be lacking a lot of consistancy in it's design, and i find consistancy to be highly valuable in an api, (as long as is doesn't have a major negative effect).

  • Anonymous
    July 01, 2005
    barkkel: "Skip lists are logarithmic too, but have better average complexity & cache locality than red-black trees, the traditional logarithmic collection base"

    True. But then, i would have just preferred that the internals of TreeSet and TreeMap get replaces with SkipLists. I don't see the need to actually introduce a whole new collection for this. Were there that many people saying "i need collections with the same algorithmic complexity as the current sorted collections, but i must have the constant factore reduced!"?

  • Anonymous
    July 03, 2005
    "True. But then, i would have just preferred that the internals of TreeSet and TreeMap get replaces with SkipLists. I don't see the need to actually introduce a whole new collection for this. Were there that many people saying "i need collections with the same algorithmic complexity as the current sorted collections, but i must have the constant factore reduced!"? "
    Ah, but that's not the trade-off being better. Rather, it's closer to the "QuickSort vs. Merge Sort" trade-off. Skip lists give you logarithmic access on average (just as QS gives nlogn sorting on average) but can give you linear access in the worst case (just as QS can give you n^2 sorting in the worst case).

    One may decide that it's worth the risk for the better constant factors--just as one may decide QS is worth the risk, again for its better constant factors--but one should not be forced to make the switch. It does indeed need a whole new collection.

  • Anonymous
    July 04, 2005
    DrPizza: I disagree. By this measure we should have AVL trees, Skip Lists, Red black trees, and every other collection prescribed in every data structure book out there. I think a balance needs to be struck so that you don't have a collections API with 1000 different collections in it, all subtly different.

    One of the balances that the java API's had previously was that they chose collections based on algorithmic complxity. So you could say "i want a list with fast adding and fast lookup, but poor insertion" or "i need somethign that's sorted, but i need better than linear lookup" or "i want constant lookup in the expected case" etc. etc. etc. Now, do we need an AVLTreeSet? After all, it would give you different performance characteristics from a TreeSet and one might want to weigh one against eachother. Personally, i would say no. Especially since i've never seen any bugs with any significant amount of votes at java.sun.com saying "i need a set that has logarithmic access in the normal case, but with a better constant factor than TreeSet, but i'm willing to have it be linear in the worst case". So they added a collection and cluttered an API when there did not seem to be any real need to.

    On the other hand, they could have done stuff like add collections where there are currently no existing collections that suffice. For example, a Graph interface would be very appreciated.

  • Anonymous
    July 11, 2005
    The comment has been removed

  • Anonymous
    July 13, 2005
    DrPizza: Sorry for the delay! You got marked as spam (go outlook!) and i only check that folder once in a while.

    You've made some very good points and i'll definitley have to mull over them for a while.

    Thanks!

  • Anonymous
    June 07, 2009
    PingBack from http://greenteafatburner.info/story.php?id=5619

  • Anonymous
    June 08, 2009
    PingBack from http://insomniacuresite.info/story.php?id=5171