Do we want a java.util.Pair ?


An interesting discussion popped up this week about whether it would be useful to add a java.util.Pair in JDK 7. This has been requested many times in the past and I know I’ve heard some talk about it on the Java Posse as well. Pair is of course a special case for two elements and suggests possibly a Triple or Tuple as well, although Pair seems vastly more common than either of those.

I believe by far the most common use case for a Pair is to return multiple values from a single function. Dick Wall made a case for Pair in his Funky Java talk at Devoxx. I certainly found a number of Pair implementors and fans on Twitter. Joe Darcy seemed to think the large number of Pairs in the wild were an indication that having a single version in the JDK might be better than many different implementations and he submitted a strawman version.

It was rightly pointed out that both mutable (AbstractMap.SimpleEntry<K,V>) and immutable (AbstractMap.SimpleImmutableEntry<K,V>) implementations already exist in java.util, although they are a bit buried as nested classes of AbstractMap and are probably even less usefully named than Pair.
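As a rough sketch of what using those existing classes looks like, the snippet below returns two values from one method via AbstractMap.SimpleImmutableEntry. The class name EntryAsPairDemo and the method minMax are made up for illustration; only the java.util types are real.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class EntryAsPairDemo {

    // Hypothetical helper: returns the smallest and largest element as an
    // immutable "pair". getKey() holds the minimum, getValue() the maximum.
    static Map.Entry<Integer, Integer> minMax(List<Integer> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        int min = values.get(0);
        int max = values.get(0);
        for (int v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new AbstractMap.SimpleImmutableEntry<Integer, Integer>(min, max);
    }

    public static void main(String[] args) {
        Map.Entry<Integer, Integer> result = minMax(Arrays.asList(3, 9, 1, 7));
        // prints "1 .. 9"
        System.out.println(result.getKey() + " .. " + result.getValue());
    }
}
```

Note that the caller ends up with getKey() and getValue(), names that describe a map entry rather than a minimum and a maximum, which is arguably even less helpful than getFirst() and getSecond().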

I believe the major dissent focuses on how Pair is a poor class name and has poor method names. “Pair” does not name the thing you are returning in any useful way other than telling you it’s “two things”. The methods of Pair are usually also generic as in “getA” or “getFirst”. If there is a real meaning behind putting those values together, then there is probably also a better conceptual name to use instead. If the two values really aren’t related in any useful way, then the code is probably poorly factored to start with due to a bad separation of concerns.
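To make the naming objection concrete, here is a small sketch that puts a generic pair next to a purpose-named class holding the same two values. Everything in it is hypothetical: Pair is a minimal hand-rolled version (not any proposed JDK class), and ScoredItem, NamingDemo and the "score is the string length" rule exist only for illustration.

```java
// Hypothetical minimal Pair, in the spirit of the many hand-rolled versions.
final class Pair<A, B> {
    private final A first;
    private final B second;

    Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    A getFirst() { return first; }
    B getSecond() { return second; }
}

// Hypothetical named alternative: the same two fields, but the names carry meaning.
final class ScoredItem<T> {
    private final T item;
    private final double score;

    ScoredItem(T item, double score) {
        this.item = item;
        this.score = score;
    }

    T getItem() { return item; }
    double getScore() { return score; }
}

class NamingDemo {
    // Illustrative scoring rule only: the score is the length of the string.
    static Pair<String, Double> scoreAsPair(String s) {
        return new Pair<String, Double>(s, (double) s.length());
    }

    static ScoredItem<String> score(String s) {
        return new ScoredItem<String>(s, s.length());
    }

    public static void main(String[] args) {
        // With Pair, the reader has to remember which position means what.
        Pair<String, Double> p = scoreAsPair("pair");
        System.out.println(p.getFirst() + " -> " + p.getSecond());

        // With the named class, the call site documents itself.
        ScoredItem<String> item = score("pair");
        System.out.println(item.getItem() + " -> " + item.getScore());
    }
}
```

The two classes are nearly identical in size, so the trade-off is one extra type per concept against call sites that explain themselves.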

Bob Lee made a strong plea against Pair (with support from Mark Reinhold) as it would encourage the proliferation of bad code. Kevin Bourrillion and Josh Bloch were also arguing against Pair for JDK 7 but thought that a more general solution of n-arg value types might be a useful thing to attack in a future JDK.

I must admit that I have created and used a Pair at least once in the past, but it made me feel dirty and I eventually refactored it away. Interestingly, I’ve seen the same issue even in Clojure. When I’ve written code that returns a seq that I must pick apart with first, second, and so on, it feels equally dirty. Working with collections of homogeneous elements is great, but as soon as positional behavior matters, I feel uncomfortable. In Clojure it can still be OK, but only if the scope is confined to a single function, and even then I prefer to use destructuring rather than first or second.

In the end, I guess my own personal view is that the case for Pair is weak – I certainly don’t think it’s necessary.

Comments

28 Responses to “Do we want a java.util.Pair ?”
  1. Howard Lovatt says:

    Hi,

    I suggested a general Tuple library and included a sample library that I wrote to the Google Collections people:

    http://code.google.com/p/google-collections/issues/detail?id=43

    They weren’t interested :(

    If you are interested then the code is still there if you want to download it. It does more than a simple tuple library would in that the tuples are also lists and they can be processed recursively. The classes aren’t final so you can also use them as a base class for your own multiple return values and that way add meaningful names to the elements.

    — Howard.

  2. Alex Cruise says:

    Without pattern matching, tuples are messy. With pattern matching, they’re natural.

    for ((k,v) <- aMap) // scala.collection.Map[K,V] extends Iterable[Tuple2[K,V]]; the LHS of a for comprehension is a pattern match

    val (one, two) = methodThatReturnsTwoValues

  3. I have to agree with you. I’ve used the Pair a couple of times, but it usually does point to a code smell. I don’t think there is a strong enough case to include it in the Java APIs.

    Dushyanth

  4. I personally think that including this into the core of Java simply encourages engineers/developers to become code monkeys. Using java.util.Pair shows there was little thought done on behalf of the developer. Most developers are bad at designing APIs and anything that contributes to this is going to do more harm than good.

    On that note, Joshua Bloch – How to Design a Good API and why it Matters

    Best Regards,
    Richard L. Burton III

  5. Toby Jungen says:

    While I agree that Pair can lead to some really ugly signatures (as Kevin Bourrillion pointed out), it is a very useful construct I’ve ended up using many times. Specifically, I think the case where you need a function to be able to return more than one value is immensely prevalent, and I think code that falsely uses exceptions to “return” multiple values is many times worse than code that would just use Pair.

    Additionally, while the argument against Pair in favor of creating your own types is valid, I think there is a strong counterargument that creating “throw-away” types that are used in only one instance are bad because they pollute the namespace and create code bloat. You only need to look as far as any number of SOAP frameworks before you start seeing an enormous pile of different types that ultimately serve the same function.

    Having very deep nested generic types is definitely bad, but needing to dereference 10 levels deep (performing null checks along the way) to obtain the result of a function call isn’t much better.

  6. Toby Jungen says:

    (Pushed submit too soon)

    I should add however, that I agree with Kevin and Josh’s comments that Pair is only a stopgap solution, and that the deeper problem (creating value types in Java is messy and extremely verbose) should be addressed. The Bean pattern is awful and in my opinion its proliferation is one of Java’s biggest flaws.

  7. Phil says:

    Surely it should be called ‘cons’ ? :-)

  8. ahmet says:

    i was tempted to use pair two times recently. But at the end, i created two classes with better identities, such as “ScoredItem” instead of Pair and “IndexedSentence” instead of Pair . i think over all code feels better now.

  9. ahmet says:

    sigh. my comment is garbled.
    ….
    such as “ScoredItem” instead of Pair[T,Double] and “IndexedSentence” instead of Pair[Integer,Sentence]
    ….

  10. Cedric says:

    Have you considered the cost of that extra class you write instead of using a generic Pair class?

    Every single class loaded by the classloader consumes memory which is never garbage collected.
    According to tests I ran about two years ago, a very small class consumes between 1.7 and 2KB.
    See http://blog.dandoy.org/2007/01/20/class-size/

  11. Mario Aquino says:

    I read Kevin Bourrillion’s examples of confusing uses of a Pair facility and I am still unconvinced that adding it is a bad idea. From Bourrillion example, it looks like the developers just created it themselves and went forward with designing bad API signatures. Should the JDK protect us from ourselves? Bollocks! It should provide powerful tools that we use as we see fit, to our benefit or our peril.

  12. Andrei says:

    Pairs are evil. I avoid them in C++ since it is unclear what’s what.

    “First” and “Second” is just BAD code.

  13. Bryan Headley says:

    Not in the API, but in the language itself. In other words, have the ability to create functions that return more than one value,

    public (float, int) functionThatReturnsTwoValues() {

    return (1.0, 5);
    }
    ….
    float a;
    int i;
    (a, i) = functionThatReturnsTwoValues();

    You otherwise deal with Tuple.getFirst(), and find it’s easy to forget what the first element in the Tuple signifies.

  14. phil swenson says:

    I love how Ruby does it:

    def blah; return 1, 2; end
    a, b = blah

    Groovy is similar
    def blah() { return [1, 2] }
    def (a, b) = blah()

    There are many cases where you do an expensive lookup and have access to two things at once that are needed. So w/o a pair concept you do one of two things: create 2 methods that both do the expensive lookup or create a one-off class to return them in a data structure.

    Both of these solutions result in more code and more complexity than is needed.

  15. dm3 says:

    Tuples and N-Tuples are ubiquitous in functional languages which support lightweight type aliases and pattern matching (as someone already noted). Without this machinery they should be rarely used as there is no way to tag or know what the given tuple means in the context it is used.
    Consider
    Pair f = getFrequencyOfWord(“some sentence”, “some”)
    if (f.first() == 1 && f.second().equals(“some”)) {
    Success!
    } else {
    Failure!
    }
    and
    type Frequency = Pair
    Frequency f = getFrequencyOfWord(“some sentence”, “some”)
    f match {
    case (1, “some”) => Success!
    case _ => Failure!
    }

    For a functional implementation of N-tuples (pair is fj.P2) and other functional stuff in java see http://code.google.com/p/functionaljava/

  16. dm3 says:

    2-Tuples (pairs) and N-Tuples are ubiquitous in functional languages which
    support lightweight type aliases and pattern matching. Without this machinery
    they should be rarely used as there is no way to tag or know what the given
    tuple means in the context it is used in.
    Consider:

    Pair f = getFrequencyOfWord(“some sentence”, “some”)
    if (f.first() == 1 && f.second().equals(“some”)) {
    Success!
    } else {
    Failure!
    }

    and

    type Frequency = Pair
    Frequency f = getFrequencyOfWord(“some sentence”, “some”)
    f match {
    case (1, “some”) => Success!
    case _ => Failure!
    }

    Examples seem to be considerably equal, but imagine that the { getFrequencyOfWord }
    is called throughout the whole codebase, tens and hundreds of times.
    Do you really want to see it as a { Pair } or rather as a { Frequency } ?

    For an implementation of N-tuples (2-tuple or pair is fj.P2) and other functional stuff in Java, see functionaljava: http://code.google.com/p/functionaljava/

  17. dm3 says:

    Angle brackets.
    Please read Pair as Pair[Int, String].

  18. mrOhad says:

    How about supporting multi-return type?
    i.e. a method can return more than one object..

    what are the disadvantages?

  19. Tony Morris says:

    You’re going to hit up against the strong aversion to abstraction that is rife among Java users. I recommend taking a look at Arrows (John Hughes); a model of computation that involves pairs. Also, note that the Pair type constructor denotes conjunction under the Curry-Howard Correspondence. Equally important is Either (disjunction), but now you’re really getting on the goat of the abstraction detractors.

    Java is a language designed specifically for endless repetition of solving not-even-problems. Perhaps it’s best to leave it that way.

    See Functional Java for an effort to help anyway.

  20. The solution is Scala’s case classes (which is what Kevin Bourrillion and Josh Bloch are getting to). Enough said.

  21. Pair ..hmm… we can still live without. But headless Tuples (John Rose) is something worth waiting for – http://blogs.sun.com/jrose/entry/tuples_in_the_vm

  22. I experimented with 2-tuples in Scala to have multiple return values in public methods of some API I created, and at the time also had my doubts as to whether this was a good choice. I wrote about it here:

    http://janvanbesien.blogspot.com/2010/01/scala-library-for-ipv4-related-concepts.html

    Now that I read this blog and some comments, I am convinced that I should have gone for Scala case classes instead.

    In other words, I also believe adding a Pair to Java is probably not the best solution.

  23. John R. Williams says:

    I’m seeing a lot of comments to the effect that returning pairs is bad because the field names of a pair object don’t indicate what the values represent. This is usually accompanied by an exhortation to refactor the code to avoid returning multiple values, or to define a new class to package up the return values with meaningful field names.

    I’m left wondering how returning a tuple is any worse than passing multiple arguments by position. The parameters have names where they are defined, but at the call site, the meaning of the arguments is just as mysterious as the members of a tuple. I have yet to see anyone claim that code should be refactored to avoid methods with multiple arguments. I also haven’t seen anyone recommend bundling method parameters into an object (except when the number of parameters is very large).

    Is there any reason why return values need to be more self-descriptive than arguments are?

  24. Keith says:

    Why can’t we have methods that actually return two or more objects?

    It’s not like they are locked down to always return one as we can return zero with the use of “void”.

  25. JB says:

    Why can’t we have methods that actually return two or more objects?

    I think this is mainly a historical inflexible implementation of return values in some languages (compilers?)… Also, telling people that they are not clever enough to count above one and that return values are named on the caller, which by the way can use a better name …
    Tuple return values should be a base feature of any language (as Williams notes, why do we have no problem with tuple (named) arguments?)

  26. Kricket says:

    Can anybody give a reason why Pair is worse than, say, String?

    We don’t know what Pair.first and Pair.second represent…BUT we also don’t know what str.charAt(3) represents, either. And yet, charAt is part of the standard String API.

    The argument that we don’t know what the Pair is supposed to represent, applies to any generic class – we don’t know if an int represents money, number of students, etc. Yet every day people write stuff like “int numberOfStudents” without problem…

  27. Tinlyx says:

    Can’t agree more with Williams and JB.

    Tuples have well-defined meanings/semantics. That is, you know in advance the number of components that need to be stored in the structure.
    For example, if one wants to parse a string into “a current token” and “the rest”, the only thing
    of interest is that the result is a pair.
    It’s silly to require that a concrete type has to be specified for the “token” type or the “rest” type.

    I can’t really see a meaningful way to refactor this. Of course, there is the brute-force way of writing a custom type for each possible combination of the “token” type (e.g. Int, Char, String etc.) and the “rest” type (e.g. an Array, String, Heap etc.). But this is prehistoric.

  28. elg says:

    If Java supported an immutable array then I’d agree with some of the objections to the Pair class and other tuples. There are many times when small, fixed size sets of things need to be passed around. Writing a new class to support every one is often just egregious wheel reinvention.