Why YOU should use Integer.valueOf(int)

In particular, why you should use Integer.valueOf(int) instead of new Integer(int): CACHING.

This variant of valueOf was added in JDK 5 to Byte, Short, Integer, and Long (it already existed in the trivial case in Boolean since JDK 1.4). All of these are, of course, immutable objects in Java. It used to be that if you needed an Integer object from an int, you’d construct a new Integer. But in JDK 5+, you should really use valueOf because Integer now caches the Integer objects for values between -128 and 127 and can hand you back the exact same Integer(0) object every time instead of wasting an object construction on a brand new identical Integer object.

    private static class IntegerCache {
        private IntegerCache() {}

        static final Integer cache[] = new Integer[-(-128) + 127 + 1];

        static {
            for (int i = 0; i < cache.length; i++)
                cache[i] = new Integer(i - 128);
        }
    }

    public static Integer valueOf(int i) {
        final int offset = 128;
        if (i >= -128 && i <= 127) { // must cache
            return IntegerCache.cache[i + offset];
        }
        return new Integer(i);
    }
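
To see the effect from the caller's side, here is a minimal sketch that compares object identity for the two approaches. The class and method names (ValueOfIdentityDemo, sameFromValueOf, sameFromConstructor) are made up for illustration. The result for values outside -128..127 depends on the JVM and its settings, so the sketch makes no claim about it.

```java
// Minimal sketch: compares identity of Integers from valueOf vs. the constructor.
// All names here are hypothetical and exist only for this illustration.
final class ValueOfIdentityDemo {

    /** True if two valueOf calls for the same int return the same object. */
    static boolean sameFromValueOf(int i) {
        return Integer.valueOf(i) == Integer.valueOf(i);
    }

    /** True if two constructor calls return the same object (each call allocates). */
    @SuppressWarnings({"deprecation", "removal"}) // the constructor is deprecated in newer JDKs
    static boolean sameFromConstructor(int i) {
        return new Integer(i) == new Integer(i);
    }

    public static void main(String[] args) {
        System.out.println("valueOf(0):     " + sameFromValueOf(0));
        System.out.println("valueOf(127):   " + sameFromValueOf(127));
        System.out.println("new Integer(0): " + sameFromConstructor(0));
        // Outside -128..127 the answer is JVM-dependent; do not rely on it.
        System.out.println("valueOf(1000):  " + sameFromValueOf(1000));
    }
}
```

For values in the cached range, the valueOf comparisons are true and the constructor comparison is false, since each constructor call allocates a fresh object.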

A side note is that the cache lives in a static nested class, which is not initialized until the first time it is needed, so if valueOf() is never called with a value in the cached range, the cache is never created. Conversely, the first time it is called that way, it's going to suck (as it really will create 256 objects). Also of interest is that because of the synchronization guarantees the JVM makes during class initialization, there is NO need for explicit synchronization when creating the cache! This is the same basic trick mentioned in Bob Lee's recent post on lazy-loading singletons. Hot stuff.
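
For reference, the holder trick applied to a lazy singleton looks roughly like the sketch below. LazyRegistry, Holder, and INIT_COUNT are hypothetical names, and the counter exists only so the laziness can be observed. It is not taken from the JDK or from the post mentioned above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the lazy "holder class" idiom; all names are hypothetical.
final class LazyRegistry {
    // Counts constructor runs so the lazy behavior is visible (illustration only).
    static final AtomicInteger INIT_COUNT = new AtomicInteger();

    private LazyRegistry() {
        INIT_COUNT.incrementAndGet();
    }

    // The JVM initializes Holder only when it is first used, and class
    // initialization is thread-safe, so no explicit locking is needed.
    private static final class Holder {
        static final LazyRegistry INSTANCE = new LazyRegistry();
    }

    static LazyRegistry getInstance() {
        return Holder.INSTANCE; // first call triggers Holder initialization
    }

    public static void main(String[] args) {
        System.out.println("before first use: " + INIT_COUNT.get());
        LazyRegistry a = getInstance();
        LazyRegistry b = getInstance();
        System.out.println("same instance: " + (a == b));
        System.out.println("after use: " + INIT_COUNT.get());
    }
}
```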

Seems like a textbook case for why you'd want to hide object construction behind a factory method - it allows you to later on decide to add a cache for commonly constructed immutable objects.
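
As a sketch of that idea outside the JDK, consider a small immutable value class that only exposes a factory method. Percent, of, and the 0..100 cache range are all invented for this example. The point is that the cache could be added, widened, or removed later without touching any caller.

```java
// Hypothetical immutable value class; construction is hidden behind of().
final class Percent {
    // Pre-built instances for the commonly requested values 0..100 (an assumption).
    private static final Percent[] CACHE = new Percent[101];

    static {
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new Percent(i);
        }
    }

    private final int value;

    private Percent(int value) {
        this.value = value;
    }

    /** Factory method: returns a shared instance when one exists. */
    static Percent of(int value) {
        if (value >= 0 && value < CACHE.length) {
            return CACHE[value];
        }
        return new Percent(value); // uncommon values are allocated on demand
    }

    int value() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Percent && ((Percent) o).value == value;
    }

    @Override
    public int hashCode() {
        return value;
    }
}
```

Callers write Percent.of(50) everywhere and should compare with equals(), so nothing breaks if the caching policy changes.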

I have not read this anywhere but I wonder if the addition of autoboxing in JDK 5 (which automatically creates Integer objects from ints in some cases) prompted the change as these primitive objects were being created much more frequently.

Actually, if you want to see a whole bunch of really complicated optimized code, take a look at the Integer.java source. Interesting stuff.

Update 1: I intended to mention that the awesome and wonderful FindBugs tool will find uses of the Integer constructor in your code for you. I'm not sure whether other analysis tools like PMD and Checkstyle will do so but it certainly seems that they could.

Update 2: I also wanted to mention that you can look, in comparison, at the GNU Classpath implementation:

private static final int MIN_CACHE = -128;
private static final int MAX_CACHE = 127;
private static Integer[] intCache = new Integer[MAX_CACHE - MIN_CACHE + 1];

public static Integer valueOf(int val)
{
    if (val < MIN_CACHE || val > MAX_CACHE)
        return new Integer(val);
    synchronized (intCache)
    {
        if (intCache[val - MIN_CACHE] == null)
            intCache[val - MIN_CACHE] = new Integer(val);
        return intCache[val - MIN_CACHE];
    }
}

And here's the Apache Harmony implementation:

private static final Integer[] CACHE = new Integer[256];

public static Integer valueOf(int i) {
    if (i < -128 || i > 127) {
        return new Integer(i);
    }
    synchronized (CACHE) {
        int idx = 128 + i; // 128 matches a cache size of 256
        Integer result = CACHE[idx];
        return (result == null ? CACHE[idx] = new Integer(i) : result);
    }
}

You'll note in both cases that the static class is not used and thus synchronization is required, making these implementations possibly worse in multi-threaded environments (although likely not in a way that you'd notice). You'll also notice however that gcj and Harmony cache on demand instead of pre-populating the cache as the Sun JDK version is forced to when using the static class trick to avoid the synchronization.

I also find the use of magic numbers interesting across these examples. Here we've got several magic numbers (-128, 127, and 256). Conventional wisdom (from something like Code Complete) is that magic numbers in the code should be lifted into constants and named for clarity. Also, of these three constants, any one (take your pick) can be derived from the others, so could be defined as a static calculation. The Sun and Harmony versions eschew any pretense of doing this and simply use them all as literals, which is at least consistent. In Sun's case it's actually a little better as the related code blocks using the constants are next to each other whereas in Harmony's the static definition is at the top of the class, far from the usage of the cache. gcj did replace all magic numbers with constants and calculated the cache size (256) at the point of need. Seems like the val - MIN_CACHE calculation for the offset could have been pulled out into a local variable though, if not for performance (as the compiler would presumably optimize this), at least for maintenance (less copied code to change).
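
Putting those suggestions together might look like the sketch below: two named bounds, the size derived from them, and the offset computed once into a local variable. This is a variation on the on-demand style shown above, not code from any of the three projects. SmallIntCache, cacheSize, and boxUncached are hypothetical names.

```java
// Sketch only: on-demand cache with named constants and a single offset calculation.
final class SmallIntCache {
    private static final int MIN_CACHE = -128;
    private static final int MAX_CACHE = 127;
    // Derived from the two bounds instead of hard-coding 256.
    private static final int CACHE_SIZE = MAX_CACHE - MIN_CACHE + 1;
    private static final Integer[] CACHE = new Integer[CACHE_SIZE];

    private SmallIntCache() {}

    static int cacheSize() {
        return CACHE_SIZE;
    }

    static Integer valueOf(int val) {
        if (val < MIN_CACHE || val > MAX_CACHE) {
            return boxUncached(val);
        }
        int index = val - MIN_CACHE; // computed once, reused below
        synchronized (CACHE) {
            Integer cached = CACHE[index];
            if (cached == null) {
                cached = boxUncached(val);
                CACHE[index] = cached;
            }
            return cached;
        }
    }

    @SuppressWarnings({"deprecation", "removal"}) // constructor used on purpose to bypass the JDK cache
    private static Integer boxUncached(int val) {
        return new Integer(val);
    }
}
```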

Update 3: To satisfy Mr. John Smith in the comments, I ran a small performance test. It prints the time (in nanoseconds) to get the first Integer, the second Integer, and the total and average while creating 1000000000 Integers using either new Integer() or Integer.valueOf().

Here are the results:

> java PerfTest n
new Integer
first = 34921 ns
second = 2514 ns
all = 15778328252 ns
avg = 15 ns

> java PerfTest v
valueof
first = 729981 ns
second = 2514 ns
all = 7729155801 ns
avg = 7 ns

So, as expected, valueOf sucks on the first call. The second call is a tie, which seems odd. But really, this is most likely just a fluke of the resolution of the clock vs how fast it is to construct a single object. It seems exceedingly odd that the numbers are the same; I'd guess you're most likely seeing the smallest "tick" of the nanos clock, not a real value (especially in light of the final averages). On the totals and averages we see the long-term story: valueOf takes an average of 7 ns vs 15 ns for constructing a new object, so about half the time. If I were less lazy, I'd write another program to calculate the break-even point but it would probably be way too biased by my environment. Here's the code if you want to use it: PerfTest.java.
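
For readers without the linked file, a test of this kind might look roughly like the following sketch. It is not the original PerfTest.java: the class name BoxingTimer, the time method, the iteration count, and the handling of the n/v argument are all assumptions. Like any micro-benchmark, its numbers can be distorted by JIT warm-up and optimizations.

```java
// Hypothetical micro-benchmark sketch; not the PerfTest.java used for the numbers above.
final class BoxingTimer {

    /** Boxes the value 0 `iterations` times and returns the elapsed nanoseconds. */
    @SuppressWarnings({"deprecation", "removal"}) // the constructor is deprecated in newer JDKs
    static long time(boolean useValueOf, long iterations) {
        long sink = 0; // consumed below so the loop body is not trivially dead code
        long start = System.nanoTime();
        for (long n = 0; n < iterations; n++) {
            Integer boxed = useValueOf ? Integer.valueOf(0) : new Integer(0);
            sink += boxed.intValue();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != 0) {
            throw new IllegalStateException("unexpected sum: " + sink);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // "v" selects valueOf; anything else selects the constructor (assumed convention).
        boolean useValueOf = args.length > 0 && args[0].equals("v");
        long iterations = 10_000_000L;
        long total = time(useValueOf, iterations);
        System.out.println(useValueOf ? "valueOf" : "new Integer");
        System.out.println("all = " + total + " ns");
        System.out.println("avg = " + (total / iterations) + " ns");
    }
}
```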

Update 4: If you're interested in more of the history, check out section 5.1.7 of the Java Language Specification, which covers boxing conversions and describes the rule that boxing (which calls valueOf()) must return the identical instance for integers in this range. Ideally, all boxing conversions of a given value would return identical instances, but that's not practical, so this is part of a compromise.
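
One practical consequence of that rule is that == on boxed values only behaves predictably inside the guaranteed range, so equals() is the safe comparison everywhere else. The sketch below illustrates this. BoxingIdentityDemo and its method names are hypothetical.

```java
// Sketch: reference comparison vs. value comparison on autoboxed ints.
final class BoxingIdentityDemo {

    /** Boxes the same int twice and compares the references. */
    static boolean boxedSame(int value) {
        Integer a = value; // boxing conversion
        Integer b = value; // boxing conversion
        return a == b;
    }

    /** Boxes the same int twice and compares by value. */
    static boolean boxedEqual(int value) {
        Integer a = value;
        Integer b = value;
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Inside -128..127 both boxes must be the same instance.
        System.out.println("100 ==        : " + boxedSame(100));
        // equals() compares values, so it works for any int.
        System.out.println("100000 equals : " + boxedEqual(100000));
        // Outside the range the == result is unspecified; do not rely on it.
        System.out.println("100000 ==     : " + boxedSame(100000));
    }
}
```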

Update 5: Joe Darcy, the author of the Sun code in question, posted a response to this blog, which is worth the read as to the alternatives that were considered and performance benchmarking.

Comments

24 Responses to “Why YOU should use Integer.valueOf(int)”
  1. john smith says:

    Sorry, but you are wrong. Did you test this before you posted?

    I know this will be formatted poorly, but:

    package test;

    import java.util.Random;

    public class ValueOfTest {
    public static void main(String[] args) {
    Random r = new Random();

    long start, end;

    start = System.currentTimeMillis();

    int size = 3000000;
    Integer i[] = new Integer[size];
    for (int x = 0; x

  2. john smith says:

    Does this even work?

  3. Alex says:

    Sorry, John, looks like you might have hit a comment max length limit, not sure. The comment was delayed as it was picked up by my spam filter for some reason. Are you saying that the post is wrong in terms of a performance improvement? I suspect detecting that probably depends on the JVM, hardware, app, hotspot, GC params, and everything else. I assume that the biggest gain is not in avoiding the object construction (which is very fast) but really in avoiding the memory bloat and GC load from creating many objects. I also assume that detecting this difference is difficult.

    But from looking at the actual source code, the description in this post is factually correct. And the only possible reason I can imagine for doing it is performance. The IntegerCache class is used nowhere else within Integer. Of course, “performance” is a slippery thing and here it may really not be a question of speed but of memory conservation. Your sample size looks awfully small to measure anything useful (given that you’re talking about things that execute in nanoseconds and that hotspot can take minutes to warm up).

  4. Tom Copeland says:

    Yup, PMD has an IntegerInstantiation rule in the migrating ruleset.

  5. john smith says:

    My test showed that 3 million new Integer(x)’s were FASTER than 3 million Integer.valueOf() calls. It was only 50 milliseconds faster, but it was faster. So basically for 99.99% of all apps out there, there is no difference at all between the two methods of converting an int to Integer.

    Prove me wrong.

  6. Alex says:

    You’re a tenacious man, John Smith. Check out Update 3 up above for a performance test.

  7. Peter Lawrey says:

    Why you shouldn’t use Integer.valueOf(int).
    Because that is exactly what auto boxing does

    int n;
    Integer n2 = n; // Calls Integer.valueOf(int) for you.

    The point is don’t use new Boolean(boolean), new Short(short), new Integer(int), new Character(char) and new Long(long) as all these have preallocated objects.

    Additionally don’t use new String(String) unless you know what you’re doing.

  8. Peter Lawrey says:

    I forgot to include new Byte(byte) in the list of don’ts.

  9. john smith says:

    The point I am making is that it doesn’t matter at all which you do in 99.99% of apps. Your blog entry says that you must use valueOf or you’ll have poor performing apps. If you use new Integer() everywhere, no one will ever notice the difference in speed in most apps. It makes absolutely no difference.

  10. john smith says:

    I just looked at your test and it is extremely flawed. If all your app did is get an Integer for the value of 0 then you might save a few nano-seconds by using valueOf(). If it is a real app and you are using all integer values, new Integer() is faster by milli-seconds.

    Again, we are talking nano and milli-seconds.

  11. Alex says:

    Of course it’s extremely flawed! It’s a micro-benchmark! Micro-benchmarks are inherently flawed.

    I’m using 0 purely because that illustrates the benefit of using the cache. Because all of the values between -128 and 127 are cached, the performance is identical for any value in that range (which is probably common).

    If you read my blog again, I think you’ll see that I make absolutely no claim that your app will perform poorly if you don’t use valueOf. That’s a wild exaggeration. The point is that using a cached value *is* faster (after the first call) and more space-efficient (after the 256th call) and most programs that call it at all will likely cross both thresholds. So, given a choice between two options, one that is faster and more space efficient, I’ll pick the faster one as my default behavior.

    I think it’s interesting to understand the details of how this works because the range of the cache indicates that cases where your int values are outside the cache range are going to incur a couple unnecessary range checks. So if you know you will be using Integers to represent values predominantly outside this range, it would be faster to construct new Integers. My point about the factory method though is that going through valueOf() allows the JDK to further optimize this in the future (a wider cache, native implementation, etc) without you changing your code.

  12. Jonathan Allen says:

    That is wrong in so many ways.

    1. It only works on a very, very small subset of integers. Most of the time it is a needless check followed by the same allocation you would be doing anyway.
    2. Allocations are cheap in a GC system and Integers are small, so it is solving the wrong problem.
    3. Since the Integer objects tend to be allocated randomly and live forever, you lose locality. Looping through numbers 1 to 10 could literally mean ten page faults.
    4. It can kill performance on a multi-threaded application. If you have two or more threads that need to create Integers, they will be constantly blocking one-another. (I wonder if multi-core would make it even worse.)

    The bench marks are not going to show this because, unlike most real applications, it doesn’t expose the memory locality or the multi-threading issues.

  13. Anonymous Coward says:

    The first call to valueof under the section “Update 3” did not take over one second. You measured it at 1742679 ns, which is approximately 1.74 thousandths of a second.

    The first call doesn’t suck at all — it just illustrates how blazingly fast object creation has become.

  14. Alex says:

    Hey Anonymous Coward, you’re absolutely correct (it was late and math is hard). Fixed in the text. I’d still say >1 ms (especially in comparison to the later averages of a few ns) but point taken.

  15. Brian says:

    What happens now if one uses an Integer created with a ‘valueOf’ as a key into a WeakHashMap? It seems that it will now be a strong reference that will never go away.

  16. Alex says:

    I think that’s correct. Both WeakReferences and SoftReferences are cleared only when no strong references exist to the object and a strong reference will always exist via the IntegerCache. So, if you wanted to use an Integer as a key in a WeakHashMap, you’d need to use new Integer().

  17. Tim Vernum says:

    Your benchmark doesn’t appear to be very useful.

    Firstly – your code is wrong – the “newInt()” benchmark is calling new Integer() AND Integer.valueOf(), so it’s doing twice as much work as the “valueOf” benchmark.

    Secondly, as others have said, valueOf *might* be useful if you know that all your numbers are in the range -128 to 127, but how many apps have that feature?
    For real world cases you also need to test the cost of calling valueOf(x) when x is outside that range.
    Half a benchmark is worse than no benchmark – your results may lead someone to change their code – but you haven’t done enough testing to warrant that.

  18. Alex says:

    Tim, good catch on the first point – had a copy/paste error there. I’ve re-run and updated the numbers (not much difference really).

    Second, I based my original recommendation on the JDK javadocs which say “If a new Integer instance is not required, this method should generally be used in preference to the constructor Integer(int), as this method is likely to yield significantly better space and time performance by caching frequently requested values.” I still think this is good advice if you heed the first clause regarding whether a new Integer is required (see the prior WeakHashMap comment).

    Also, I did run a quick test to look at the difference if you are requesting a value in the cached range vs out of the cached range:

    new (regardless): 15 ns
    valueOf (in range): 7 ns
    valueOf (not in range): 16 ns

    So, if you always use valueOf and you request a value out of the cached range, you are costing yourself a nanosecond each time. It’s hard to see how 1 ns (or even an 8 ns savings) will help you that much, but as I’ve said above, I think the benefit is more in the space savings than the speed. And the larger point is that using a factory method allows the implementation to be optimized (better caching, native calls, magic JDK hooks, who knows what) in the future without you changing your own code.

  19. Illia says:

    I tested values not in range (actually most of the data we have are such). I used a loop variable of the same type (int for Integer, long for Long).

    new Integer
    avg = 11 ns
    Integer valueof
    avg = 13 ns

    you say 2 nanoseconds?
    13/11 = 1.18, nearly 20% or 1/5 longer

    Long is different (you can see source code why)

    new Long
    avg = 12 ns

    Long.valueof
    avg = 20 ns

    now nearly twice as long

  20. Prasad says:

    In actual projects, we very rarely (I’d almost say ‘never’) come across a scenario where we need int values in the range -128 to 127. If there is such an application, valueOf() saves time.
    So, in general, instead of using valueOf(), we can create & cache (in a static block) the frequently used int values (as Integer objs) in a HashMap (or similar collection). We can use it throughout the application. It gives more benefit, I believe. So, the usage of valueOf() depends on the context and should not be treated as the best option over new Integer() ALWAYS.

    One surprising fact is PMD made the valueOf() as the mandatory i.e applicable for all the cases :)

    Thx
    ~Prasad

Trackbacks

Check out what others are saying about this post...
  1. […] I’ve blogged before on one instance where this is used in the JVM itself. Integer.valueOf() lazily caches Integer objects from -128 to 127 (as required by the JVM specification). But, you don’t want to incur the cost of creating the cache unless it’s needed, so the cache is initialized on demand via a private static class. […]

  2. […] Yesterday I did a bit of exploring how well PMD’s Eclipse plug-in integrates into our development tools / process. Even if you are not planning on implementing static analysis and you are a java shop, you owe it to yourself to become familiar with the rules that PMD and other such tools utilize. Why? Because while we have in technology a great tradition of reinventing the wheel it is generally not too efficient and managers tend to frown on it too. So why do you want your group to re-learn in their context what others have learned in similar ones? You will also learn more about the underlying technology of the product you are testing. That is certainly not a bad thing either. Back to the original point. The code I work against is triggering a pretty sizable amount of hits to the IntegerInstantiation rule which they summarize as IntegerInstantiation: In JDK 1.5, calling new Integer() causes memory allocation. Integer.valueOf() is more memory friendly. I can’t argue with the whole ‘memory friendly’ aspect, but we’re using Java 1.4 which makes this a false positive (something you learn to live with in the world of static analysis. But through the process I extended my knowledge of how Java works; initially from the description of the rule, and further through articles I found like this which is absolutely a Good Thing™. […]

  3. […] Later I came across this blog post. When I completed reading it I could not agree more with my mentors line of reasoning. […]