Benchmarking Different Approaches to Equality in Java

Written by: Stephen Connolly

Java's type system all starts with Object. Every class extends Object, so the methods implemented by Object can be assumed present on every object. Technically every Java object has 11 methods, but if we only consider public methods that can be overridden by subclasses there are really only three:

  1. equals(Object)

  2. hashCode()

  3. toString()

The toString() method is supposed to return a string representation of the object to aid debugging. (That last bit was added by me, but in my opinion it is the critical reason why every object should implement toString().)

The equals(Object) method is required because Object variables are in effect pointers to the object data, so the equality operator == will only tell you whether the pointers are the same, not whether the object instances are the "same".
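To make the pointer point concrete, here is a minimal sketch using String. The class name ReferenceVsValueEquality and its two helper methods are my own names for illustration, nothing standard.

```java
// Minimal sketch: == compares references, equals(Object) compares content.
class ReferenceVsValueEquality {

    /** True when both variables point at the very same object. */
    static boolean sameReference(Object a, Object b) {
        return a == b;
    }

    /** True when a considers b equal according to a.equals(Object). */
    static boolean sameValue(Object a, Object b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with identical content
        String first = new String("bean");
        String second = new String("bean");

        System.out.println(sameReference(first, second)); // false: two separate objects
        System.out.println(sameValue(first, second));     // true: same characters
        System.out.println(sameReference(first, first));  // true: one object, one pointer
    }
}
```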

The hashCode() method is provided as a utility for hash-based collections. Its contract is tied to how the equals(Object) method compares object instances: two objects that are equal according to equals(Object) must return the same hashCode().
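As a rough illustration of why hash-based collections care, here is a small sketch. GridPoint and HashCodeContractDemo are hypothetical classes I made up for this example; the point is only that equals and hashCode are derived from the same fields, so an equal-but-distinct instance can be found in a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class for illustration only
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof GridPoint)) {
            return false;
        }
        GridPoint other = (GridPoint) o;
        return x == other.x && y == other.y;
    }

    // Built from the same fields as equals, so equal points share a hash code
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class HashCodeContractDemo {

    /** Looks up a freshly created, equal-but-distinct instance in a HashSet. */
    static boolean foundInHashSet() {
        Set<GridPoint> points = new HashSet<>();
        points.add(new GridPoint(1, 2));
        return points.contains(new GridPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(foundInHashSet()); // true
    }
}
```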

Now there is a lot we could write on the equals(Object) method and what it means for two object instances to be equal. For example:

  • there can be a heated debate as to whether one should even implement equals(Object) for JPA entity classes...

  • when is it ok to "break" the symmetry requirement of equals(Object) - the JRE "breaks" this for the collection types because they made the decision that a HashSet should be the same as a TreeSet as long as they have exactly the same elements.
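The collection behaviour in the second point is easy to check for yourself. This is just a sketch (SetEqualityDemo is my own name), relying on the documented Set.equals and Set.hashCode contracts:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {

    /** Compares a HashSet and a TreeSet holding the same elements, in both directions. */
    static boolean hashEqualsTree() {
        Set<Integer> hash = new HashSet<>(Arrays.asList(3, 1, 2));
        Set<Integer> tree = new TreeSet<>(Arrays.asList(1, 2, 3));
        return hash.equals(tree) && tree.equals(hash);
    }

    /** Equal sets must also agree on hashCode, whatever their concrete class. */
    static boolean sameHashCode() {
        Set<Integer> hash = new HashSet<>(Arrays.asList(3, 1, 2));
        Set<Integer> tree = new TreeSet<>(Arrays.asList(1, 2, 3));
        return hash.hashCode() == tree.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashEqualsTree()); // true
        System.out.println(sameHashCode());   // true
    }
}
```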

At some point, however, you will come to a decision and decide that two objects are equal if a defined list of properties of the object are equal. Now the time has come to actually write the equals(Object) method. The typical implementation will look something like this:

public boolean equals(Object o) {
    if (this == o) { // fast path check
        return true;
    }
    if (o == null || getClass() != o.getClass()) { // ensure same type
        return false;
    }

    Bean bean = (Bean) o;
    // compare each field / property in sequence... make "fastest" comparisons first
    if (getA() != null ? !getA().equals(bean.getA()) : bean.getA() != null) {
        return false;
    }
    if (getB() != null ? !getB().equals(bean.getB()) : bean.getB() != null) { 
        return false; 
    }
    ... 
    return getW() != null ? getW().equals(bean.getW()) : bean.getW() == null;
}

In fact, most people will just use their IDE to generate the equals method... But there are some exceptions...

A friend showed me some code they had found the other day. It was just an .equals(Object) method. Both of us were simultaneously horrified and intrigued. Here is the gist of the equals method:

@Override
public boolean equals(Object o) {
    return Optional.ofNullable(o)
            .filter(x -> x instanceof Bean)
            .map(x -> (Bean) x)
            .filter(x -> Objects.equals(x.getA(), getA()))
            .filter(x -> Objects.equals(x.getB(), getB()))
            .isPresent();
}

Now on the one hand, this is an ingenious use of Java 8's Optional... on the other hand OMG Noooo...

In general I tend to favour having IntelliJ generate the equals method and then tweak it if necessary. If we let IntelliJ generate an equivalent equals method (notice the x -> x instanceof Bean allows breaking symmetric equality), we get the following:

public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (!(o instanceof Bean)) {
        return false;
    }

    Bean bean = (Bean) o;

    if (getA() != null ? !getA().equals(bean.getA()) : bean.getA() != null) {
        return false;
    }
    return getB() != null ? getB().equals(bean.getB()) : bean.getB() == null;
}
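Here is a sketch of how an instanceof check can end up breaking symmetry once a subclass adds its own state. BaseBean and LabelledBean are hypothetical classes invented for this illustration, not the code my friend found.

```java
import java.util.Objects;

// Hypothetical parent class using the instanceof style of check
class BaseBean {
    private final String a;

    BaseBean(String a) {
        this.a = a;
    }

    String getA() {
        return a;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BaseBean)) { // accepts subclasses too
            return false;
        }
        return Objects.equals(a, ((BaseBean) o).a);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(a);
    }
}

// Hypothetical subclass that adds a field and tightens equals
class LabelledBean extends BaseBean {
    private final String label;

    LabelledBean(String a, String label) {
        super(a);
        this.label = label;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof LabelledBean)) { // rejects a plain BaseBean
            return false;
        }
        return super.equals(o) && Objects.equals(label, ((LabelledBean) o).label);
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        BaseBean base = new BaseBean("x");
        LabelledBean labelled = new LabelledBean("x", "red");

        System.out.println(base.equals(labelled)); // true: the parent only looks at a
        System.out.println(labelled.equals(base)); // false: base is not a LabelledBean
    }
}
```

The getClass() comparison from the typical implementation avoids this particular trap, at the cost of never treating a subclass instance as equal to a parent instance.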

There are a number of other templates I can use, for example I could ask IntelliJ to use Apache Commons Lang 3's EqualsBuilder:

public boolean equals(Object o) {
    if (this == o) {
        return true;
    }

    if (!(o instanceof Bean)) {
        return false;
    }

    Bean bean = (Bean) o;

    return new EqualsBuilder()
            .append(getA(), bean.getA())
            .append(getB(), bean.getB())
            .isEquals();
}

Or I could ask IntelliJ to use Objects.equals as introduced in Java 7:

public boolean equals(Object o) {
    if (this == o) {
        return true;
    }
    if (!(o instanceof Bean)) {
        return false;
    }
    Bean bean = (Bean) o;
    return Objects.equals(getA(), bean.getA()) &&
            Objects.equals(getB(), bean.getB());
} 
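Before comparing speed it is worth convincing yourself that the variants agree on the answer. The following is only a sketch: PairBean and EqualsVariantsCheck are hypothetical names, and the two comparison styles are exposed as ordinary methods so they can be called side by side on the same inputs.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical two-property bean carrying both comparison styles
class PairBean {
    private final String a;
    private final Integer b;

    PairBean(String a, Integer b) {
        this.a = a;
        this.b = b;
    }

    String getA() {
        return a;
    }

    Integer getB() {
        return b;
    }

    /** The Optional filter map filter isPresent style. */
    boolean equalsViaOptional(Object o) {
        return Optional.ofNullable(o)
                .filter(x -> x instanceof PairBean)
                .map(x -> (PairBean) x)
                .filter(x -> Objects.equals(x.getA(), getA()))
                .filter(x -> Objects.equals(x.getB(), getB()))
                .isPresent();
    }

    /** The Objects.equals style. */
    boolean equalsViaObjects(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PairBean)) {
            return false;
        }
        PairBean bean = (PairBean) o;
        return Objects.equals(getA(), bean.getA())
                && Objects.equals(getB(), bean.getB());
    }
}

class EqualsVariantsCheck {

    /** Runs both styles over the same candidates and reports whether they always agree. */
    static boolean variantsAgree() {
        PairBean bean = new PairBean("x", 1);
        Object[] candidates = {
            bean, null, "", new PairBean("x", 1), new PairBean("x", 2), new PairBean(null, 1)
        };
        for (Object candidate : candidates) {
            if (bean.equalsViaOptional(candidate) != bean.equalsViaObjects(candidate)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(variantsAgree()); // true
    }
}
```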

Looking at all those generated methods, I actually started to admire the aesthetics of the Optional filter map filter isPresent chain. So I thought to myself, well what about the bytecode, these are all nice short methods:

public static <T> Optional<T> ofNullable(T value) {
    return value == null ? empty() : of(value);
}
public Optional<T> filter(Predicate<? super T> predicate) {
    Objects.requireNonNull(predicate);
    if (!isPresent())
        return this;
    else
        return predicate.test(value) ? this : empty();
}
public<U> Optional<U> map(Function<? super T, ? extends U> mapper) {
    Objects.requireNonNull(mapper);
    if (!isPresent())
        return empty();
    else {
        return Optional.ofNullable(mapper.apply(value));
    }
}
public boolean isPresent() {
    return value != null;
}

They should all be candidates for inlining at runtime. Will the JVM optimize it all away? How could we answer such a question? If only there was some tool that could help!

Enter JMH, the micro-benchmarking tool of choice.

One of the problems that we face in determining exactly how well the JVM will optimize the method is that the JVM will be trying to generically optimize things anyway. So how does this affect the measurements?

Well suppose we start with a benchmark that looks like:

@Benchmark
public void optionalFilter() {
    Bean b = new Bean(...);
    b.equals(b);
    b.equals(null);
    b.equals("");
    b.equals(new Bean(...));
}

The results of those methods are being discarded, and the JVM can determine that the equals method is side-effect free and in fact the entire benchmark method is side-effect free... let's just throw that away completely.

Thankfully JMH provides us with the Blackhole to prevent some of that optimization by the JVM... so now you probably have something like:

@Benchmark
public void optionalFilter(Blackhole bh) {
    Bean b = new Bean(...);
    bh.consume(b.equals(b));
    bh.consume(b.equals(null));
    bh.consume(b.equals(""));
    bh.consume(b.equals(new Bean(...)));
}

So the problem with this is that the JVM can provably determine that the b.equals call can only ever be our b.equals call... normally a good thing, but in general equals methods are used from contexts where the call must be virtual. We do not want the JVM to make some optimization based on eliminating the virtual call.

An additional concern is that we are mixing object allocation with the benchmarking. Much better would be to move all the allocated objects to fields and just test the method invocation:

private final Object b = new Bean(...);
private final Object o = new Bean(...);

@Benchmark
public void optionalFilter(Blackhole bh) {
    bh.consume(b.equals(b));
    bh.consume(b.equals(null));
    bh.consume(b.equals(""));
    bh.consume(b.equals(o));
}

Oh if only it were that simple! The JVM can still see that the field is only ever a Bean when called from the benchmark...

// not final: both fields are reassigned in the setup and teardown methods
private Object b;
private Object o;

@Setup(Level.Iteration)
public void beforeIteration() {
    b = new Bean(...);
    o = new Bean(...);
}

@TearDown(Level.Iteration)
public void afterIteration(Blackhole bh) {
    b = "dummy";
    o = 42;
    optionalFilter(bh);
}

@Benchmark
public void optionalFilter(Blackhole bh) {
    bh.consume(b.equals(b));
    bh.consume(b.equals(null));
    bh.consume(b.equals(""));
    bh.consume(b.equals(o));
}

Ok, with the above we ensure that the JVM is not able to infer any special cases about which object's equals implementation is being used; it should only be able to optimize the actual equals method.

I have seen the phrase "Shipilëv! Shipilëv! Shipilëv! " used to try and prevent Aleksey magically appearing to spot further flaws in your usage of JMH... let's see if it works here!

If you are interested in the actual benchmark code I'm sure you will find some more subtle flaws - this is always the way with micro benchmarking... but let's see the results I got:

On my Mac (Oracle Java 8u131):

Benchmark                             Mode      Count    Score    Error  Units
Commons Lang 3                        sample   622992   71.146 ±  2.520  ns/op
IntelliJ defaults (getters)           sample   594057   80.110 ±  2.862  ns/op
IntelliJ defaults (fields)            sample   590205   77.860 ±  2.920  ns/op
Objects.equals                        sample   672221   69.942 ±  0.993  ns/op
Optional filter map filter isPresent  sample   613213  115.478 ±  6.212  ns/op

On a docker image running in AWS (OpenJDK 8u141):

Benchmark                             Mode       Count    Score    Error  Units
Commons Lang 3                        sample   8834858  671.479 ± 50.771  ns/op
IntelliJ defaults (getters)           sample   9194271  617.315 ± 33.507  ns/op
IntelliJ defaults (fields)            sample   8841755  620.480 ± 34.359  ns/op
Objects.equals                        sample   9294152  627.180 ± 34.926  ns/op
Optional filter map filter isPresent  sample  10732634  800.054 ± 40.756  ns/op

In both cases the Optional filter map filter isPresent implementation is significantly slower than the others. The differences among the other implementations are comparatively small.
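One rough way to sanity check that reading is to treat each score ± error as an interval and see which intervals overlap. The sketch below plugs in the AWS figures; ScoreIntervals and its helpers are my own names, and interval overlap is only a quick heuristic, not a proper statistical test.

```java
// Rough sketch: compare score ± error intervals from the AWS run
class ScoreIntervals {

    /** Lower and upper bound of score ± error, in ns/op. */
    static double[] interval(double score, double error) {
        return new double[] { score - error, score + error };
    }

    /** True when the two closed intervals share at least one point. */
    static boolean overlap(double[] x, double[] y) {
        return x[0] <= y[1] && y[0] <= x[1];
    }

    /** How many times larger the slower score is than the faster one. */
    static double ratio(double slower, double faster) {
        return slower / faster;
    }

    public static void main(String[] args) {
        double[] optional = interval(800.054, 40.756);
        double[] commonsLang = interval(671.479, 50.771);
        double[] objectsEquals = interval(627.180, 34.926);

        System.out.println(overlap(optional, commonsLang));      // false
        System.out.println(overlap(commonsLang, objectsEquals)); // true
        System.out.println(ratio(800.054, 617.315));             // roughly 1.30
    }
}
```

On those numbers the Optional variant's interval sits clear of the slowest of the other four, while the other four overlap each other.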

So the JMH results are in, and it seems like the "cool" use of Optional in equals is not quite good enough yet.
