samedi 24 septembre 2011

Java Puzzler - 57 is back !

Detecting a bug and solving it may be very frustrating, but it becomes enjoyable when you try to turn it into a challenge. Once again, I ran into a bug. Once again, I did not see it comming and once again, I will turn this into a puzzler.

No XML Serialization this time, but the same pattern : You will see the code and will be asked to guess what it produces.


Puzzler

The following program tries to compare the method equals and containsAll of java sets and lists. It creates one set and one list with a regex pattern that matches "Mickey" and "mouse". What does the program print ?

public static void main(String... args) {
    System.out.println(prepareList().equals(prepareSet()));
    System.out.println(prepareList().containsAll(prepareSet()));
}

private static List<Pattern> prepareList() {
    List<Pattern> result = new ArrayList<>();
    result.add(Pattern.compile("[Mm](ickey|ouse)"));
    return result;
}

private static Set<Pattern> prepareSet() {
    Set<Pattern> result = new HashSet<>();
    result.add(Pattern.compile("[Mm](ickey|ouse)"));
    return result;
}

What do we expect ?

This program seems pretty straightforward. It compares two collections containing the same Pattern. If we take a brief look at the javadoc of the List interface, we see that the equals method should return true "if and only if the specified object is also a list, both lists have the same size, and all corresponding pairs of elements in the two lists are equal". We expect the first statement to print false for a Set is not a List. The containsAll method takes a Collection, which is the superinterface of Set and checks if the list "contains all of the elements of the specified collection". Therefore the program should print :

false
true

What do we have ?

If you ran it, you saw a different thing :

false
false

Our first assumption seems correct, but not the second one. What can be wrong ?


Solution

Our program only uses base classes of the Java platform. Set and List classes are used in almost every Java application, so they are hardly arguable. Our only chance is to create a test that will first assert that two identical Pattern does equal.

public static void main(String... args) {
    Pattern original = Pattern.compile("[Mm](ickey|ouse)");
    Pattern copy = Pattern.compile("[Mm](ickey|ouse)");
    System.out.println(original.equals(copy));
}

If you ran this test, you saw that it print false, which means that two identical patterns does not equal.

Indeed, if you look closely to the source code of the Pattern class, you will see that it does not override the equals() method of java.lang.Object, nor does it override the hashCode() method. The following program will therefore print "3", because there is 3 "distinct" patterns in the set.

public static void main(String... args) {
    Set<Pattern> result = new HashSet<>();
    result.add(Pattern.compile("[Mm](ickey|ouse)"));
    result.add(Pattern.compile("[Mm](ickey|ouse)"));
    result.add(Pattern.compile("[Mm](ickey|ouse)"));
    System.out.println(result.size());
}

We could override hashCode() and equals() with the following implementations to solve the problem. However, to make things worse, the Pattern class is final, so we must create a wrapper and all sort of delegates methods.

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    return this.toString().equals(o.toString());
}

@Override
public int hashCode() {
    return this.toString().hashCode();
}

The lesson of this puzzle is : when comparing patterns, compare the toString() results instead of the objects themselves.

As i said in my previous article, if you haven't read the book "Java Puzzlers" yet, I urge you to do so. If you have read it, you may have noticed that this puzzler is quite similar with Puzzler #57 - "What's in a Name ?", which also explains the title of this article.

Do know why this implementation choice was made ? Did you run into this bug and find another workaround ? Let's hear about it in the comments.

dimanche 11 septembre 2011

Java Puzzler - Defensive Serialization

It has been a long time since I didn't write on Pingtimeout. Sorry for that :)

Yesterday, while I was adding some features to JTailPlus, I ran into an XML Serialization problem. I had some difficulties to find out what was wrong, and I must admit I didn't expect this kind of error to occur. In this article, I will expose the problem I ran into, and the solution.

First of all, if you don't have read the book "Java Puzzlers" yet, I urge you to do so. It is a must-read for every java developer. It is all about "Traps, Pitfalls, and Corner Cases" of the Java programming language. I enjoyed myself reading the puzzles, even if sometimes I almost tore my hair out.

As this book does, I will present the problem I ran into as a Puzzler. You will see the code and will be asked to guess what it produce. Let's begin.


Puzzler

The following class represents a person by it's first name, last name and children. The library commons-io is used to close streams. In the main method, we serialize and deserialize a person (aka John Jackson). What does the program print ?

public class Person implements Serializable {
    public static final long serialVersionUID = 1L;

    private String firstName;
    private String lastName;
    private List<Person> children;

    public Person() {}

    public Person(String firstName, String lastName, Person... children) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.children = new ArrayList<>();
        Collections.addAll(this.children, children);
    }

    public List<Person> getChildren() { return Collections.unmodifiableList(children); }
    public String getFirstName() { return firstName; }
    public String getLastName() { return lastName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public void setChildren(<Person> children) { this.children = children; }

    @Override
    public String toString() {
        String result = firstName + ' ' + lastName;
        if(children != null && children.size() > 0) {
            result += "'s children are " + children;
        }
        return result;
    }

    public static void main(String [] args) {
        FileOutputStream fileOutputStream = null;
        FileInputStream fileInputStream = null;

        Person jim = new Person("Jim", "Jackson");
        Person barbara = new Person("Barbara", "Jackson");
        Person john = new Person("John", "Jackson", jim, barbara);

        try {
            System.out.println(john);

            File temporaryFile = File.createTempFile("XML-Test-", ".xml");
            temporaryFile.deleteOnExit();
            fileOutputStream = new FileOutputStream(temporaryFile);
            XMLEncoder encoder = new XMLEncoder(fileOutputStream);
            encoder.writeObject(john);
            encoder.close();

            fileInputStream = new FileInputStream(temporaryFile);
            XMLDecoder decoder = new XMLDecoder(fileInputStream);
            Person result = (Person) decoder.readObject();
            decoder.close();

            System.out.println(result);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeQuietly(fileOutputStream);
            IOUtils.closeQuietly(fileInputStream);
        }
    }
}

What do we expect ?

This program does not seem weird. The class Person is a java bean and it is serializable. The main method creates three persons (John Jackson and his two children, Jim and Barbara), John is then serialized to XML with the XMLEncode class and deserialized using the XMLDecoder class. Therefore the program should print :

John Jackson's children are [Jim Jackson, Barbara Jackson]
John Jackson's children are [Jim Jackson, Barbara Jackson]

What do we have ?

If you ran it, you saw a different thing :

John Jackson's children are [Jim Jackson, Barbara Jackson]
java.lang.NullPointerException
Continuing ...
John Jackson

We lost two kids in the process, what can be wrong ?


Solution

The only unusual thing in our program is the use of the class Collections to populate the children attribute and to return an unmodifiable version of the list when the getter is called. We assume that the author of this code did not want anyone but the class Person itself to modify the list "children". Except that, we have a standard Java Bean.

The addAll method seems pretty straightforward : "The behavior of this convenience method is identical to that of c.addAll(Arrays.asList(elements)), but this method is likely to run significantly faster under most implementations.", Javadoc says. Sweet, a fast implementation.

The unmodifiableList method also seem simple : it returns "an unmodifiable view of the specified list. [...] attempts to modify the returned list, whether direct or via its iterator, result in an UnsupportedOperationException. [...] The returned list will be serializable if the specified list is serializable.".

Our initial is is an ArrayList, which is serializable, so the generated list is serializable as well. The console output does not show an UnsupportedOperationException but a NullPointerException. This does not help us very much.


Actually, the bug _is_ caused by the defensive copying design. If we replace the getter by another defensive copying method, the error is still there, John Jackson's kids are still lost :

public List<Person> getChildren() { return new ArrayList<Person>(children); }

The answer lies in the definition of a Java Bean itself. According to wikipedia, the assumed conventions for Java Beans are :

  1. The class must have a public default constructor (no-argument).
  2. The class properties must be accessible using get, set, is (used for boolean properties instead of get), [...] following a standard naming-convention.
  3. The class should be serializable.

The author of this code (me) did not follow the second convention : the "children" attribute is not directly accessible using its appropriate getter, instead, another object is returned. If we replace the getter statement by the following, everything works fine and John Jackson's children are back.

public List<Person> getChildren() { return children; }

The lesson of this puzzle is : be careful when dealing with java beans and defensive copying. The defensive copying design, which can be a valid choice in numerous situations, is inappropriate here.