JDK-6823565

Excessive use of HandleList class in de-serialization code causes OutOfMemory

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: P4
    • Resolution: Fixed
    • Affects Version/s: 6u12
    • Fix Version/s: 9
    • Component/s: core-libs

        Description

        FULL PRODUCT VERSION :
        All versions e.g. 6u12.

        ADDITIONAL OS VERSION INFORMATION :
        Any, e.g. RHEL 5, Solaris 10, Windows Vista

        A DESCRIPTION OF THE PROBLEM :
        OutOfMemoryError when de-serializing Trove THashMap instances after upgrading to the latest version of the Trove collections framework (2.0.4).

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        1. Ensure that you have a jar file containing the latest build of trove (2.0.4).

        2. Compile the attached source code (javac -classpath trove-2.0.4.jar:. TestTrove.java).

        3. Run the compiled class (java -classpath trove-2.0.4.jar:. TestTrove).

        4. During the "pause" after the number is printed out, use "jmap -histo <pid>" to display the contents of the heap.

        5. Observe the large number of instances of class java.io.ObjectInputStream$HandleTable$HandleList:


         num     #instances         #bytes  class name
        ----------------------------------------------
           1:        100879        4542896  [Ljava.lang.Object;
           2:         20519        3839392  [I
           3:          1242        1123936  [B
           4:         20011        1120616  java.io.ObjectStreamClass$WeakClassKey
           5:          6221         751864  <methodKlass>
           6:          6221         748280  <constMethodKlass>
           7:         20001         640032  java.io.ObjectInputStream$HandleTable$HandleList

        In this output HandleList ranks 7th, with 20,001 instances. That is consistent with one instance per de-serialized key and value of the 10,000-entry map.


        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        I would not expect to see so many instances of HandleList.

        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        import gnu.trove.*;
        import java.io.*;

        public class TestTrove {
            public static void main(String[] args) throws Exception {
                // Build a 10,000-entry Trove map (20,000 boxed keys and values).
                THashMap tMap = new THashMap();
                for (int i = 0; i < 10000; i++) {
                    tMap.put(new Integer(i), new Double(i));
                }

                // Serialize the map to an in-memory buffer.
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                ObjectOutputStream os = new ObjectOutputStream(bos);
                os.writeObject(tMap);
                os.close();

                // De-serialize it again; this is the step that creates the HandleList instances.
                ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
                ObjectInputStream is = new ObjectInputStream(bis);
                THashMap tMapCopy = (THashMap) is.readObject();
                System.out.println(tMapCopy.get(new Integer(1000)));

                // Pause so that "jmap -histo <pid>" can be run against the process.
                Thread.sleep(10000);
            }
        }

        ---------- END SOURCE ----------

        CUSTOMER SUBMITTED WORKAROUND :
        The problem appears to come down to a bug in the method markDependency() of the nested class ObjectInputStream.HandleTable (in ObjectInputStream.java).

        In its switch statement, under

        case STATUS_UNKNOWN:

        there should probably be an additional test, something like:

        if (dependent == target) {
                break;
        }

        The cause is that, since version 2.0.4 of Trove, the THashMap class contains a reference to itself as the first field in the class. This reference is interpreted as an unresolved target, so an instance of HandleList is created for every remaining object that is de-serialized as part of the THashMap instance. Because THashMap is a collection, that number is arbitrarily large, and the HandleList instances can quickly fill the entire heap.

        There doesn't seem to be any value in treating a self-reference as "unresolved": by definition the object depends on itself, and if it ultimately fails to resolve (e.g. because of an exception), the failure will ripple up the object graph anyway.

        By adding the code above to the ObjectInputStream class the problem is resolved. There may of course be some subtlety that I have not identified.
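
        The mechanism described above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the JDK source: the class HandleTableSketch, its fields, the skipSelfDependency switch and the simulate() helper are all hypothetical names invented for illustration, and the bookkeeping is a simplified reading of how a handle table might defer resolution while an earlier handle is still unresolved. Each list allocation stands in for one HandleList instance.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of handle dependency tracking; NOT the JDK implementation. */
final class HandleTableSketch {
    private enum Status { OK, UNKNOWN }

    private final List<Status> status = new ArrayList<>();
    private final List<List<Integer>> deps = new ArrayList<>();
    private final boolean skipSelfDependency; // hypothetical switch for the proposed guard
    private int lowDep = -1;                  // lowest unresolved target seen so far, or -1
    int listsAllocated = 0;                   // stands in for the number of HandleList instances

    HandleTableSketch(boolean skipSelfDependency) {
        this.skipSelfDependency = skipSelfDependency;
    }

    /** Registers a new object under the next free handle; it starts out unresolved. */
    int assign() {
        status.add(Status.UNKNOWN);
        deps.add(null);
        return status.size() - 1;
    }

    /** Records that 'dependent' cannot be resolved before 'target' is. */
    void markDependency(int dependent, int target) {
        if (skipSelfDependency && dependent == target) {
            return; // the guard proposed in the report: ignore self-references
        }
        if (status.get(dependent) != Status.UNKNOWN || status.get(target) != Status.UNKNOWN) {
            return; // nothing to track once either side is resolved
        }
        if (deps.get(target) == null) {
            deps.set(target, new ArrayList<>());
            listsAllocated++;
        }
        deps.get(target).add(dependent);
        if (lowDep < 0 || lowDep > target) {
            lowDep = target;
        }
    }

    /** Marks a handle as fully read; resolution is deferred while an earlier handle is unresolved. */
    void finish(int handle) {
        int end;
        if (lowDep < 0) {
            end = handle + 1;     // no unresolved back-references: resolve just this handle
        } else if (lowDep >= handle) {
            end = status.size();  // lowest unresolved target is done: resolve everything after it
            lowDep = -1;
        } else {
            return;               // an earlier handle is still unresolved, so defer
        }
        for (int i = handle; i < end; i++) {
            status.set(i, Status.OK);
            deps.set(i, null);
        }
    }

    /** Models reading a container whose first field refers to itself, followed by its elements. */
    static int simulate(int children, boolean skipSelfDependency) {
        HandleTableSketch table = new HandleTableSketch(skipSelfDependency);
        int root = table.assign();
        table.markDependency(root, root);      // the self-referencing first field
        for (int i = 0; i < children; i++) {
            int child = table.assign();
            table.finish(child);
            table.markDependency(root, child); // the container depends on each element
        }
        table.finish(root);
        return table.listsAllocated;
    }

    public static void main(String[] args) {
        System.out.println(simulate(20000, false)); // prints 20001
        System.out.println(simulate(20000, true));  // prints 0
    }
}
```

        In this model, a single self-dependency on the root keeps every later handle unresolved, so each of the 20,000 elements gets its own list, plus one for the root. With the guard in place no list is allocated at all.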

        Please note that this is potentially a serious issue: the Trove collections library is widely used, and the problem affects any application that de-serializes large collections.

                People

                • Assignee: Claes Redestad (redestad)
                • Reporter: Nelson Dcosta (ndcosta)