JDK-5024566

Object integrity may be changing using ParallelGC when a Full GC occurs






        Customer is running Solaris 8 on an eight-CPU system.
        When they experience a Full GC using 1.4.2, their transaction server
        throws DataValidationExceptions complaining that the integrity of the
        data has changed after the collection finished. This causes rollbacks
        in transaction requests, so trades go unfilled and money is lost.
        At that time the server is using ParallelGC only for cleaning the
        young generation.

        The only change they report in their trade environment is switching
        from 1.4.1_03 to 1.4.2_03. The transaction server is written in C++,
        so they have native threads referencing Java objects. However, turning
        on the CMS collection with the UseParGC seems to hide the problem: they
        never see the update exceptions after a Full GC using these options
        with 1.4.2.

        The customer has run a couple of tests with the UseParNewGC collector for
        several hours and did not experience any UpdateExceptions from their
        transaction server after a Full GC occurred. The heap options are listed below:
        command: /usr/local/j2sdk1.4.2_01/bin/java
        -server -showversion -Xms512m
        -Xmx512m -XX:NewSize=500m -XX:MaxNewSize=500m -XX:InitialSurvivorRatio=4
        -XX:TargetSurvivorRatio=100 -XX:+PrintCompilation -XX:+UseParNewGC
        -XX:MaxPermSize=256MB -XX:PermSize=3m -XX:MinPermHeapExpansion=1m
        -XX:MaxPermHeapExpansion=10m -XX:-UseAdaptiveSizePolicy
        -XX:+DisableExplicitGC -XX:+PrintTenuringDistribution
        -XX:+PrintHeapAtGC -verbose:gc -XX:+PrintGCTimeStamps -Xnoclassgc

        Why would there be such a difference in behavior?

        The application is more like 98% Java, and 2% C++.
        The C++ code handles some of their ORB transport code (using
        a C-API to a 3rd-party sockets vendor). When the Java code talks
        to another process, it calls down to the C++ layer. This C++ code
        establishes the connection to the outside, and creates a
        "Receive" thread to receive messages from the newly created socket.
        Or, if another process initiates contact, the same receive thread is
        created for incoming messages.

        When a new message is received from a remote process, very
        minimal processing is done at the C++ layer before the JNI up-call
        takes place. The Java code invoked from JNI processes the ORB message
        and figures out which handling thread (a pure Java thread) the
        message should be dispatched to. The message is just put onto
        an internal queue, and then the dispatch thread picks it up and
        calls application code (like plug-in code) to actually do
        application-level work.
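
        The hand-off described above can be sketched roughly as follows. This is
        only an illustration under stated assumptions: every class and method name
        here is hypothetical, the customer's actual ORB code is not shown in this
        report, and the sketch uses java.util.concurrent, which was added in Java 5
        and so would not have been available on 1.4.2.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the customer's code.
final class OrbDispatchSketch {
    // Internal queue between the JNI up-call and the pure Java dispatch thread.
    private final BlockingQueue<String> inbound = new LinkedBlockingQueue<>();
    // Records what the stand-in "application code" produced, so callers can observe it.
    private final BlockingQueue<String> handled = new LinkedBlockingQueue<>();

    // Stand-in for the Java method a C++ receive thread would invoke through JNI.
    // It does minimal work: decode the bytes and enqueue the message.
    void onNativeMessage(byte[] buffer) {
        inbound.add(new String(buffer, StandardCharsets.US_ASCII));
    }

    // Starts the pure Java dispatch thread, which takes each queued message
    // and hands it to a trivial stand-in for application (plug-in) code.
    Thread startDispatcher() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String msg = inbound.take();
                    handled.add("handled:" + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        }, "dispatch");
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Waits up to timeoutMillis for one handled result; returns null on timeout.
    String awaitHandled(long timeoutMillis) {
        try {
            return handled.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        OrbDispatchSketch orb = new OrbDispatchSketch();
        Thread dispatcher = orb.startDispatcher();
        orb.onNativeMessage("ORDER-1".getBytes(StandardCharsets.US_ASCII));
        System.out.println(orb.awaitHandled(2000)); // prints handled:ORDER-1
        dispatcher.interrupt();
    }
}
```

        In this sketch the receive side never touches application state directly;
        it only enqueues, which mirrors the thin up-call layer the customer describes.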

        Objects in Question
        So, the C++ stuff is pretty thin, and just interacts with the
        older C-API to the 3rd-party vendor software (which itself is
        really just a layer on top of sockets). The C++ threads that
        are created are connected to the JVM so they can make calls to
        the VM to create buffers, which the incoming messages are copied
        into. That buffer is basically the only Java object that the
        C++ thread creates, and it is passed up during the JNI up-call.
        This buffer is copied into separate objects created by the ORB
        code, so after the JNI call, the C++ created buffers are no
        longer referenced.
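
        A minimal sketch of that copy step, assuming the ORB object takes a
        defensive copy of the transport buffer. The names BufferCopySketch and
        OrbMessage are hypothetical and are not taken from the customer's code;
        the point is only that after the copy, later changes to the
        transport buffer cannot reach the ORB-owned data.

```java
import java.util.Arrays;

// Hypothetical sketch: class and method names are illustrative only.
final class BufferCopySketch {
    // Stand-in for an ORB-owned message object holding a private copy of the bytes.
    static final class OrbMessage {
        private final byte[] payload;

        OrbMessage(byte[] transportBuffer) {
            // Defensive copy: the ORB object never aliases the transport buffer.
            this.payload = Arrays.copyOf(transportBuffer, transportBuffer.length);
        }

        // Returns a copy so callers cannot mutate the internal state either.
        byte[] payload() {
            return payload.clone();
        }
    }

    public static void main(String[] args) {
        byte[] transportBuffer = {10, 20, 30};  // plays the role of the C++-created buffer
        OrbMessage msg = new OrbMessage(transportBuffer);
        Arrays.fill(transportBuffer, (byte) 0); // later reuse of the transport buffer
        System.out.println(Arrays.toString(msg.payload())); // prints [10, 20, 30]
    }
}
```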

        So, the objects that get modified (unexpectedly) after the few
        Full GCs in 1.4.2 are not created by the C++ code, nor are they
        stored or referenced in the C++ code.

        The GC output and log information is available in the attachments.


                jmasa Jon Masamitsu (Inactive)
                atongschsunw Albert Tong-schmidt (Inactive)