JDK-4450088

Win32: Intermittent Carry Error During 64-bit Arithmetic

    Details

    • Type: Bug
    • Status: Closed
    • Priority: P1
    • Resolution: Won't Fix
    • Affects Version/s: 1.2.2
    • Fix Version/s: None
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • CPU: x86
    • OS: windows_nt

      Description



      Name: yyT116575 Date: 04/24/2001


      java version "1.2.2"
      HotSpot VM (1.0.1, mixed mode, build g)

      This report is similar to bug #4390029, which was previously closed as "not
      reproducible". Our server product is encountering EXACTLY the same problem
      described within that report, which states that
      GregorianCalendar.julianDayToMillis() returns a bad value. Our investigation
      shows this to be true, but we believe that there is a more fundamental problem
      within the JVM that causes this "symptom" to manifest.

      GregorianCalendar.julianDayToMillis() is a one-line method that returns the
      result of a basic arithmetic expression: (julian - EPOCH_JULIAN_DAY) * ONE_DAY.
      The following trace output from our debug version of this method shows the
      arithmetic error (which sometimes takes months to manifest):

      * GregorianCalendar.julianDayToMillis() -> result: -371084186649600000,
      julianDay: 2452020, EPOCH_JULIAN_DAY: 2440588, ONE_DAY: 86400000

      Of course, the correct result should be 987724800000. Note that once the error
      occurs it appears to be "permanent" - that is, julianDayToMillis() continues to
      return a bad result.
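
      For reference, the expected computation can be sketched as below. This is
      only an illustration built from the values in the trace output; the class
      name is hypothetical and the code is not the JDK source.

```java
// Minimal sketch of the arithmetic described in the report.
// Constant values come from the trace output; the class name is hypothetical.
final class JulianMillisSketch {
    static final int EPOCH_JULIAN_DAY = 2440588; // the only 32-bit operand, per the report
    static final long ONE_DAY = 86400000L;       // milliseconds in one day

    static long julianDayToMillis(long julian) {
        // EPOCH_JULIAN_DAY is widened to long before the subtraction takes place.
        return (julian - EPOCH_JULIAN_DAY) * ONE_DAY;
    }

    public static void main(String[] args) {
        long result = julianDayToMillis(2452020L);
        System.out.println(result);                   // 987724800000
        System.out.println(Long.toHexString(result)); // e5f8fc6000
    }
}
```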

      Further examination of the good/bad results shows that the bad result is NOT
      random. This can be seen by examining the results in hex.

      Good Result: 0x000000e5f8fc6000
      Bad Result: 0xfad9a4e5f8fc6000
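
      One way to compare the two values bit by bit is to XOR them, as in the
      sketch below. It is an illustration only; the class and method names are
      hypothetical, and the two constants are copied from the lines above.

```java
// Minimal sketch: XOR the good and bad results to see which bits differ.
// Class and method names are hypothetical; the constants come from the report.
final class HexCompareSketch {
    static final long GOOD = 0x000000e5f8fc6000L;
    static final long BAD  = 0xfad9a4e5f8fc6000L;

    // Bits that differ between the two results.
    static long differingBits() {
        return GOOD ^ BAD;
    }

    public static void main(String[] args) {
        // Only the top 24 bits (6 nibbles) differ.
        System.out.println(String.format("%016x", differingBits())); // fad9a40000000000
        // The low 40 bits (10 nibbles) are identical.
        System.out.println((differingBits() & 0xFFFFFFFFFFL) == 0);  // true
        System.out.println(BAD);                                     // -371084186649600000
    }
}
```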

      As shown, the least-significant 10 nibbles match. A plausible explanation for
      the "fad9a4" within the most-significant 6 nibbles of the bad result can be
      observed by walking through the computation manually as follows:

      To compute (julian - EPOCH_JULIAN_DAY), it is necessary to take the 2's
      complement of EPOCH_JULIAN_DAY, sign-extend it from 32 to 64 bits, and add it
      to "julian" (note that EPOCH_JULIAN_DAY is the only 32-bit value in the
      equation; all others are 64 bits wide):

      0x0000000000256a34 (julian day 2452020 in hex)
      0xffffffffffdac274 (-EPOCH_JULIAN_DAY sign extended in hex)
      ------------------
      0x0000000000002ca8 (correct result of the two values added together)

      Multiplying the above by "ONE_DAY" (0x0000000005265c00) yields the correct
      result (i.e. 0x000000e5f8fc6000). However, if the underlying runtime failed to
      perform a "carry" from bit position 31 to bit position 32 (i.e. incorrect carry
      from lower 32-bit word to upper 32-bit word), then the following result would
      occur:

      0x0000000000256a34 (julian day 2452020 in hex)
      0xffffffffffdac274 (-EPOCH_JULIAN_DAY sign extended in hex)
      ------------------
      0xffffffff00002ca8 (bad result if carry fails from bit 31 to bit 32)

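      Multiplying the above result by "ONE_DAY" is equivalent to subtracting
      ONE_DAY * 2^32 from the correct product. This effectively adds "-ONE_DAY"
      (i.e. 0xfad9a400) into the upper 32 bits (0x000000e5 + 0xfad9a400 =
      0xfad9a4e5), yielding the incorrect result seen (i.e. 0xfad9a4e5f8fc6000).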
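
      The hypothesis can be checked by simulating the suspected fault in plain
      Java. The sketch below adds two 64-bit values as two independent 32-bit
      halves and deliberately discards the carry out of bit 31. It models the
      hypothesized failure only, and is not a claim about how the VM actually
      performs the addition; the class and method names are hypothetical.

```java
// Minimal sketch: simulate a 64-bit add whose carry from bit 31 to bit 32 is lost.
// This models the hypothesized fault only; all names here are hypothetical.
final class DroppedCarrySketch {
    static final long ONE_DAY = 86400000L;

    // Adds a and b as two separate 32-bit halves, dropping the low-half carry.
    static long addWithoutCarry(long a, long b) {
        long lo = ((a & 0xFFFFFFFFL) + (b & 0xFFFFFFFFL)) & 0xFFFFFFFFL; // carry out of bit 31 discarded
        long hi = ((a >>> 32) + (b >>> 32)) & 0xFFFFFFFFL;               // upper halves added on their own
        return (hi << 32) | lo;
    }

    public static void main(String[] args) {
        long julian = 2452020L;
        long negEpoch = -2440588L; // -EPOCH_JULIAN_DAY, already sign-extended to 64 bits

        long good = (julian + negEpoch) * ONE_DAY;
        long bad = addWithoutCarry(julian, negEpoch) * ONE_DAY;

        System.out.println(String.format("%016x", addWithoutCarry(julian, negEpoch))); // ffffffff00002ca8
        System.out.println(String.format("%016x", good)); // 000000e5f8fc6000
        System.out.println(String.format("%016x", bad));  // fad9a4e5f8fc6000
    }
}
```

      Under this model, the simulated product matches the bad value from the
      trace output exactly, which is consistent with a lost carry but does not
      by itself identify the cause.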

      Although this problem is manifesting itself as a "negative date issue" in our
      server application, it is likely more pervasive and insidious, since it appears
      to be an arithmetic error associated with 64-bit values. Apparently, the JVM
      *occasionally* enters a state in which it does not correctly perform
      64-bit addition (possibly due to a carry error). This means that the problem
      could be extremely dangerous for all applications that rely on 64-bit
      arithmetic, not just those using Calendar functions. For example, consider the
      implications to a financial application.

      It is critical that this problem be addressed and fixed, rather than closed
      as "not reproducible". Although we have not found a "smoking gun" that
      triggers this problem, it is reproducible and impacts our deployment severely.
      We suggest that Sun investigate the underlying code to determine the possible
      cause, without waiting to reproduce it.
      (Review ID: 123169)
      ======================================================================

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              yyoungsunw Yung-ching Young (Inactive)