JDK-8176889

AOT: aot compilation hangs while compiling jdk.localedata module

    Details

    • Type: Bug
    • Status: In Progress
    • Priority: P2
    • Resolution: Unresolved
    • Affects Version/s: 9, 10
    • Fix Version/s: 10
    • Component/s: hotspot
    • Labels:
    • Subcomponent:
    • Understanding: Fix Understood

      Description

      Try to compile the jdk.localedata module via jaotc. Compilation does not finish (at least not in a reasonable time).

        Activity

        jcm Jamsheed C M added a comment - - edited
        lscpu
        Architecture: x86_64
        CPU op-mode(s): 32-bit, 64-bit
        Byte Order: Little Endian
        CPU(s): 6
        On-line CPU(s) list: 0-5
        Thread(s) per core: 6
        Core(s) per socket: 1
        Socket(s): 1
        NUMA node(s): 1
        Vendor ID: GenuineIntel
        CPU family: 6
        Model: 45
        Stepping: 7
        CPU MHz: 2893.050
        BogoMIPS: 5786.10
        Hypervisor vendor: Xen
        Virtualization type: para
        L1d cache: 32K
        L1i cache: 32K
        L2 cache: 256K
        L3 cache: 20480K
        NUMA node0 CPU(s): 0-5

        Free memory: ~20g

        Number of methods compiled: ~4000

        GC configuration: Parallel GC with 24g for both Xmx and Xms, and 12g for Xmn
        Total runtime: 169.7s
        Application time: 134.3047932s
        Application pause time: 35s
        Total compilation time: 113s
        Compilation time: 98.84s
        Compilation pause time: 15s

        Compilation time is too high (98s) for compiling 4k localedata methods, while the 55k methods in java.base take just 84s.
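
To put the comparison above in per-method terms, here is a small arithmetic sketch using only the figures quoted in the comment. The method counts are the approximate values given there (~4000 and ~55k), so the results are rough: about 25 ms per method for jdk.localedata against about 1.5 ms per method for java.base, a factor of roughly 16. The class and method names are illustrative only.

```java
// Per-method compile cost derived from the figures quoted in the comment above.
final class PerMethodCost {
    /** Average compile time per method, in milliseconds. */
    static double perMethodMillis(double totalSeconds, int methodCount) {
        return totalSeconds * 1000.0 / methodCount;
    }

    public static void main(String[] args) {
        // ~4000 jdk.localedata methods compiled in 98.84 s (count is approximate).
        double localedata = perMethodMillis(98.84, 4_000);
        // ~55000 java.base methods compiled in 84 s (count is approximate).
        double javaBase = perMethodMillis(84.0, 55_000);

        System.out.printf("jdk.localedata: %.1f ms/method%n", localedata);
        System.out.printf("java.base:      %.1f ms/method%n", javaBase);
        System.out.printf("ratio:          %.1fx%n", localedata / javaBase);
    }
}
```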
        jcm Jamsheed C M added a comment - - edited
        Total time spent on a typical getContents compilation: ~5s
        Time spent in the different phases for a getContents method:

        |-> Summary
            |-> DuplicateGraph_Accm=246.9 ms
            |-> DuplicateGraph_Flat=246.9 ms
            |-> FrontEnd_Accm=2164.1 ms
            |-> FrontEnd_Flat=1.2 ms
            |-> NodeClass.Init.AllowedUsages_Accm=0.3 ms
            |-> NodeClass.Init.AllowedUsages_Flat=0.3 ms
            |-> NodeClass.Init.AnnotationParsing_Accm=6.6 ms
            |-> NodeClass.Init.AnnotationParsing_Flat=6.6 ms
            |-> NodeClass.Init.Data_Accm=0.2 ms
            |-> NodeClass.Init.Data_Flat=0.2 ms
            |-> NodeClass.Init.Edges_Accm=0.6 ms
            |-> NodeClass.Init.Edges_Flat=0.6 ms
            |-> NodeClass.Init.FieldScanning.Inner_Accm=0.4 ms
            |-> NodeClass.Init.FieldScanning.Inner_Flat=0.4 ms
            |-> NodeClass.Init.FieldScanning_Accm=6.4 ms
            |-> NodeClass.Init.FieldScanning_Flat=3.1 ms
            |-> NodeClass.Init.IterableIds_Accm=0.0 ms
            |-> NodeClass.Init.IterableIds_Flat=0.0 ms
            |-> PhaseTime_AddressLoweringPhase_Accm=53.1 ms
            |-> PhaseTime_AddressLoweringPhase_Flat=53.1 ms
            |-> PhaseTime_CanonicalizerPhase_Accm=100.0 ms
            |-> PhaseTime_CanonicalizerPhase_Flat=99.4 ms
            |-> PhaseTime_CanonicalizerPhase_Instance_Accm=644.8 ms
            |-> PhaseTime_CanonicalizerPhase_Instance_Flat=644.7 ms
            |-> PhaseTime_ConvertDeoptimizeToGuardPhase_Accm=0.2 ms
            |-> PhaseTime_ConvertDeoptimizeToGuardPhase_Flat=0.2 ms
            |-> PhaseTime_DeadCodeEliminationPhase_Accm=29.4 ms
            |-> PhaseTime_DeadCodeEliminationPhase_Flat=29.4 ms
            |-> PhaseTime_DeoptimizationGroupingPhase_Accm=0.8 ms
            |-> PhaseTime_DeoptimizationGroupingPhase_Flat=0.8 ms
            |-> PhaseTime_DominatorConditionalEliminationPhase_Accm=108.4 ms
            |-> PhaseTime_DominatorConditionalEliminationPhase_Flat=104.4 ms
            |-> PhaseTime_EarlyReadEliminationPhase_Accm=12.4 ms
            |-> PhaseTime_EarlyReadEliminationPhase_Flat=8.2 ms
            |-> PhaseTime_EliminateRedundantInitializationPhase_Accm=4.4 ms
            |-> PhaseTime_EliminateRedundantInitializationPhase_Flat=4.4 ms
            |-> PhaseTime_ExpandLogicPhase_Accm=0.0 ms
            |-> PhaseTime_ExpandLogicPhase_Flat=0.0 ms
            |-> PhaseTime_FloatingReadPhase_Accm=4.7 ms
            |-> PhaseTime_FloatingReadPhase_Flat=4.3 ms
            |-> PhaseTime_FrameStateAssignmentPhase_Accm=16.8 ms
            |-> PhaseTime_FrameStateAssignmentPhase_Flat=16.8 ms
            |-> PhaseTime_GraphBuilderPhase_Instance_Accm=60.4 ms
            |-> PhaseTime_GraphBuilderPhase_Instance_Flat=54.5 ms
            |-> PhaseTime_GuardLoweringPhase_Accm=13.2 ms
            |-> PhaseTime_GuardLoweringPhase_Flat=8.5 ms
            |-> PhaseTime_HighTier_Accm=631.0 ms
            |-> PhaseTime_HighTier_Flat=1.7 ms
            |-> PhaseTime_IncrementalCanonicalizerPhase_Accm=1397.1 ms
            |-> PhaseTime_IncrementalCanonicalizerPhase_Flat=3.2 ms
            |-> PhaseTime_InliningPhase_Accm=6.7 ms
            |-> PhaseTime_InliningPhase_Flat=6.7 ms
            |-> PhaseTime_IterativeConditionalEliminationPhase_Accm=108.8 ms
            |-> PhaseTime_IterativeConditionalEliminationPhase_Flat=0.4 ms
            |-> PhaseTime_LoadJavaMirrorWithKlassPhase_Accm=2.0 ms
            |-> PhaseTime_LoadJavaMirrorWithKlassPhase_Flat=2.0 ms
            |-> PhaseTime_LockEliminationPhase_Accm=0.0 ms
            |-> PhaseTime_LockEliminationPhase_Flat=0.0 ms
            |-> PhaseTime_LoopFullUnrollPhase_Accm=0.1 ms
            |-> PhaseTime_LoopFullUnrollPhase_Flat=0.1 ms
            |-> PhaseTime_LoopPeelingPhase_Accm=0.1 ms
            |-> PhaseTime_LoopPeelingPhase_Flat=0.1 ms
            |-> PhaseTime_LoopSafepointEliminationPhase_Accm=0.4 ms
            |-> PhaseTime_LoopSafepointEliminationPhase_Flat=0.4 ms
            |-> PhaseTime_LoopSafepointInsertionPhase_Accm=0.0 ms
            |-> PhaseTime_LoopSafepointInsertionPhase_Flat=0.0 ms
            |-> PhaseTime_LoopUnswitchingPhase_Accm=0.1 ms
            |-> PhaseTime_LoopUnswitchingPhase_Flat=0.1 ms
            |-> PhaseTime_LowTier_Accm=1224.2 ms
            |-> PhaseTime_LowTier_Flat=0.1 ms
            |-> PhaseTime_LoweringPhase_Accm=1380.5 ms
            |-> PhaseTime_LoweringPhase_Flat=2.8 ms
            |-> PhaseTime_LoweringPhase_Round_Accm=750.8 ms
            |-> PhaseTime_LoweringPhase_Round_Flat=188.5 ms
            |-> PhaseTime_MidTier_Accm=307.6 ms
            |-> PhaseTime_MidTier_Flat=0.3 ms
            |-> PhaseTime_OptimizeGuardAnchorsPhase_Accm=0.2 ms
            |-> PhaseTime_OptimizeGuardAnchorsPhase_Flat=0.2 ms
            |-> PhaseTime_PartialEscapePhase_Accm=354.2 ms
            |-> PhaseTime_PartialEscapePhase_Flat=298.7 ms
            |-> PhaseTime_PushThroughPiPhase_Accm=0.1 ms
            |-> PhaseTime_PushThroughPiPhase_Flat=0.1 ms
            |-> PhaseTime_ReassociateInvariantPhase_Accm=0.0 ms
            |-> PhaseTime_ReassociateInvariantPhase_Flat=0.0 ms
            |-> PhaseTime_RemoveValueProxyPhase_Accm=0.2 ms
            |-> PhaseTime_RemoveValueProxyPhase_Flat=0.2 ms
            |-> PhaseTime_ReplaceConstantNodesPhase_Accm=4.4 ms
            |-> PhaseTime_ReplaceConstantNodesPhase_Flat=4.0 ms
            |-> PhaseTime_SchedulePhase_Accm=228.4 ms
            |-> PhaseTime_SchedulePhase_Flat=228.4 ms
            |-> PhaseTime_UseTrappingNullChecksPhase_Accm=0.0 ms
            |-> PhaseTime_UseTrappingNullChecksPhase_Flat=0.0 ms
            |-> PhaseTime_ValueAnchorCleanupPhase_Accm=0.9 ms
            |-> PhaseTime_ValueAnchorCleanupPhase_Flat=0.9 ms
            |-> PhaseTime_WriteBarrierAdditionPhase_Accm=11.1 ms
            |-> PhaseTime_WriteBarrierAdditionPhase_Flat=10.5 ms
            |-> SnippetInstantiationTime[allocateArrayPIC]_Accm=228.0 ms
            |-> SnippetInstantiationTime[allocateArrayPIC]_Flat=3.6 ms
            |-> SnippetInstantiationTime[initializeKlass]_Accm=0.6 ms
            |-> SnippetInstantiationTime[initializeKlass]_Flat=0.0 ms
            |-> SnippetInstantiationTime[resolveObjectConstant]_Accm=68.1 ms
            |-> SnippetInstantiationTime[resolveObjectConstant]_Flat=22.8 ms
            |-> SnippetInstantiationTime[serialPreciseWriteBarrier]_Accm=60.2 ms
            |-> SnippetInstantiationTime[serialPreciseWriteBarrier]_Flat=3.2 ms
            |-> SnippetPreparationTime_Accm=70.5 ms
            |-> SnippetPreparationTime_Flat=2.9 ms
            |-> SnippetTemplateCreationTime_Accm=97.1 ms
            |-> SnippetTemplateCreationTime_Flat=14.6 ms
            |-> SnippetTemplateInstantiationTime[allocateArrayPIC(3=16, 4=2, 5=true, 6=r15, 7=true, 8=)]_Accm=224.3 ms
            |-> SnippetTemplateInstantiationTime[allocateArrayPIC(3=16, 4=2, 5=true, 6=r15, 7=true, 8=)]_Flat=52.0 ms
            |-> SnippetTemplateInstantiationTime[initializeKlass()]_Accm=0.6 ms
            |-> SnippetTemplateInstantiationTime[initializeKlass()]_Flat=0.2 ms
            |-> SnippetTemplateInstantiationTime[serialPreciseWriteBarrier()]_Accm=57.0 ms
            |-> SnippetTemplateInstantiationTime[serialPreciseWriteBarrier()]_Flat=31.4 ms
        </DebugValues>
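
One way to read a dump like the one above is to rank the `_Flat` entries, on the assumption that `_Flat` is the time spent in a timer itself and `_Accm` also includes nested timers. The sketch below is a hypothetical helper, not part of Graal or jaotc; the class name, record type, and regular expression are assumptions based on the line format shown. In the listing above, the five largest `_Flat` values are CanonicalizerPhase_Instance (644.7 ms), PartialEscapePhase (298.7 ms), DuplicateGraph (246.9 ms), SchedulePhase (228.4 ms), and LoweringPhase_Round (188.5 ms).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: ranks "<name>_Flat=<millis> ms" lines by time, largest first.
final class FlatTimeRanking {
    /** One parsed "_Flat" entry. */
    static final class Entry {
        final String name;
        final double millis;

        Entry(String name, double millis) {
            this.name = name;
            this.millis = millis;
        }
    }

    // Matches e.g. "|-> PhaseTime_SchedulePhase_Flat=228.4 ms" (format assumed from the dump).
    private static final Pattern FLAT = Pattern.compile("\\|->\\s+(.+)_Flat=([0-9.]+) ms");

    static List<Entry> rankByFlatTime(List<String> lines) {
        List<Entry> entries = new ArrayList<>();
        for (String line : lines) {
            Matcher m = FLAT.matcher(line);
            if (m.find()) { // "_Accm" lines and non-timer lines are skipped
                entries.add(new Entry(m.group(1), Double.parseDouble(m.group(2))));
            }
        }
        entries.sort(Comparator.comparingDouble((Entry e) -> e.millis).reversed());
        return entries;
    }

    public static void main(String[] args) {
        // A few lines copied from the dump above.
        List<String> sample = Arrays.asList(
            "    |-> PhaseTime_SchedulePhase_Accm=228.4 ms",
            "    |-> PhaseTime_SchedulePhase_Flat=228.4 ms",
            "    |-> PhaseTime_PartialEscapePhase_Flat=298.7 ms",
            "    |-> PhaseTime_CanonicalizerPhase_Instance_Flat=644.7 ms",
            "    |-> PhaseTime_LowTier_Flat=0.1 ms");
        for (Entry e : rankByFlatTime(sample)) {
            System.out.println(e.name + " " + e.millis + " ms");
        }
    }
}
```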
        dlong Dean Long added a comment -
        [~jcm] Rather than try to share ReferenceMaps like we discussed, it may be enough to "intern" the Location values. What if Location.subregister() and Location.stack() used a static java.util.Set to reuse values instead of returning a new Location each time?
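
The interning idea in the comment above can be sketched roughly as follows. This is not the JVMCI `Location` class: `InternedLocation`, its fields, and the `-1` encoding for stack slots are all hypothetical, chosen only to show the pattern. The sketch also uses a map rather than the `java.util.Set` mentioned in the comment, because a set can report that an equal value exists but does not hand back the canonical instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a location value; NOT the real JVMCI class.
final class InternedLocation {
    final int registerNumber; // -1 marks a stack slot (assumed encoding)
    final int offset;

    // Canonical instances, keyed by (registerNumber, offset) packed into a long.
    private static final Map<Long, InternedLocation> CACHE = new ConcurrentHashMap<>();

    private InternedLocation(int registerNumber, int offset) {
        this.registerNumber = registerNumber;
        this.offset = offset;
    }

    private static InternedLocation intern(int registerNumber, int offset) {
        long key = ((long) registerNumber << 32) | (offset & 0xFFFFFFFFL);
        // Create the value only on first request; later requests reuse it.
        return CACHE.computeIfAbsent(key, k -> new InternedLocation(registerNumber, offset));
    }

    /** Factory for a sub-register location; returns a shared instance. */
    static InternedLocation subregister(int registerNumber, int offset) {
        return intern(registerNumber, offset);
    }

    /** Factory for a stack-slot location; returns a shared instance. */
    static InternedLocation stack(int offset) {
        return intern(-1, offset);
    }

    public static void main(String[] args) {
        // Repeated requests for the same slot yield the same object.
        System.out.println(stack(16) == stack(16));
        System.out.println(subregister(3, 0) == subregister(3, 0));
    }
}
```

With factories like these, repeated requests for the same register or stack slot share one object instead of allocating a new one each time. Whether that is enough to address the compile time reported here is not established in this thread.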
        jcm Jamsheed C M added a comment -
        Yeah, that is even better. Thank you, Dean.
        jcm Jamsheed C M added a comment - - edited
        [~dlong] Please ignore that comment. It was an oversight; I had value types in mind.

          People

          • Assignee: jcm Jamsheed C M
          • Reporter: dpochepk Dmitrij Pochepko
          • Votes: 0
          • Watchers: 4