JDK-4168596

Exceeding MAX_LWP (defined in threads_md.c) crashes JVM


    Details

    • Type: Bug
    • Status: Closed
    • Priority: P3
    • Resolution: Not an Issue
    • Affects Version/s: 1.1.6
    • Fix Version/s: None
    • Component/s: hotspot
    • CPU: sparc
    • OS: solaris_2.6

      Description


      A customer uses a server engine, written in Java, to manage interactive
      sessions between hundreds of clients. Each client connection takes
      a thread. They need to be able to support thousands of users on their
      server. CPU load testing shows that this should be possible given a
      large enough server, but the JVM always crashes when the user count goes
      slightly over 1000. The customer has tracked the bug down to

      jdk-1.1.6-src/src/solaris/java/native_threads/src/threads_md.c

      where a parameter called MAX_LWPS is defined. When they exceed MAX_LWPS
      threads in their server, the next GC causes the system to die. Here
      is the customer's description:

      -------------------------------------------

      in my tests something bad (e.g. segfault or hard loop) always happens as
      soon as you GC with more than 1024 threads. for example, try unlimiting
      file descriptors (otherwise you'll die immediately) and then running it with
          jre -verbosegc -ss2k -oss2k -nojit Threads 46738 520
      (those jre options aren't necessary to provoke the bug; they just minimize
      the load on the system in getting there.)

      what's going on? well,

          jdk-1.1.6-src/src/solaris/java/native_threads/src/threads_md.c

      includes the following code:

          #define MAX_LWPS 1024

          static prstatus_t Mystatus;
          static id_t lwpid_list_buf[MAX_LWPS];
          static id_t oldlwpid_list_buf[MAX_LWPS];
          static sys_thread_t *onproct_list_buf[MAX_LWPS];

      (i.e. it declares a constant that sounds an awful lot like "max lightweight
      processes", and then sizes a bunch of tables). more discouraging, it never
      seems to check against this limit before assigning into those tables, so i
      suspect the answer is that the first time you GC after you get more than
      1024 LWPs you'll start smashing memory (these tables appear to be used for
      suspending the LWPs during garbage collection).

      ------------------------------------------------
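
      The customer's hypothesis above is that entries are stored into these
      MAX_LWPS-sized tables with no check against the limit. A minimal sketch
      of the kind of guard that would prevent such an overrun follows. It is
      not the code from threads_md.c: the lwp_table struct, the lwp_id typedef
      and lwp_table_add() are hypothetical names chosen for illustration, and
      only the value 1024 comes from the report.

```c
#include <stddef.h>

#define MAX_LWPS 1024   /* value quoted in the report; all other names are hypothetical */

typedef int lwp_id;     /* stand-in for the id_t element type shown in the report */

struct lwp_table {
    lwp_id ids[MAX_LWPS];
    size_t count;       /* number of valid entries in ids[] */
};

/*
 * Append one LWP id to the table.
 * Returns 0 on success, -1 if the table is already full.
 * An unchecked "t->ids[t->count++] = id" would write past the end of
 * ids[] on the 1025th call, which is the failure the customer suspects.
 */
int lwp_table_add(struct lwp_table *t, lwp_id id)
{
    if (t->count >= MAX_LWPS)
        return -1;
    t->ids[t->count++] = id;
    return 0;
}
```

      A caller that gets -1 back could grow the table or fail the suspend step
      cleanly instead of overwriting adjacent memory. For scale: the test
      program in this report starts two threads per loop iteration, so with a
      gcCount of 520 the first System.gc() runs with roughly 1040 threads
      started, just past 1024.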

      The customer also wrote a small test program which reproduces the problem
      without having to run their server.


      import java.io.*;
      import java.net.*;

      public class Threads extends Thread {

          private static int port;
          private static InetAddress localHost;

          private Socket s;   // null for a client thread, the accepted socket for a server thread

          private Threads (Socket s) {
              this.s = s;
          }

          public void run () {
              try {
                  if (s == null)
                      s = new Socket(localHost, port);
                  byte[] buf = new byte[4];
                  s.getInputStream().read(buf, 0, buf.length);   // blocks: neither side ever writes
              } catch (Exception e) {
                  e.printStackTrace();
              }
              System.err.print("!");
          }

          public static void main (String[] args) throws Exception {
              if (args.length != 2) {
                  System.err.println("use: threads <port> <gcCount>");
                  System.exit(1);
              }
              port = Integer.parseInt(args[0]);
              int gcCount = Integer.parseInt(args[1]);
              localHost = InetAddress.getLocalHost();
              ServerSocket ss = new ServerSocket(port);
              for (int i = 0; ; ++i) {
                  new Threads(null).start();
                  Socket s = ss.accept();
                  new Threads(s).start();
                  System.err.print("(" + i + ")");
                  System.err.flush();
                  Thread.sleep(5);
                  if (((i + 1) % gcCount) == 0)
                      System.gc();
              }
          }
      }


      steve.fritzinger@East 1998-08-24

      ============================================================================

      I got a libthread panic error running the test program on build H (transcript
      below), but this did NOT happen on build I. (It still ends up with a
      SocketException (Too many open files).)

      Maybe this was fixed in build I?

      %/usr/local/java/jdk1.3bak/solaris/bin/java -version
      java version "1.3.0"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-H)
      Java HotSpot (TM) Client VM (build 1.3-H, interpreted mode)
      %/usr/local/java/jdk1.3bak/solaris/bin/java -cp . Threads 9999 520
      (0)(1)(2)(3)(4)(5) ... (503)(504)(505)(506)(507)libthread panic: cannot create new lwp (PID: 21935 LWP 2)
      stacktrace:
              ef76ec70
              0
      Exception in thread "main" java.net.SocketException: Too many open files
              at java.net.PlainSocketImpl.socketAccept(Native Method)
              at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:417)
              at java.net.ServerSocket.implAccept(ServerSocket.java:245)
              at java.net.ServerSocket.accept(ServerSocket.java:226)
              at Threads.main(Threads.java:38)
      java.net.SocketException: Too many open files
              at java.net.PlainSocketImpl.socketCreate(Native Method)
              at java.net.PlainSocketImpl.create(PlainSocketImpl.java:74)
              at java.net.Socket.<init>(Socket.java:270)
              at java.net.Socket.<init>(Socket.java:131)
              at Threads.run(Threads.java:18)
      !^C%


      david.bowen@Eng 1999-09-29
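
      The build H run above ends in "Too many open files" rather than in a GC
      with more than 1024 threads. It stops after (507), that is after 508
      loop iterations, each of which opens two sockets (one on the client
      side, one accepted), or 1016 sockets in total. That is consistent with a
      per-process descriptor limit of about 1024, although the report does not
      state the limit that was in effect, and it matches the customer's advice
      to unlimit file descriptors first. A minimal sketch of raising the soft
      limit from C with the POSIX getrlimit() and setrlimit() calls follows;
      raise_fd_limit() is a hypothetical helper name, not something from the
      JDK sources.

```c
#include <sys/resource.h>

/*
 * Raise the soft limit on open file descriptors to the hard limit.
 * On success returns 0 and stores the resulting soft limit in *new_soft;
 * returns -1 if either system call fails.
 */
int raise_fd_limit(rlim_t *new_soft)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;

    /* The soft limit can never exceed the hard limit, so "!=" means "lower". */
    if (rl.rlim_cur != rl.rlim_max) {
        rl.rlim_cur = rl.rlim_max;
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
            return -1;
    }

    *new_soft = rl.rlim_cur;
    return 0;
}
```

      From a shell, csh's "unlimit descriptors" or sh's "ulimit -n <n>" before
      starting the JVM has the same effect for that session. Without it, the
      test is likely to run out of descriptors before it reaches a GC with
      more than 1024 threads, so the MAX_LWPS path may never be exercised.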

    People

    • Assignee: Y. Ramakrishna (ysr)
    • Reporter: J. Duke (duke, Inactive)