Details

    • Author:
      Mark Reinhold
    • JEP Type:
      Feature
    • Exposure:
      Open
    • Scope:
      Implementation
    • Discussion:
      jigsaw dash dev at openjdk dot java dot net
    • Effort:
      L
    • Duration:
      L
    • Alert Status:
       Green
    • JEP Number:
      201

      Summary

      Reorganize the JDK source code into modules, enhance the build system to compile modules, and enforce module boundaries at build time.

      Non-Goals

      This JEP will not change the structure of the JRE and JDK binary images, nor will it introduce a module system. That work will be covered by related JEPs and, where appropriate, JSRs.

      This JEP will define a new source-code layout for the JDK. This layout may be used outside of the JDK, but it is not a goal of this JEP to design a broadly-accepted universal modular source-code layout.

      Motivation

      Project Jigsaw aims to design and implement a standard module system for the Java SE Platform and to apply that system to the Platform itself, and to the JDK. Its primary goals are to make implementations of the Platform more easily scalable down to small devices, improve security and maintainability, enable improved application performance, and provide developers with better tools for programming in the large.

      This JEP is part of the first phase of Project Jigsaw; later JEPs will modularize the JRE and JDK images (JEP 220) and then introduce a module system (JEP 261).

      The motivations to reorganize the source code at this early stage are to:

      1. Give JDK developers the opportunity to become familiar with the modular structure of the system;

      2. Preserve that structure going forward by enforcing module boundaries in the build, even prior to the introduction of a module system; and

      3. Enable further development of Project Jigsaw to proceed without always having to "shuffle" the present non-modular source code into modular form.

      Description

      Current scheme

      Most of the JDK source code is today organized, roughly, in a scheme that dates back to 1997. In abbreviated form:

      src/{share,$OS}/{classes,native}/$PACKAGE/*.{java,c,h,cpp,hpp}

      where:

      • The share directory contains shared, cross-platform code;

      • The $OS directory contains operating-system-specific code, where $OS is one of solaris, windows, etc.;

      • The classes directory contains Java source files, and possibly resource files;

      • The native directory contains C or C++ source files; and

      • $PACKAGE is the relevant Java API package name, with periods replaced by slashes.

      To take a simple example, the source code for the java.lang.Object class in the jdk repository resides in two files, one in Java and the other in C:

      src/share/classes/java/lang/Object.java
                native/java/lang/Object.c

      For a less trivial example, the source code for the package-private java.lang.ProcessImpl and ProcessEnvironment classes is operating-system-specific; for Unix-like systems it resides in three files:

      src/solaris/classes/java/lang/ProcessImpl.java
                                    ProcessEnvironment.java
                  native/java/lang/ProcessEnvironment_md.c

      (Yes, the second-level directory is named solaris even though this code is relevant to all Unix derivatives; more on this below.)

      There are a handful of directories under src/{share,$OS} that don't match the current structure, including:

      Directory                     Content
      --------------------------    --------------------------
      src/{share,$OS}/back          JDWP back end
                      bin           Java launcher
                      instrument    Instrumentation support
                      javavm        Exported JVM include files
                      lib           Files for $JAVA_HOME/lib
                      transport     JDWP transports

      New scheme

      The modularization of the JDK presents a rare opportunity to completely restructure the source code in order to make it easier to maintain. We propose to implement the following scheme in every repository in the JDK forest except for hotspot. In abbreviated form:

      src/$MODULE/{share,$OS}/classes/$PACKAGE/*.java
                              native/include/*.{h,hpp}
                                     $LIBRARY/*.{c,cpp}
                              conf/*

      where:

      • $MODULE is a module name (e.g., java.base);

      • The share directory contains shared, cross-platform code, as before;

      • The $OS directory contains operating-system-specific code, as before, where $OS is one of unix, windows, etc.;

      • The classes directory contains Java source files and resource files organized into a directory tree reflecting their API $PACKAGE hierarchy, as before;

      • The native directory contains C or C++ source files, as before but organized differently:

        • The include directory contains C or C++ header files intended to be exported for external use (e.g., jni.h);

        • C or C++ source files are placed in a $LIBRARY directory, whose name is that of the shared library or DLL into which the compiled code will be linked (e.g., libjava or libawt); and, finally,

      • The conf directory contains configuration files meant to be edited by end users (e.g., net.properties).
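
      As an illustration only, the layout rules above can be expressed as two small path-building helpers. The function names and parameters are hypothetical and are not part of the JDK build; the sketch simply assembles the path components in the order the scheme lists them.

```python
from pathlib import PurePosixPath


def java_source_path(module, variant, class_name):
    """Return the new-scheme path of a Java source file.

    variant is "share" or an $OS name such as "unix" or "windows".
    """
    package, _, simple_name = class_name.rpartition(".")
    return PurePosixPath("src", module, variant, "classes",
                         *package.split("."), simple_name + ".java")


def native_source_path(module, variant, library, file_name):
    """Return the new-scheme path of a C or C++ file linked into library."""
    return PurePosixPath("src", module, variant, "native", library, file_name)


print(java_source_path("java.base", "share", "java.lang.Object"))
# src/java.base/share/classes/java/lang/Object.java
print(native_source_path("java.base", "unix", "libjava", "ProcessEnvironment_md.c"))
# src/java.base/unix/native/libjava/ProcessEnvironment_md.c
```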

      To recast the previous examples, the source code for the java.lang.Object class will be laid out as follows:

      src/java.base/share/classes/java/lang/Object.java
                          native/libjava/Object.c

      The source code for the package-private java.lang.ProcessImpl and ProcessEnvironment classes will be laid out this way:

      src/java.base/unix/classes/java/lang/ProcessImpl.java
                                           ProcessEnvironment.java
                         native/libjava/ProcessEnvironment_md.c

      (We take the opportunity here, finally, to rename the solaris directory to unix.)

      The content of the directories currently under src/{share,$OS} that don't match the current structure will be moved into appropriate modules:

      Directory                     Module
      --------------------------    --------------------------
      src/{share,$OS}/back          jdk.jdwp.agent
                      bin           java.base
                      instrument    java.instrument
                      javavm        java.base
                      lib           $MODULE/{share,$OS}/conf
                      transport     jdk.jdwp.agent

      Files in the current lib directory that are not intended to be edited by end users will be converted into resource files.

      Build-system changes

      The build system will be modified to compile one module at a time rather than one repository at a time, and it will compile modules according to a reverse topological sort of the module graph. Modules that do not depend on each other, directly or indirectly, will be compiled concurrently when possible.
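
      To make the ordering concrete, here is a minimal sketch of how such a compile order could be computed. It is not the build system's actual implementation (the build is not written in Python). The module graph below is a small hypothetical subset, and compile_waves is an invented name. Each "wave" contains modules whose dependencies have all been compiled, so the modules within one wave could be compiled concurrently.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical subset of a module graph: each module maps to the
# modules it depends on.
DEPENDS_ON = {
    "java.base": set(),
    "java.logging": {"java.base"},
    "java.xml": {"java.base"},
    "java.sql": {"java.logging", "java.xml"},
}


def compile_waves(depends_on):
    """Group modules into waves, dependencies first.

    Modules in the same wave do not depend on each other, directly or
    indirectly, so they may be compiled concurrently.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


for wave in compile_waves(DEPENDS_ON):
    print(wave)
# ['java.base']
# ['java.logging', 'java.xml']
# ['java.sql']
```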

      A side benefit of compiling modules rather than repositories is that code in the corba, jaxp, and jaxws repositories will be able to make use of new Java language features and APIs. This was previously forbidden, since those repositories were compiled before the jdk repository.

      The compiled classes in an intermediate (i.e., non-image) build will be divided into modules. Where today we have:

       jdk/classes/*.class

      the revised build system will produce:

       jdk/modules/$MODULE/*.class

      The structure of image builds, as noted, will not change; there will be very minor differences in their content.

      Module boundaries will be enforced at build time, insofar as possible, by the build system. If a module boundary is violated then the build will fail. The boundaries will be defined in the modules.xml file described in JEP 200, which will be maintained alongside the source code. Changes to this file will require review by Committers to Project Jigsaw.
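
      The kind of check involved can be sketched as follows. This is an assumption-laden illustration, not the build system's mechanism: the two tables are hypothetical in-memory data and do not reflect the format of modules.xml, and boundary_violations is an invented name. The idea is only that a reference from one module to a package owned by a module it does not depend on counts as a violation.

```python
# Hypothetical data: which module owns each package, and which modules
# each module is allowed to use. This is NOT the modules.xml schema.
PACKAGE_OWNER = {
    "java.lang": "java.base",
    "java.util": "java.base",
    "java.util.logging": "java.logging",
    "java.sql": "java.sql",
}
MAY_USE = {
    "java.base": set(),
    "java.logging": {"java.base"},
    "java.sql": {"java.base", "java.logging"},
}


def boundary_violations(module, referenced_packages):
    """Return the referenced packages that module is not allowed to use."""
    allowed = MAY_USE[module] | {module}
    return sorted(p for p in referenced_packages
                  if PACKAGE_OWNER[p] not in allowed)


violations = boundary_violations("java.base", ["java.lang", "java.util.logging"])
if violations:
    print("build would fail:", violations)
# build would fail: ['java.util.logging']
```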

      Alternatives

      There are numerous other possible source-layout schemes, including:

      1. Keep {share,$OS} at the top, with a modules directory to contain module class files:

         src/{share,$OS}/modules/$MODULE/$PACKAGE/*.java
                         native/include/*.{h,hpp}
                                $LIBRARY/*.{c,cpp}
                         conf/*
      2. Put everything under the appropriate $MODULE directory, but keep {share,$OS} at the top:

         src/{share,$OS}/$MODULE/classes/$PACKAGE/*.java
                                 native/include/*.{h,hpp}
                                        $LIBRARY/*.{c,cpp}
                                 conf/*
      3. Push {share,$OS} down into the $MODULE directories, as in the present proposal, but remove the intermediate classes directory and prefix the names of the native and conf directories with an underscore, all so as to simplify the common case of pure Java modules:

         src/$MODULE/{share,$OS}/$PACKAGE/*.java
                                 _native/include/*.{h,hpp}
                                         $LIBRARY/*.{c,cpp}
                                 _conf/*
      4. A variant of scheme 3, but with {share,$OS} at the top:

         src/{share,$OS}/$MODULE/$PACKAGE/*.java
                                 _native/include/*.{h,hpp}
                                         $LIBRARY/*.{c,cpp}
                                 _conf/*
      5. Another variant of scheme 3, pushing {share,$OS} deeper down so as to further simplify the case of pure Java modules with no $OS-specific code:

         src/$MODULE/$PACKAGE/*.java
                     _native/include/*.{h,hpp}
                             $LIBRARY/*.{c,cpp}
                     _conf/*
                     _$OS/$PACKAGE/*.java
                          _native/include/*.{h,hpp}
                                  $LIBRARY/*.{c,cpp}
                          _conf/*

      We rejected the schemes involving underscores (3–5) as too unfamiliar and difficult to navigate. We prefer the present proposal over schemes 1 and 2 because it entails the least change from the current scheme while placing all of the source code for a module under a single directory. Tools and scripts that depend upon the current scheme must be revised, but at least for Java source code the structure underneath each $MODULE directory is the same as before.

      Additional issues which we considered:

      • Should we define distinct directories for resource files, so that they would be separate from Java source files? — No; this does not seem worth the trouble.

      • Some modules have content that spans repositories; is this a problem? — It's an annoyance, but the build system can cope with it via the magic of the VPATH mechanism. Over time we might restructure the repositories to reduce or even eliminate cross-repo modules, but that's beyond the scope of this JEP.

      • Some modules have multiple native libraries; should we merge them so that each module has at most one native library? — No; in some cases we need the flexibility of multiple native libraries per module, e.g., for "headless" vs. "headful" AWT.

      Testing

      As stated, this JEP will not change the structure of the JRE and JDK binary images, and will make only minor changes to the content. We can therefore validate this change by comparing images built with it against images built without it, and running tests to validate the actual minor changes.
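
      One possible way to carry out such a comparison is sketched below. It assumes that a byte-for-byte comparison of file contents is meaningful for the images in question, which may not hold if build outputs embed timestamps. The function names are hypothetical, and the temporary directories stand in for two real image directories.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root):
    """Map each file's path relative to root to the SHA-256 of its content."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*") if path.is_file()
    }


def compare_images(old_root, new_root):
    """Return (only_in_old, only_in_new, changed) as sorted lists of paths."""
    old, new = tree_digest(old_root), tree_digest(new_root)
    only_old = sorted(old.keys() - new.keys())
    only_new = sorted(new.keys() - old.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return only_old, only_new, changed


# Tiny demonstration with two throwaway directory trees.
with tempfile.TemporaryDirectory() as tmp:
    old_image, new_image = Path(tmp, "old"), Path(tmp, "new")
    for root, text in ((old_image, "a"), (new_image, "b")):
        (root / "lib").mkdir(parents=True)
        (root / "lib" / "x.properties").write_text(text)
    (new_image / "release").write_text("r")
    result = compare_images(old_image, new_image)

print(result)
# ([], ['release'], ['lib/x.properties'])
```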

      Risks and Assumptions

      We assume that Mercurial will be able to handle the massive number of file-rename operations that will be done to implement this change, and to preserve all historical information in the process. Early testing has shown Mercurial to be capable of this, but there is still a minor risk that the relationship between the new and old locations of a file will not properly be recorded. In that case the history of the file in its old location will still be in the repository; it will just be more difficult to find.

      It will be impossible to apply a patch created against a repository using the old scheme directly to a repository using the new scheme, and vice versa. To mitigate this we plan to develop a script to translate the file names in a patch from their old locations to their new locations.
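
      The core of such a translation is a mapping from old-scheme paths to new-scheme paths. The sketch below shows one way that mapping might look. It is not the planned script: the lookup tables are hypothetical and cover only the examples used in this JEP, and it assumes for simplicity that a native file's library can be determined from its package directory.

```python
import re

# Hypothetical lookup tables; a real script would need the complete
# package-to-module and native-file-to-library assignments.
PACKAGE_MODULE = {"java/lang": "java.base"}
PACKAGE_LIBRARY = {"java/lang": "libjava"}  # simplifying assumption
OS_RENAMES = {"solaris": "unix"}

OLD_PATH = re.compile(
    r"^src/(?P<variant>[^/]+)/(?P<kind>classes|native)/"
    r"(?P<package>.+)/(?P<file>[^/]+)$"
)


def translate(old_path):
    """Translate an old-scheme path to the new scheme, or return None
    if the path is not recognized."""
    match = OLD_PATH.match(old_path)
    if match is None or match["package"] not in PACKAGE_MODULE:
        return None
    variant = OS_RENAMES.get(match["variant"], match["variant"])
    module = PACKAGE_MODULE[match["package"]]
    if match["kind"] == "classes":
        tail = f"classes/{match['package']}/{match['file']}"
    else:
        tail = f"native/{PACKAGE_LIBRARY[match['package']]}/{match['file']}"
    return f"src/{module}/{variant}/{tail}"


print(translate("src/share/classes/java/lang/Object.java"))
# src/java.base/share/classes/java/lang/Object.java
print(translate("src/solaris/native/java/lang/ProcessEnvironment_md.c"))
# src/java.base/unix/native/libjava/ProcessEnvironment_md.c
```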

      Dependences

      This JEP is the second of several JEPs for Project Jigsaw. It incorporates the definition of the modular structure of the JDK from JEP 200, but it does not explicitly depend upon that JEP.

          Activity

          mduigou Michael Duigou added a comment -
          Under non-goals it should be mentioned that the jtreg test source code will not be moving/reorganized/modularized as part of this JEP.

            People

            • Assignee:
              alanb Alan Bateman
              Reporter:
              mr Mark Reinhold
              Owner:
              Alan Bateman
              Reviewed By:
              Alan Bateman, Alex Buckley, Mandy Chung, Paul Sandoz
              Endorsed By:
              Brian Goetz