JDK-8167368

JEP 296: Consolidate the JDK Forest into a Single Repository

    Details

    • Author:
      Joseph D. Darcy
    • JEP Type:
      Infrastructure
    • Exposure:
      Open
    • Subcomponent:
    • Scope:
      Implementation
    • Discussion:
      jdk9 dash dev at openjdk dot java dot net
    • Effort:
      M
    • Duration:
      M
    • JEP Number:
      296

      Description

      Summary

      Combine the numerous repositories of the JDK forest into a single repository in order to simplify and streamline development.

      Non-Goals

      Adding the FX sources to the JDK forest is not part of the proposal.

      Motivation

      For many years, the full code base of the JDK has been broken into numerous Mercurial repositories. In JDK 9 there are eight repos: root, corba, hotspot, jaxp, jaxws, jdk, langtools, and nashorn.

      While this model of multiple repos offers some advantages, it also has many downsides and does a poor job of supporting various desirable source-code management operations. In particular, it is not possible to perform an atomic commit of inter-dependent changesets across repositories. For example, if the code for a single bug fix or RFE spans both the jdk and hotspot repos today, the changes to the two repositories cannot be made atomically in the forest hosting them. Changes spanning multiple repos are a common occurrence; over 1,100 bug ids have been reused across repositories in the JDK forest. That count is only a lower bound on the number of logically repo-crossing bugs, since some engineers use separate bug ids to push to different repos.

      This mismatch between the divisions of the Mercurial repos and the unity of the engineering work dilutes one of the main benefits of modern source-code management: tracking changes to sets of files rather than just individual files. As a corollary, the mismatch between SCM transactions and logical transactions complicates the use of tools such as Mercurial bisect.

      The individual repos don't have a development cycle separate from the JDK as a whole; all the repos advance in lockstep with the JDK promotion cycle. The multiplicity of repos presents a larger-than-necessary barrier to entry for new developers and has led to workarounds such as the "get source" script.

      Description

      To address these issues, a prototype of a consolidated forest has been developed. The prototype is available at:

      http://hg.openjdk.java.net/jdk10/consol-proto/

      Some of the supporting conversion scripts used to create the prototype are attached as unify.zip.

      In the prototype, the eight repositories have been combined into a single repository using an automated conversion script that preserves history on a per-file level, with the consolidated forest being synchronized at the tags used to mark JDK promotions. The changeset comments and creation dates are also preserved.

      The prototype has another level of code reorganization. In the consolidated forest, code for Java modules is generally combined under a single top-level src directory. For example, today in the JDK forest there are module-based directories like

      $ROOT/jdk/src/java.base
      ...
      $ROOT/langtools/src/java.compiler
      ...

      In the consolidated forest, this code is instead organized as

      $ROOT/src/java.base
      $ROOT/src/java.compiler
      ...

      As a consequence, the path of a module's source file relative to the root of its repository is preserved after the consolidation and src directory combination.
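
      The src directory combination described above can be sketched as a path rewrite. This is a minimal illustration only, not the conversion script from unify.zip: the function name consolidate_src_path and the sample file names are hypothetical, and it assumes each module directory sits directly under <repo>/src/ in the split forest.

```python
from pathlib import PurePosixPath


def consolidate_src_path(split_path: str) -> str:
    """Map <repo>/src/<module>/... (relative to $ROOT) to src/<module>/...

    Hypothetical helper for illustration; paths use '/' separators.
    """
    parts = PurePosixPath(split_path).parts
    # Expect at least <repo>/src/<module>.
    if len(parts) < 3 or parts[1] != "src":
        raise ValueError(f"not a module source path: {split_path}")
    # Drop the leading repo name; everything from 'src' onward is kept,
    # so the path relative to the old repo root is unchanged.
    return str(PurePosixPath(*parts[1:]))
```

      For instance, under these assumptions consolidate_src_path("jdk/src/java.base") yields "src/java.base".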

      An analogous but less aggressive reorganization is done for the test directories to go from

      $ROOT/jdk/test/Foo.java
      $ROOT/langtools/test/Bar.java

      to

      $ROOT/test/jdk/Foo.java
      $ROOT/test/langtools/Bar.java
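
      The test directory move shown above differs from the src case in that the repo name is kept and pushed below a shared top-level test directory. A minimal sketch, assuming paths relative to $ROOT with '/' separators; the function name consolidate_test_path is hypothetical and is not taken from the conversion scripts:

```python
from pathlib import PurePosixPath


def consolidate_test_path(split_path: str) -> str:
    """Map <repo>/test/... (relative to $ROOT) to test/<repo>/...

    Hypothetical helper for illustration only.
    """
    parts = PurePosixPath(split_path).parts
    # Expect at least <repo>/test.
    if len(parts) < 2 or parts[1] != "test":
        raise ValueError(f"not a test path: {split_path}")
    repo, rest = parts[0], parts[2:]
    # Swap the first two components: test/ comes first, then the repo name.
    return str(PurePosixPath("test", repo, *rest))
```

      With this sketch, "jdk/test/Foo.java" maps to "test/jdk/Foo.java" and "langtools/test/Bar.java" maps to "test/langtools/Bar.java", matching the listing above.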

      Since the effort is currently a prototype, not all portions of it are entirely complete and the fit and finish can be improved in some areas. The HotSpot C/C++ sources are moved to the shared src directory alongside the modularized Java code.

      While the regression tests will run with the current state of the prototype, further consolidations of the jtreg configuration files are possible and may be done in the future.

      Alternatives

      One alternative is to simply stay with the current set of repositories. The history of some or all of the repositories could have been dropped when moving to a single repository, but that was rejected. Consolidating a core subset of the repositories was considered, but rejected in favor of the simplicity of a single repository.

      Testing

      To validate the file contents, for each promotion tag a script was used to verify that the contents of the split forest at that tag matched the contents of the consolidated repository at that tag. For a recent JDK 9 tag, builds of the split forest and consolidated forest at the same tag were compared; there were only minor and explainable differences.
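
      A per-tag content check of this kind might look roughly like the following. This is a sketch of the general idea, not the script that was actually used: it assumes both forests have been exported at the same tag to plain directories without .hg metadata, and tree_digests, compare_trees, and the map_path callable are all hypothetical names.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Return {relative POSIX path: SHA-256 hex digest} for each file under root.

    Assumes root is a plain export (no .hg metadata directories).
    """
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(split, consolidated, map_path):
    """Return sorted consolidated-style paths that are missing or differ.

    split, consolidated: digest dicts as produced by tree_digests.
    map_path: hypothetical callable translating a split-forest path
    into the corresponding consolidated path.
    """
    expected = {map_path(p): d for p, d in split.items()}
    problems = set()
    for path in expected.keys() | consolidated.keys():
        # A path present on only one side compares as None vs. a digest.
        if expected.get(path) != consolidated.get(path):
            problems.add(path)
    return sorted(problems)
```

      An empty result from compare_trees would indicate that, under the chosen path mapping, every file has identical contents in both trees.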

      Risks and Assumptions

      The testing described above should mitigate the most serious risks of file corruption and faulty builds. While the major portions of the needed work for the consolidation are complete in the prototype, various smaller supporting features may not be finished before the consolidation is put into production. The pre- and post-consolidation code bases are not related in a Mercurial sense. Diffs (with suitably massaged paths) will have to be used for forward- and back-ports as opposed to exporting and importing changesets.
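
      As an illustration of what massaging the paths of a diff could involve, the sketch below rewrites the file headers of a git-style patch (for example, one produced with hg diff --git) and leaves hunk contents untouched. It is an assumption-laden example, not a supported tool: rewrite_patch_paths and map_path are hypothetical names, paths are assumed to contain no spaces, and rename/copy header lines are not handled.

```python
import re

# Matches the per-file header of a git-style diff (paths without spaces).
_GIT_HEADER = re.compile(r"^diff --git a/(\S+) b/(\S+)$")


def rewrite_patch_paths(patch_text, map_path):
    """Rewrite file paths in git-style diff headers using map_path.

    map_path is a hypothetical callable mapping an old repo-relative
    path to the new one. Lines inside hunks are never modified.
    """
    out = []
    in_header = False  # True between a 'diff --git' line and the first '@@'
    for line in patch_text.splitlines():
        m = _GIT_HEADER.match(line)
        if m:
            in_header = True
            line = f"diff --git a/{map_path(m.group(1))} b/{map_path(m.group(2))}"
        elif line.startswith("@@"):
            in_header = False
        elif in_header and line.startswith("--- a/"):
            line = "--- a/" + map_path(line[len("--- a/"):])
        elif in_header and line.startswith("+++ b/"):
            line = "+++ b/" + map_path(line[len("+++ b/"):])
        out.append(line)
    # Preserve the trailing newline if the input had one.
    return "\n".join(out) + ("\n" if patch_text.endswith("\n") else "")
```

      For a patch taken from the old jdk repo, a map_path that turns test/... into test/jdk/... and leaves other paths alone would follow the directory layout described earlier; other repos may need different rules.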

          Activity

          ehelin Erik Helin added a comment -
          Attached the scripts used for consolidating a forest.
          simonis Volker Simonis added a comment -
          This is a relatively big change so I wonder if it was considered to not only consolidate the JDK forest but instead change the version control system from Mercurial to Git. I understand that such a change is even more disruptive and it would require a lot of infrastructure work on Oracle side paired with the support of two VCSs for quite some time. Nevertheless Git is nowadays the standard open-source, distributed revision control system. It has a lot more traction and support compared to Mercurial and it may simply solve some of the currently visible problems like for example performance for big repos.

          I think there will be probably no better opportunity for such a change in the foreseeable future. I know Joe has posted on the discussion thread for this JEP that "a git migration is well outside the scope of this project" but I wonder if such a change has already been discussed or evaluated?

          As a matter of fact, with Graal at least one major OpenJDK project is already using Git as version control system (https://github.com/graalvm/graal-core) so there must be already some experience and evidence about the pros and cons of the two systems.
          ihse Magnus Ihse Bursie added a comment -
          [~simonis], any kind of transition to git would be effectively hampered by the current forest structure. If, at any future point, we should consider doing a transition to git, having a single repo will be a prerequisite. Trust me, the consolidation's gonna be disruptive enough, so that you wouldn't want to change all tooling at the same time as well.
          darcy Joe Darcy added a comment -
          The three main lines of development for JDK 10, master, client, and hotspot, were switched to consolidated repos in September 2017:

              http://mail.openjdk.java.net/pipermail/jdk10-dev/2017-September/000499.html

          A few small issues associated with the consolidation continue to be filed and fixed. Since development has been switched to the consolidated repo, marking this JEP as integrated.

            People

            • Assignee:
              darcy Joe Darcy
              Reporter:
              darcy Joe Darcy
              Owner:
              Joe Darcy
              Reviewed By:
              Brian Goetz, Mikael Vidstedt
              Endorsed By:
              Mark Reinhold