Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6982173

(coll) Performance anomalies in AbstractSet.removeAll

    Details

    • Type: Bug
    • Status: Open
    • Priority: P4
    • Resolution: Unresolved
    • Affects Version/s: 7
    • Fix Version/s: None
    • Component/s: core-libs
    • Labels:

      Description

      As written about in
       http://msmvps.com/blogs/jon_skeet/archive/2010/07/29/there-s-a-hole-in-my-abstraction-dear-liza-dear-liza.aspx

      there can be performance anomolies in AbstractSet.removeAll under some conditions; quoting:

      " This implementation [of AbstractSet.removeAll] determines which is the smaller of this set and the specified collection, by invoking the size method on each. If this set has fewer elements, then the implementation iterates over this set, checking each element returned by the iterator in turn to see if it is contained in the specified collection. If it is so contained, it is removed from this set with the iterator's remove method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set each element returned by the iterator, using this set's remove method.

      Now that sounds reasonable on the surface of it - iterate through the smaller collection, check for the presence in the bigger collection. However, this is where the abstraction is leaky. Just because we can ask for the presence of an item in a large collection doesn't mean it's going to be fast. In our case, the collections are the same size - but checking for the presence of an item in the HashSet is O(1) whereas checking in the ArrayList is O(N)... whereas the cost of iterating is going to be the same for each collection. Basically by choosing to iterate over the HashSet and check for presence in the ArrayList, we've got an O(M * N) solution overall instead of an O(N) solution. Ouch. The removeAll method is making an "optimization" based on assumptions which just aren't valid in this case."

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                michaelm Michael McMahon
                Reporter:
                darcy Joe Darcy
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Imported:
                  Indexed: