Details

Type: Bug

Status: Resolved

Priority: P4

Resolution: Duplicate

Affects Version/s: 8, 9

Fix Version/s: None

Component/s: corelibs

Labels:

Subcomponent:

CPU:generic

OS:generic
Description
FULL PRODUCT VERSION :
java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77b03)
Java HotSpot(TM) 64Bit Server VM (build 25.77b03, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 10.0.10586]
A DESCRIPTION OF THE PROBLEM :
The attached program should produce a stream of the cartesian product of several given sets. (In the example program, the "sets" are sets of digits, and the "products" are numbers composed of those digits.) Instead, it buffers the entire cartesian product before any processing begins, thereby consuming an unlimited amount of memory, causing OutOfMemoryError in practical applications with large sets.
The documentation does not explicitly state that flatMap will use a bounded amount of memory if it can, but it does not seem to be in the spirit of streaming, to buffer the entire stream before processing any of it. The Workaround section below shows an alternative implementation using LongStream.concat, which actually *does* explicitly state that it "creates a lazily concatenated stream"; yet even that solution eagerly evaluates all but one of the dimensions of the cartesian product.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the attached program and observe that the complete cartesian product is constructed and buffered (as shown by the "peek" code that prints all generated values) before any processing begins.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED 
+ 1
+ 11
+ 111
First is 111
ACTUAL 
+ 1
+ 11
+ 111
+ 112
+ 113
+ 12
+ 121
+ 122
+ 123
+ 13
+ 131
+ 132
+ 133
+ 2
+ 21
+ 211
+ 212
+ 213
+ 22
+ 221
+ 222
+ 223
+ 23
+ 231
+ 232
+ 233
+ 3
+ 31
+ 311
+ 312
+ 313
+ 32
+ 321
+ 322
+ 323
+ 33
+ 331
+ 332
+ 333
First is 111
REPRODUCIBILITY :
This bug can be reproduced always.
 BEGIN SOURCE 
package org.vena.qb;
import java.util.stream.LongStream;
public class BugReport {
static final int NUM_DIMENSIONS = 3;
static final int TOP_DIGIT = 3;
public static void main(String[] args) {
LongStream permutations = LongStream.of(0L);
for (int dimension = 0; dimension < NUM_DIMENSIONS; dimension++) {
permutations = permutations.flatMap(tens > LongStream
.rangeClosed(1, TOP_DIGIT).map(ones> 10*tens + ones)
.peek(v>System.out.println("+ "+v)));
}
long first = permutations.findFirst().getAsLong();
System.out.println("First is " + first);
}
}
 END SOURCE 
CUSTOMER SUBMITTED WORKAROUND :
I've been unable to find a workaround involving streams. I have a suitable workaround using nonstream code, which I then wrap in a stream, so I've listed this as No Impact.
The closest I got to a streambased solution was this implementation, but it still buffers all but one dimension:
for (int dimension = 0; dimension < NUM_DIMENSIONS; dimension++) {
permutations = permutations
.mapToObj(tens > LongStream
.rangeClosed(1, TOP_DIGIT)
.map(ones > 10*tens + ones)
.peek(n>System.err.println("+ " + n)))
.reduce(LongStream.empty(), (a,b)>LongStream.concat(a, b));
}
java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77b03)
Java HotSpot(TM) 64Bit Server VM (build 25.77b03, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 10.0.10586]
A DESCRIPTION OF THE PROBLEM :
The attached program should produce a stream of the cartesian product of several given sets. (In the example program, the "sets" are sets of digits, and the "products" are numbers composed of those digits.) Instead, it buffers the entire cartesian product before any processing begins, thereby consuming an unlimited amount of memory, causing OutOfMemoryError in practical applications with large sets.
The documentation does not explicitly state that flatMap will use a bounded amount of memory if it can, but it does not seem to be in the spirit of streaming, to buffer the entire stream before processing any of it. The Workaround section below shows an alternative implementation using LongStream.concat, which actually *does* explicitly state that it "creates a lazily concatenated stream"; yet even that solution eagerly evaluates all but one of the dimensions of the cartesian product.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the attached program and observe that the complete cartesian product is constructed and buffered (as shown by the "peek" code that prints all generated values) before any processing begins.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED 
+ 1
+ 11
+ 111
First is 111
ACTUAL 
+ 1
+ 11
+ 111
+ 112
+ 113
+ 12
+ 121
+ 122
+ 123
+ 13
+ 131
+ 132
+ 133
+ 2
+ 21
+ 211
+ 212
+ 213
+ 22
+ 221
+ 222
+ 223
+ 23
+ 231
+ 232
+ 233
+ 3
+ 31
+ 311
+ 312
+ 313
+ 32
+ 321
+ 322
+ 323
+ 33
+ 331
+ 332
+ 333
First is 111
REPRODUCIBILITY :
This bug can be reproduced always.
 BEGIN SOURCE 
package org.vena.qb;
import java.util.stream.LongStream;
public class BugReport {
static final int NUM_DIMENSIONS = 3;
static final int TOP_DIGIT = 3;
public static void main(String[] args) {
LongStream permutations = LongStream.of(0L);
for (int dimension = 0; dimension < NUM_DIMENSIONS; dimension++) {
permutations = permutations.flatMap(tens > LongStream
.rangeClosed(1, TOP_DIGIT).map(ones> 10*tens + ones)
.peek(v>System.out.println("+ "+v)));
}
long first = permutations.findFirst().getAsLong();
System.out.println("First is " + first);
}
}
 END SOURCE 
CUSTOMER SUBMITTED WORKAROUND :
I've been unable to find a workaround involving streams. I have a suitable workaround using nonstream code, which I then wrap in a stream, so I've listed this as No Impact.
The closest I got to a streambased solution was this implementation, but it still buffers all but one dimension:
for (int dimension = 0; dimension < NUM_DIMENSIONS; dimension++) {
permutations = permutations
.mapToObj(tens > LongStream
.rangeClosed(1, TOP_DIGIT)
.map(ones > 10*tens + ones)
.peek(n>System.err.println("+ " + n)))
.reduce(LongStream.empty(), (a,b)>LongStream.concat(a, b));
}
Attachments
Issue Links
 duplicates

JDK8075939 Stream.flatMap() causes breaking of shortcircuiting of terminal operations
 Resolved