I couldn't believe that Java 8 streams are slower than for-loops when not running in parallel on multiple cores, but it's true, read this article by Angelika Langer.
→ If you use sequential streams then you don’t do it for performance reasons; you do it because you like the functional programming style.
I myself couldn't even verify that streams are faster than for-loops when running on multiple cores in parallel, not even with Java 11.
In this Blog I will present a performance test that compares a for-loop with a stream implementation. You need several test runs (first time is slowest!), different surrounding conditions, and an average calculation to get a realistic view on the performance of your application. Finally I will run the test also with parallel streams on 4 cores and present the results.
Mind that one CPU can have several cores.
In Java you call Runtime.getRuntime().availableProcessors()
to find out the number of cores,
on Ubuntu-LINUX you can see it in System-Monitor on the "Resources" tab,
on WINDOWS in Task-Manager on the "Performance" tab.
Test Source
Here is the main()
runner with initial test parameters:
import java.util.ArrayList; import java.util.Arrays; import java.util.Collection; public class StreamPerformance { public static void main(String[] args) { System.out.println("Java: "+System.getProperty("java.version")); StreamPerformance streamPerformance = new StreamPerformance(5000000); streamPerformance.performTests(40); } ..... }
This is the test class. It will iterate 5 million random test-numbers and find the maximum. I want 40 executions of this, taking into account also the warm-up phase.
All following source goes where the "....." is.
private final long[] longArray; private final Collection<Long> longList; private Long firstMax; public StreamPerformance(int testDataLength) { longArray = createTestArray(testDataLength); longList = createTestList(longArray); } private long[] createTestArray(int testDataLength) { final long[] longArray = new long[testDataLength]; for (int i = 0; i < testDataLength; i++) longArray[i] = (long) (Math.random() * 1000000d); return longArray; } private Collection<Long> createTestList(long[] longArray) { final Collection<Long> longList = new ArrayList<Long>(longArray.length); for (int i = 0; i < longArray.length; i++) longList.add(Long.valueOf(longArray[i])); return longList; }
The StreamPerformance
constructor sets up the test data as array and
Collection.
The collection is a clone of the array.
It takes part in the test because its iteration is slower than that of the array (different surrounding conditions).
public void performTests(final int numberOfTests) { long sumStreamWithArray = 0L; long sumForLoopWithArray = 0L; long sumStreamWithList = 0L; long sumForLoopWithList = 0L; for (int testNumber = 0; testNumber < numberOfTests; testNumber++) { final boolean doForLoop = (testNumber % 2 == 0); final boolean doArray = ((testNumber / 2) % 2 == 0); long millis = performTest(doForLoop, doArray); if (doForLoop) if (doArray) sumForLoopWithArray += millis; else sumForLoopWithList += millis; else if (doArray) sumStreamWithArray += millis; else sumStreamWithList += millis; System.out.println(testNumber+": "+(doForLoop ? "ForLoop" : "Stream")+" maximum in "+(doArray ? "array" : "list")+" found in "+millis+" millis."); } System.out.println("ForLoop with array millis: "+sumForLoopWithArray); System.out.println("Stream with array millis: "+sumStreamWithArray); System.out.println("ForLoop with list millis: "+sumForLoopWithList); System.out.println("Stream with list millis: "+sumStreamWithList); final long forLoopSum = sumForLoopWithArray + sumForLoopWithList; final long streamSum = sumStreamWithArray + sumStreamWithList; System.out.println("ForLoop millis: "+forLoopSum); System.out.println("Stream millis: "+streamSum); System.out.println("ForLoop was "+Math.round((double) streamSum / (double) forLoopSum)+" times faster than Stream."); }
This is still test management, not the maximum retrieval loop.
The performTests()
method executes the test as many times as its parameter demands,
but it will switch between array and collection, and it will switch between stream and for-loop
in a way that every second run will be done with stream.
Finally the collected sums are written to stdout
, first the results of the 4 test modes,
then a general result for stream and for-loop, and last it will tell how many times the for-loop was faster.
private long performTest(boolean doForLoop, boolean doArray) { final long startValue = Long.MIN_VALUE; final long before = System.currentTimeMillis(); final Long max; if (doForLoop) max = performForLoopTest(doArray, startValue); else max = performStreamTest(doArray, startValue); if (firstMax == null) firstMax = max; else if (firstMax.equals(max) == false) // assert maximum throw new IllegalStateException("Maximum "+max+" was not found correctly, should be "+firstMax); return (System.currentTimeMillis() - before); } private Long performForLoopTest(boolean doArray, final long startValue) { if (doArray) { long max = startValue; for (int i = 0; i < longArray.length; i++) { long l = longArray[i]; if (l > max) max = l; } return Long.valueOf(max); } Long max = Long.valueOf(startValue); for (Long l : longList) if (l > max) max = l; return max; } private Long performStreamTest(boolean doArray, final long startValue) { if (doArray) { long max = Arrays.stream(longArray).reduce(startValue, Math::max); return Long.valueOf(max); } Long max = longList.stream().reduce(Long.valueOf(startValue), Math::max); return max; }
These three methods perform one single test run, i.e.
performTest()
method will do exactly one iteration to find a maximum.
It receives the test-mode in parameters doForLoop
and doArray
.
According to that mode it calls either the stream test or the for-loop test.
It takes the execution time as milliseconds and returns it to the caller.
Mind that this also asserts that all maximum retrievals yield the same result.
The performForLoopTest()
method executes a for-loop, either iterating the array or the collection.
The performStreamTest()
does the same but using streams.
Both return the found maximum to the caller for assertion.
Click here to see the entire test-source for copy & paste.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 | package streams; import java.util.ArrayList; import java.util.Arrays; import java.util.Collection; /** * Check whether streams outperform for-loops. * See also https://jaxenter.com/java-performance-tutorial-how-fast-are-the-java-8-streams-118830.html */ public class StreamPerformance { public static void main(String[] args) { System.out.println("Java: "+System.getProperty("java.version")); StreamPerformance streamPerformance = new StreamPerformance(5000000); streamPerformance.performTests(40); } private final long[] longArray; private final Collection<Long> longList; private Long firstMax; public StreamPerformance(int testDataLength) { longArray = createTestArray(testDataLength); longList = createTestList(longArray); } private long[] createTestArray(int testDataLength) { final long[] longArray = new long[testDataLength]; for (int i = 0; i < testDataLength; i++) longArray[i] = (long) (Math.random() * 1000000d); return longArray; } private Collection<Long> createTestList(long[] longArray) { final Collection<Long> longList = new ArrayList<Long>(longArray.length); for (int i = 0; i < longArray.length; i++) longList.add(Long.valueOf(longArray[i])); return longList; } public void performTests(final int numberOfTests) { long sumStreamWithArray = 0L; long sumForLoopWithArray = 0L; long sumStreamWithList = 0L; long sumForLoopWithList = 0L; for (int testNumber = 0; testNumber < numberOfTests; testNumber++) { final boolean doForLoop = (testNumber % 2 == 0); final boolean doArray = ((testNumber / 2) % 2 == 0); long millis = performTest(doForLoop, doArray); if (doForLoop) if (doArray) sumForLoopWithArray += millis; else sumForLoopWithList += millis; else if (doArray) sumStreamWithArray += millis; else sumStreamWithList += millis; //System.out.println(testNumber+": "+(doForLoop ? "ForLoop" : "Stream")+" maximum in "+(doArray ? "array" : "list")+" found in "+millis+" millis."); } System.out.println("ForLoop with array millis: "+sumForLoopWithArray); System.out.println("Stream with array millis: "+sumStreamWithArray); System.out.println("ForLoop with list millis: "+sumForLoopWithList); System.out.println("Stream with list millis: "+sumStreamWithList); final long forLoopSum = sumForLoopWithArray + sumForLoopWithList; final long streamSum = sumStreamWithArray + sumStreamWithList; System.out.println("ForLoop millis: "+forLoopSum); System.out.println("Stream millis: "+streamSum); System.out.println("ForLoop was "+Math.round((double) streamSum / (double) forLoopSum)+" times faster than Stream."); } private long performTest(boolean doForLoop, boolean doArray) { final long startValue = Long.MIN_VALUE; final long before = System.currentTimeMillis(); final Long max; if (doForLoop) max = performForLoopTest(doArray, startValue); else max = performStreamTest(doArray, startValue); if (firstMax == null) firstMax = max; else if (firstMax.equals(max) == false) // assert maximum throw new IllegalStateException("Maximum "+max+" was not found correctly, should be "+firstMax); return (System.currentTimeMillis() - before); } private Long performForLoopTest(boolean doArray, final long startValue) { if (doArray) { long max = startValue; for (int i = 0; i < longArray.length; i++) { long l = longArray[i]; if (l > max) max = l; } return Long.valueOf(max); } Long max = Long.valueOf(startValue); for (Long l : longList) if (l > max) max = l; return max; } private Long performStreamTest(boolean doArray, final long startValue) { if (doArray) { long max = Arrays.stream(longArray).reduce(startValue, Math::max); return Long.valueOf(max); } Long max = longList.stream().reduce(Long.valueOf(startValue), Math::max); return max; } } |
So far the test logic. Let's see what it did on my machine.
Sequential Result
You could run this test now with Java 8 or Java 11. Here is the result of Java 8:
Java: 1.8.0_25 ForLoop with array millis: 58 Stream with array millis: 101 ForLoop with list millis: 185 Stream with list millis: 862 ForLoop millis: 243 Stream millis: 963 ForLoop was 4 times faster than Stream.
Now such results could depend on compiler or JVM optimizations. Let's see what Java 11 does (don't forget to also recompile with Java 11!):
Java: 11.0.2 ForLoop with array millis: 61 Stream with array millis: 126 ForLoop with list millis: 250 Stream with list millis: 662 ForLoop millis: 311 Stream millis: 788 ForLoop was 3 times faster than Stream.
Got better. But not good enough to blindly rewrite everything to use streams. Maybe it would make sense when using parallel(), but this has some subtle implications which could cause hard-to-detect bugs. You will have to rethink every single case before rewriting.
Parallel Test
For this test, rewriting is easy. You just need to change the perfomStreamTest()
method to the following:
private Long performStreamTest(boolean doArray, final long startValue) { if (doArray) { long max = Arrays.stream(longArray).parallel().reduce(startValue, Math::max); return Long.valueOf(max); } Long max = longList.stream().parallel().reduce(Long.valueOf(startValue), Math::max); return max; }
The sequential stream can easily be turned into a parallel one, just by appending a parallel()
call.
For convenience you could also use parallelStream()
instead of stream()
.
Here are the results of parallel processing on my 4-core machine:
Java: 1.8.0_25 ForLoop with array millis: 50 Stream with array millis: 113 ForLoop with list millis: 144 Stream with list millis: 703 ForLoop millis: 194 Stream millis: 816 ForLoop was 4 times faster than Stream.
Java: 11.0.2 ForLoop with array millis: 72 Stream with array millis: 124 ForLoop with list millis: 350 Stream with list millis: 638 ForLoop millis: 422 Stream millis: 762 ForLoop was 2 times faster than Stream.
I expected more. With 4 cores, parallel should be about 3 times faster than the sequential. Instead parallel stream improved just by 30%, and that only with Java 11, with Java 8 it's obviously at the same speed as its sequential counterpart.
Keine Kommentare:
Kommentar veröffentlichen