Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
759 views
in Technique[技术] by (71.8m points)

algorithm - Why does Java 6 Arrays#sort(Object[]) change from mergesort to insertionsort for small arrays?

Java 6's mergesort implementation in Arrays.java uses an insertion-sort if the array length is less than some threshold. This value is hard-coded to 7. As the algorithm is recursive, this eventually happens many times for a large array. The canonical merge-sort algorithm does not do this, just using merge-sort all the way down until there is only 1 element in the list.

Is this an optimisation? If so, how is it supposed to help? And why 7? The insertion sort (even of <=7 things) increases the number of comparisons required to sort a large array dramatically - so will add cost to a sort where compareTo() calls are slow.

array-size vs #-of-comparisons for different values of INSERTIONSORT_THRESHOLD

(x-axis is size of array, y-axis is # of comparisons, for different values of INSERTIONSORT_THRESHOLD)

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Yes this is intentional. While the Big-O of mergesort is less than that of quadratic sorts such as insertion sort, the operations it does are more complex and thus slower.

Consider sorting an array of length 8. Merge sort makes ~14 recursive calls to itself in addition to 7 merge operations. Each recursive call contributes some non-trivial overhead to the run-time. Each merge operation involves a loop where index variables must be initialized, incremented, and compared, temporary arrays must be copied, etc. All in all, you can expect well over 300 "simple" operations.

On the other hand, insertion sort is inherently simple and uses about 8^2=64 operations which is much faster.

Think about it this way. When you sort a list of 10 numbers by hand, do you use merge sort? No, because your brain is much better at doing simple things like like insertion sort. However if I gave you a year to sort a list of 100,000 numbers, you might be more inclined to merge sort it.

As for the magic number 7, it is empirically derived to be optimal.

EDIT: In a standard insertion sort of 8 elements, the worst case scenario leads to ~36 comparisons. In a canonical merge sort, you have ~24 comparisons. Adding in the overhead from the method calls and complexity of operations, insertion sort should be faster. Additionally if you look at the average case, insertion sort would make far fewer comparisons than 36.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...