2718.us blog » mergesort http://2718.us/blog Miscellaneous Technological Geekery Tue, 18 May 2010 02:42:55 +0000 en hourly 1 http://wordpress.org/?v=3.0.4 Randomizing by Random-Comparison Sorting (Revisited) http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/ http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/#comments Wed, 24 Feb 2010 21:09:28 +0000 2718.us http://2718.us/blog/?p=215 Yesterday, I posted the results of my quick exploration of whether sorting the list {0,1,2,3,4} using a comparison function that randomly returns < or > (with equal probability).  My exploration was prompted by a report on the non-uniformity of the distribution of the random orderings of the browsers in Microsoft’s EU browser ballot.  I had said that it seemed likely that the distribution would vary based on the sorting algorithm used.

Today, I have data (and code) that confirms the distribution is sorting-algorithm-dependent.  For each sorting algorithm, 1,000,000 instances of the list {0,1,2,3,4} were sorted with a random comparison function and the relative frequencies (rounded to the nearest whole percent) of each number in each position were computed.

Mathematica’s Sort[]
position/number 0 1 2 3 4
first 18% 12% 12% 12% 46%
second 18% 24% 18% 18% 24%
third 20% 20% 26% 20% 12%
fourth 22% 22% 22% 28% 6%
fifth 22% 22% 22% 22% 12%
BubbleSort
position/number 0 1 2 3 4
first 36% 28% 20% 10% 6%
second 28% 32% 22% 12% 6%
third 20% 22% 32% 18% 10%
fourth 12% 12% 18% 38% 20%
fifth 6% 6% 10% 20% 60%
QuickSort (random pivot)
position/number 0 1 2 3 4
first 20% 20% 20% 20% 20%
second 20% 20% 20% 20% 20%
third 20% 20% 20% 20% 20%
fourth 20% 20% 20% 20% 20%
fifth 20% 20% 20% 20% 20%
MergeSort
position/number 0 1 2 3 4
first 24% 24% 26% 12% 12%
second 26% 24% 18% 16% 16%
third 18% 18% 22% 20% 20%
fourth 16% 16% 18% 26% 26%
fifth 16% 16% 18% 26% 26%
SelectionSort
position/number 0 1 2 3 4
first 6% 6% 12% 26% 50%
second 12% 12% 20% 32% 24%
third 20% 20% 26% 20% 12%
fourth 30% 30% 20% 12% 6%
fifth 30% 30% 20% 12% 6%

The distributions are significantly different among these sorts.  QuickSort appears to provide a uniform distribution.  I believe that this is because QuickSort will only compare a particular pair of elements once, whereas each of the other sorting algorithms may compare a given pair of elements more than once (and with a random comparison function, receive a different result from one time to the next).

Here is the Mathematica notebook I used to generate this data: Randomize by Sorting.nb.  As noted in the file, some of the code for the sorting algorithms was taken from other locations and may be/is subject to their copyrights and/or license terms (I reasonably believe that this use complies with their licenses and/or constitutes fair use.  Also, some algorithms exhibited improper behavior when trying to sort lists with duplicate entries using a normal comparison function as noted in the file, though this should have no effect on the data above.

]]>
http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/feed/ 1