2718.us blog » eu
http://2718.us/blog
Miscellaneous Technological Geekery

Randomizing by Random-Comparison Sorting (Revisited)
http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/
Wed, 24 Feb 2010 21:09:28 +0000, 2718.us

Yesterday, I posted the results of my quick exploration of whether sorting the list {0,1,2,3,4} using a comparison function that randomly returns < or > (with equal probability) yields a uniformly random ordering.  My exploration was prompted by a report on the non-uniformity of the distribution of the random orderings of the browsers in Microsoft’s EU browser ballot.  I had said that it seemed likely that the distribution would vary based on the sorting algorithm used.

Today, I have data (and code) that confirms the distribution is sorting-algorithm-dependent.  For each sorting algorithm, 1,000,000 instances of the list {0,1,2,3,4} were sorted with a random comparison function and the relative frequencies (rounded to the nearest whole percent) of each number in each position were computed.

Mathematica’s Sort[]
position/number     0     1     2     3     4
first              18%   12%   12%   12%   46%
second             18%   24%   18%   18%   24%
third              20%   20%   26%   20%   12%
fourth             22%   22%   22%   28%    6%
fifth              22%   22%   22%   22%   12%

BubbleSort
position/number     0     1     2     3     4
first              36%   28%   20%   10%    6%
second             28%   32%   22%   12%    6%
third              20%   22%   32%   18%   10%
fourth             12%   12%   18%   38%   20%
fifth               6%    6%   10%   20%   60%

QuickSort (random pivot)
position/number     0     1     2     3     4
first              20%   20%   20%   20%   20%
second             20%   20%   20%   20%   20%
third              20%   20%   20%   20%   20%
fourth             20%   20%   20%   20%   20%
fifth              20%   20%   20%   20%   20%

MergeSort
position/number     0     1     2     3     4
first              24%   24%   26%   12%   12%
second             26%   24%   18%   16%   16%
third              18%   18%   22%   20%   20%
fourth             16%   16%   18%   26%   26%
fifth              16%   16%   18%   26%   26%

SelectionSort
position/number     0     1     2     3     4
first               6%    6%   12%   26%   50%
second             12%   12%   20%   32%   24%
third              20%   20%   26%   20%   12%
fourth             30%   30%   20%   12%    6%
fifth              30%   30%   20%   12%    6%

The distributions are significantly different among these sorts.  QuickSort appears to provide a uniform distribution.  I believe that this is because QuickSort will only compare a particular pair of elements once, whereas each of the other sorting algorithms may compare a given pair of elements more than once (and with a random comparison function, receive a different result from one time to the next).
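The compare-once idea can be probed with a small simulation. The following is a minimal sketch in JavaScript, not the code from the Mathematica notebook: the function names (`randomCompare`, `bubbleSort`, `quickSort`, `tabulate`) are hypothetical, and the bubble sort here is one plain textbook variant, so its percentages need not match the BubbleSort table above.

```javascript
// Comparator that ignores its arguments and answers "less" or "greater"
// with equal probability, like the one described in the post.
function randomCompare(rng) {
  return () => (rng() < 0.5 ? -1 : 1);
}

// Plain bubble sort; it may compare the same pair of elements repeatedly.
function bubbleSort(items, cmp) {
  const a = items.slice();
  for (let end = a.length - 1; end > 0; end--) {
    for (let i = 0; i < end; i++) {
      if (cmp(a[i], a[i + 1]) > 0) {
        [a[i], a[i + 1]] = [a[i + 1], a[i]];
      }
    }
  }
  return a;
}

// Quicksort with a uniformly random pivot; each pair is compared at most once.
function quickSort(items, cmp, rng) {
  if (items.length <= 1) return items.slice();
  const p = Math.floor(rng() * items.length);
  const pivot = items[p];
  const left = [];
  const right = [];
  items.forEach((x, i) => {
    if (i === p) return; // the pivot itself is not compared
    if (cmp(x, pivot) < 0) {
      left.push(x);
    } else {
      right.push(x);
    }
  });
  return [...quickSort(left, cmp, rng), pivot, ...quickSort(right, cmp, rng)];
}

// Returns table[position][number]: the whole-percent frequency with which
// `number` ended up at `position` over `trials` sorts of [0,1,2,3,4].
function tabulate(sortFn, trials, rng = Math.random) {
  const n = 5;
  const counts = Array.from({ length: n }, () => new Array(n).fill(0));
  const cmp = randomCompare(rng);
  for (let t = 0; t < trials; t++) {
    const sorted = sortFn([0, 1, 2, 3, 4], cmp, rng);
    sorted.forEach((value, position) => { counts[position][value]++; });
  }
  return counts.map(row => row.map(c => Math.round((100 * c) / trials)));
}

console.log("bubble", tabulate(bubbleSort, 100000));
console.log("quick ", tabulate(quickSort, 100000));
```

One caveat: the random pivot seems to matter as well as the compare-once property. With a fixed pivot (say, always the first element), the pivot's final position is the number of other elements that happen to compare as smaller, which follows a binomial distribution and is therefore not uniform for lists of three or more elements.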

Here is the Mathematica notebook I used to generate this data: Randomize by Sorting.nb.  As noted in the file, some of the code for the sorting algorithms was taken from other locations and may be/is subject to their copyrights and/or license terms (I reasonably believe that this use complies with their licenses and/or constitutes fair use).  Also, as noted in the file, some algorithms behaved improperly when sorting lists with duplicate entries using a normal comparison function, though this should have no effect on the data above.

The EU Browser Ballot and Random Sorting
http://2718.us/blog/2010/02/23/the-eu-browser-ballot-and-random-sorting/
Wed, 24 Feb 2010 02:09:44 +0000, 2718.us

An Ars Technica “etc” post linked to a TechCrunch article (apparently based on a Slovakian article, but I didn’t look into the Slovakian article to be sure) that talks about the ordering of the browsers in Microsoft’s EU Browser Ballot not being uniformly distributed.  A glance at the JavaScript that randomizes the browsers (it randomly orders the top 5, and randomly orders the rest) suggests that it does so by calling the JavaScript array sort with a comparison function that returns < half the time and > the other half of the time.  I believe that this is likely the underlying cause of the non-uniformity of the orderings.

The second result in a Google search for “javascript sort” says:

To randomize the order of the elements within an array, what we need is the body of our sortfunction to return a number that is randomly <0, 0, or >0, irrespective to the relationship between “a” and “b”. The below will do the trick:

//Randomize the order of the array:
var myarray=[25, 8, "George", "John"]
myarray.sort(function() {return 0.5 - Math.random()}) //Array elements now scrambled

This is almost exactly the method of randomization used in the browser ballot JavaScript.
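To see what this technique does in a particular JavaScript engine, the positions can be tallied directly. This is a rough sketch under stated assumptions: `tallyBuiltinSort` is a hypothetical helper name, the trial count is arbitrary, and the resulting percentages depend on which sorting algorithm the engine's built-in `sort()` uses, so they need not match any table in this post.

```javascript
// Tally how often each number lands in each position when the built-in
// sort is driven by the random comparator from the quoted snippet.
function tallyBuiltinSort(trials) {
  const n = 5;
  // counts[position][number]
  const counts = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let t = 0; t < trials; t++) {
    const list = [0, 1, 2, 3, 4];
    list.sort(() => 0.5 - Math.random());
    list.forEach((value, position) => { counts[position][value]++; });
  }
  return counts;
}

const builtinTrials = 100000;
const builtinCounts = tallyBuiltinSort(builtinTrials);
// Print each position's row as whole percentages.
builtinCounts.forEach((row, position) => {
  const percents = row.map(c => Math.round((100 * c) / builtinTrials) + "%");
  console.log(position, percents.join(" "));
});
```

A uniform shuffle would print rows close to 20% everywhere; rows that drift well away from that would reproduce the effect described here for that engine.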

To test the results of this randomization technique, I applied it 1,000,000 times to the list {0,1,2,3,4} in Mathematica and tabulated the relative frequencies of each number in each position (rounded to the nearest whole percent).

position/number     0     1     2     3     4
first              18%   12%   12%   12%   47%
second             18%   24%   18%   18%   24%
third              20%   21%   27%   20%   12%
fourth             22%   22%   22%   28%    6%
fifth              22%   22%   22%   22%   12%

At a glance, it appears that the distribution is far from uniform.  My quick attempt at re-learning how to use the χ² test gave a probability less than 1×10^-100000 that this data matched a uniform distribution (if someone can confirm/fix that, please comment).
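One way to sanity-check a number of that size is to run a χ² goodness-of-fit test on a single row of the table. The sketch below is not necessarily the calculation behind the figure quoted above. It assumes that the counts can be reconstructed from the rounded percentages in the "first" row, that the row is treated as a 5-category sample of 1,000,000 trials (4 degrees of freedom), and that the function names are hypothetical. It works in logarithms because the probability itself is far too small for floating point.

```javascript
// Percentages for the "first" position, from the table above, and the
// number of trials reported in the post.
const firstRowPercent = [18, 12, 12, 12, 47];
const trials = 1000000;

// Pearson chi-squared statistic against a uniform expectation.
function chiSquaredUniform(percents, n) {
  const expected = n / percents.length;
  return percents.reduce((sum, pct) => {
    const observed = (pct / 100) * n; // approximate: percentages are rounded
    return sum + (observed - expected) ** 2 / expected;
  }, 0);
}

// log10 of the upper-tail probability for 4 degrees of freedom, where the
// tail has the closed form exp(-x/2) * (1 + x/2). Working in logs avoids
// underflow to zero.
function log10TailDf4(x) {
  return (-x / 2 + Math.log1p(x / 2)) / Math.LN10;
}

const stat = chiSquaredUniform(firstRowPercent, trials);
console.log("chi-squared statistic:", stat);
console.log("log10 of tail probability:", log10TailDf4(stat));
```

Under these assumptions the statistic comes out near 462,500 and the logarithm of the tail probability is a little below -100000, which is at least consistent with the figure quoted above.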

I used the Mathematica Sort[] command to do the sorting.  I don’t know what algorithm that uses.  It appears that the algorithm used by Javascript’s sort() varies from browser to browser, though the browser ballot would be displayed in IE8 by default.  I suspect that the distribution is highly dependent on the sorting algorithm used, though I cannot readily verify it [edit: I verified it].  Regardless, this seems to be a very poor way to generate a random ordering.
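For contrast, a standard way to get a uniform random ordering is the Fisher-Yates shuffle. It is not mentioned in the articles above and is shown here only as a point of comparison; the function name `shuffle` and the optional `rng` parameter are choices made for this sketch, and uniformity assumes the random source itself is uniform.

```javascript
// Fisher-Yates shuffle: walk from the end of the array and swap each
// element with one chosen uniformly from the positions at or before it.
function shuffle(items, rng = Math.random) {
  const a = items.slice(); // leave the caller's array untouched
  for (let i = a.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1)); // 0 <= j <= i
    [a[i], a[j]] = [a[j], a[i]];
  }
  return a;
}

console.log(shuffle([0, 1, 2, 3, 4]));
```

It makes exactly one pass over the list and uses no comparisons at all, so the result does not depend on how any particular sort is implemented.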
