<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>2718.us blog &#187; eu</title>
	<atom:link href="http://2718.us/blog/tag/eu/feed/" rel="self" type="application/rss+xml" />
	<link>http://2718.us/blog</link>
	<description>Miscellaneous Technological Geekery</description>
	<lastBuildDate>Tue, 18 May 2010 02:42:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>Randomizing by Random-Comparison Sorting (Revisited)</title>
		<link>http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/</link>
		<comments>http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 21:09:28 +0000</pubDate>
		<dc:creator>2718.us</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[algorithms]]></category>
		<category><![CDATA[browser]]></category>
		<category><![CDATA[browser ballot]]></category>
		<category><![CDATA[bubblesort]]></category>
		<category><![CDATA[eu]]></category>
		<category><![CDATA[european union]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[mergesort]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[nb]]></category>
		<category><![CDATA[quicksort]]></category>
		<category><![CDATA[selectionsort]]></category>
		<category><![CDATA[sort]]></category>
		<category><![CDATA[sorting]]></category>
		<category><![CDATA[sorting algorithms]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://2718.us/blog/?p=215</guid>
		<description><![CDATA[Yesterday, I posted the results of my quick exploration of sorting the list {0,1,2,3,4} using a comparison function that randomly returns &#60; or &#62; (with equal probability).  My exploration was prompted by a report on the non-uniformity of the distribution of the random orderings of the browsers in Microsoft&#8217;s EU browser ballot.  I had [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, I posted <a href="http://2718.us/blog/2010/02/23/the-eu-browser-ballot-and-random-sorting/">the results of my quick exploration</a> of sorting the list {0,1,2,3,4} using a comparison function that randomly returns &lt; or &gt; (with equal probability).  My exploration was prompted by <a href="http://techcrunch.com/2010/02/22/microsoft-ballot-screen/">a report on the non-uniformity of the distribution</a> of the random orderings of the browsers in <a href="http://www.browserchoice.eu/BrowserChoice/browserchoice_en.htm">Microsoft&#8217;s EU browser ballot</a>.  I had said that it seemed likely that the distribution would vary based on the sorting algorithm used.</p>
<p>Today, I have data (and code) confirming that the distribution is sorting-algorithm-dependent.  For each sorting algorithm, 1,000,000 instances of the list {0,1,2,3,4} were sorted with a random comparison function, and the relative frequencies (rounded to the nearest whole percent) of each number in each position were computed.<span id="more-215"></span></p>
<table border="1" cellspacing="0">
<tbody>
<tr>
<td>Mathematica&#8217;s Sort[]</td>
<td>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>18%</td>
<td>12%</td>
<td>12%</td>
<td>12%</td>
<td>46%</td>
</tr>
<tr>
<td>second</td>
<td>18%</td>
<td>24%</td>
<td>18%</td>
<td>18%</td>
<td>24%</td>
</tr>
<tr>
<td>third</td>
<td>20%</td>
<td>20%</td>
<td>26%</td>
<td>20%</td>
<td>12%</td>
</tr>
<tr>
<td>fourth</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>28%</td>
<td>6%</td>
</tr>
<tr>
<td>fifth</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>12%</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>BubbleSort</td>
<td>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>36%</td>
<td>28%</td>
<td>20%</td>
<td>10%</td>
<td>6%</td>
</tr>
<tr>
<td>second</td>
<td>28%</td>
<td>32%</td>
<td>22%</td>
<td>12%</td>
<td>6%</td>
</tr>
<tr>
<td>third</td>
<td>20%</td>
<td>22%</td>
<td>32%</td>
<td>18%</td>
<td>10%</td>
</tr>
<tr>
<td>fourth</td>
<td>12%</td>
<td>12%</td>
<td>18%</td>
<td>38%</td>
<td>20%</td>
</tr>
<tr>
<td>fifth</td>
<td>6%</td>
<td>6%</td>
<td>10%</td>
<td>20%</td>
<td>60%</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>QuickSort (random pivot)</td>
<td>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
</tr>
<tr>
<td>second</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
</tr>
<tr>
<td>third</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
</tr>
<tr>
<td>fourth</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
</tr>
<tr>
<td>fifth</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
<td>20%</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>MergeSort</td>
<td>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>24%</td>
<td>24%</td>
<td>26%</td>
<td>12%</td>
<td>12%</td>
</tr>
<tr>
<td>second</td>
<td>26%</td>
<td>24%</td>
<td>18%</td>
<td>16%</td>
<td>16%</td>
</tr>
<tr>
<td>third</td>
<td>18%</td>
<td>18%</td>
<td>22%</td>
<td>20%</td>
<td>20%</td>
</tr>
<tr>
<td>fourth</td>
<td>16%</td>
<td>16%</td>
<td>18%</td>
<td>26%</td>
<td>26%</td>
</tr>
<tr>
<td>fifth</td>
<td>16%</td>
<td>16%</td>
<td>18%</td>
<td>26%</td>
<td>26%</td>
</tr>
</tbody>
</table>
</td>
</tr>
<tr>
<td>SelectionSort</td>
<td>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>6%</td>
<td>6%</td>
<td>12%</td>
<td>26%</td>
<td>50%</td>
</tr>
<tr>
<td>second</td>
<td>12%</td>
<td>12%</td>
<td>20%</td>
<td>32%</td>
<td>24%</td>
</tr>
<tr>
<td>third</td>
<td>20%</td>
<td>20%</td>
<td>26%</td>
<td>20%</td>
<td>12%</td>
</tr>
<tr>
<td>fourth</td>
<td>30%</td>
<td>30%</td>
<td>20%</td>
<td>12%</td>
<td>6%</td>
</tr>
<tr>
<td>fifth</td>
<td>30%</td>
<td>30%</td>
<td>20%</td>
<td>12%</td>
<td>6%</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>The distributions are significantly different among these sorts.  QuickSort appears to provide a uniform distribution.  I believe that this is because QuickSort will only compare a particular pair of elements once, whereas each of the other sorting algorithms may compare a given pair of elements more than once (and with a random comparison function, receive a different result from one time to the next).</p>
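<p>To make the contrast concrete, here is a minimal JavaScript sketch (not the Mathematica code from the notebook) of a textbook bubble sort and a random-pivot quicksort, both driven by a random comparison function.  The function names and the particular bubble sort variant are assumptions made for illustration, so the exact percentages it prints need not match the tables above.</p>

```javascript
// Comparator that ignores its arguments and returns -1 or +1 with equal probability.
function randomCompare() {
  return Math.random() < 0.5 ? -1 : 1;
}

// Textbook bubble sort (an assumed variant): the same pair of elements
// can meet and be compared again on a later pass.
function bubbleSort(items, compare) {
  const a = items.slice();
  for (let end = a.length - 1; end > 0; end--) {
    for (let i = 0; i < end; i++) {
      if (compare(a[i], a[i + 1]) > 0) {
        const tmp = a[i];
        a[i] = a[i + 1];
        a[i + 1] = tmp;
      }
    }
  }
  return a;
}

// Quicksort with a uniformly random pivot: an element is compared only
// against the pivot of the sublist it is in, so each pair meets at most once.
function quickSortRandomPivot(items, compare) {
  if (items.length < 2) return items.slice();
  const pivotIndex = Math.floor(Math.random() * items.length);
  const pivot = items[pivotIndex];
  const before = [];
  const after = [];
  items.forEach(function (x, i) {
    if (i === pivotIndex) return;
    if (compare(x, pivot) < 0) before.push(x); else after.push(x);
  });
  return quickSortRandomPivot(before, compare)
    .concat([pivot], quickSortRandomPivot(after, compare));
}

// Fraction of trials in which each number of {0,1,2,3,4} ends up first.
function firstPositionFrequencies(sortFn, trials) {
  const counts = [0, 0, 0, 0, 0];
  for (let t = 0; t < trials; t++) {
    const first = sortFn([0, 1, 2, 3, 4], randomCompare)[0];
    counts[first]++;
  }
  return counts.map(function (c) { return c / trials; });
}

console.log("bubble:", firstPositionFrequencies(bubbleSort, 100000));
console.log("quick: ", firstPositionFrequencies(quickSortRandomPivot, 100000));
```

<p>With a real comparison function such as <code>function (x, y) { return x - y; }</code>, both functions sort normally; only the comparator changes between sorting and "shuffling".</p>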
<p>Here is the Mathematica notebook I used to generate this data: <a href="http://2718.us/blog/wp-content/uploads/2010/02/testing-randomize-by-sorting-2.nb">Randomize by Sorting.nb</a>.  As noted in the file, some of the code for the sorting algorithms was taken from other locations and is or may be subject to their copyrights and/or license terms (I reasonably believe that this use complies with their licenses and/or constitutes fair use).  Also as noted in the file, some algorithms misbehaved when sorting lists with duplicate entries using a normal comparison function, though this should have no effect on the data above.</p>
]]></content:encoded>
			<wfw:commentRss>http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The EU Browser Ballot and Random Sorting</title>
		<link>http://2718.us/blog/2010/02/23/the-eu-browser-ballot-and-random-sorting/</link>
		<comments>http://2718.us/blog/2010/02/23/the-eu-browser-ballot-and-random-sorting/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 02:09:44 +0000</pubDate>
		<dc:creator>2718.us</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[browser]]></category>
		<category><![CDATA[browser ballot]]></category>
		<category><![CDATA[eu]]></category>
		<category><![CDATA[european union]]></category>
		<category><![CDATA[ie8]]></category>
		<category><![CDATA[internet explorer]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[mathematica]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[microsoft]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://2718.us/blog/?p=212</guid>
		<description><![CDATA[An Ars Technica &#8220;etc&#8221; post linked to a TechCrunch article (apparently based on a Slovakian article, but I didn&#8217;t look into the Slovakian article to be sure) reporting that the ordering of the browsers in Microsoft&#8217;s EU Browser Ballot is not uniformly distributed.  From a glance at the JavaScript that randomizes [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://arstechnica.com/microsoft/news/2010/02/-the-javascript-code-on.ars">An Ars Technica &#8220;etc&#8221; post</a> linked to <a href="http://techcrunch.com/2010/02/22/microsoft-ballot-screen/">a TechCrunch article</a> (apparently based on <a href="http://www.dsl.sk/article.php?article=8770">a Slovakian article</a>, but I didn&#8217;t look into the Slovakian article to be sure) reporting that the ordering of the browsers in <a href="http://www.browserchoice.eu/BrowserChoice/browserchoice_en.htm">Microsoft&#8217;s EU Browser Ballot</a> is not uniformly distributed.  From a glance at the JavaScript that randomizes the browsers (it randomly orders the top 5, and separately randomly orders the rest), the page appears to randomize by calling the JavaScript array sort with a comparison function that returns &lt; half the time and &gt; the other half of the time.  I believe that this is likely the underlying cause of the non-uniformity of the orderings.</p>
<p><a href="http://www.javascriptkit.com/javatutors/arraysort.shtml">The second result</a> in <a href="http://www.google.com/search?sourceid=chrome&amp;ie=UTF-8&amp;q=javascript+sort">a Google search for &#8220;javascript sort&#8221;</a> says:</p>
<blockquote><p>To randomize the order of the elements within an array, what we need is the body of our sortfunction to return a number that is randomly &lt;0, 0, or &gt;0, irrespective to the relationship between &#8220;a&#8221; and &#8220;b&#8221;. The below will do the trick:</p>
<pre>//Randomize the order of the array:
var myarray=[25, 8, "George", "John"]
myarray.sort(function() {return 0.5 - Math.random()}) //Array elements now scrambled</pre>
</blockquote>
<p>This is almost exactly the method of randomization used in the browser ballot JavaScript.</p>
<p>To test this randomization technique, I applied it 1,000,000 times to the list {0,1,2,3,4} in Mathematica and tabulated the relative frequencies of each number in each position (rounded to the nearest whole percent).</p>
<table>
<tbody>
<tr>
<td>position/number</td>
<td>0</td>
<td>1</td>
<td>2</td>
<td>3</td>
<td>4</td>
</tr>
<tr>
<td>first</td>
<td>18%</td>
<td>12%</td>
<td>12%</td>
<td>12%</td>
<td>47%</td>
</tr>
<tr>
<td>second</td>
<td>18%</td>
<td>24%</td>
<td>18%</td>
<td>18%</td>
<td>24%</td>
</tr>
<tr>
<td>third</td>
<td>20%</td>
<td>21%</td>
<td>27%</td>
<td>20%</td>
<td>12%</td>
</tr>
<tr>
<td>fourth</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>28%</td>
<td>6%</td>
</tr>
<tr>
<td>fifth</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>22%</td>
<td>12%</td>
</tr>
</tbody>
</table>
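<p>The tabulation above was done in Mathematica; a rough JavaScript equivalent might look like the sketch below.  The names <code>sortShuffle</code> and <code>positionFrequencies</code> are hypothetical, and because the result depends on the sort algorithm of whichever JavaScript engine runs it, the percentages it prints should not be expected to match the table.</p>

```javascript
// Shuffle by sorting with a comparator that ignores its arguments
// (the technique quoted above). Works on a copy of the input.
function sortShuffle(items) {
  return items.slice().sort(function () { return 0.5 - Math.random(); });
}

// Returns a matrix where frequencies[p][v] is the fraction of trials
// in which value v landed in position p.
function positionFrequencies(shuffle, size, trials) {
  const counts = [];
  for (let p = 0; p < size; p++) counts.push(new Array(size).fill(0));
  const input = [];
  for (let v = 0; v < size; v++) input.push(v);
  for (let t = 0; t < trials; t++) {
    shuffle(input).forEach(function (value, position) {
      counts[position][value]++;
    });
  }
  return counts.map(function (row) {
    return row.map(function (c) { return c / trials; });
  });
}

// Print one line per position, as whole percentages.
positionFrequencies(sortShuffle, 5, 100000).forEach(function (row, p) {
  const cells = row.map(function (f) { return Math.round(100 * f) + "%"; });
  console.log("position " + p + ": " + cells.join(" "));
});
```

<p>Passing a different shuffle function as the first argument allows other randomization methods to be tabulated the same way.</p>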
<p>At a glance, the distribution appears far from uniform.  My quick attempt at re-learning how to use the χ<sup>2</sup> test gave a p-value of less than 1×10<sup>-100000</sup> for the hypothesis that this data came from a uniform distribution (if someone can confirm/fix that, please comment).</p>
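<p>One way such a χ<sup>2</sup> figure could be checked is sketched below.  It is an illustration under stated assumptions, not the calculation actually used for the post: it tests only the "first" row of the table, reconstructs counts from the rounded percentages (which sum to 101%), and relies on the closed form of the upper-tail probability for 4 degrees of freedom, P(X &gt; x) = e<sup>-x/2</sup>(1 + x/2).  The logarithm is computed directly because a probability this small underflows to zero in floating point.</p>

```javascript
// Pearson chi-square statistic against a uniform expectation.
function chiSquareUniform(observed) {
  const total = observed.reduce(function (s, o) { return s + o; }, 0);
  const expected = total / observed.length;
  return observed.reduce(function (s, o) {
    return s + (o - expected) * (o - expected) / expected;
  }, 0);
}

// log10 of the upper-tail probability for 4 degrees of freedom
// (5 categories minus 1), from P(X > x) = exp(-x/2) * (1 + x/2).
function log10PValueDf4(x) {
  return (-x / 2 + Math.log(1 + x / 2)) / Math.LN10;
}

// ASSUMPTION: counts rebuilt from the rounded "first" row
// (18%, 12%, 12%, 12%, 47% of 1,000,000 trials), not the raw data.
const firstRowCounts = [180000, 120000, 120000, 120000, 470000];
const statistic = chiSquareUniform(firstRowCounts);
console.log("chi-square statistic:", statistic);
console.log("log10 of p-value:", log10PValueDf4(statistic));
```

<p>Because the inputs are rounded, the printed exponent can only indicate the order of magnitude of the p-value, not confirm an exact bound.</p>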
<p>I used the Mathematica Sort[] command to do the sorting.  I don&#8217;t know what algorithm that uses.  It appears that the algorithm used by JavaScript&#8217;s sort() varies from browser to browser, though the browser ballot would be displayed in IE8 by default.  I suspect that the distribution is highly dependent on the sorting algorithm used, though I cannot readily verify it [<em>edit</em>: <a href="http://2718.us/blog/2010/02/24/randomizing-by-random-comparison-sorting-revisited/">I verified it</a>].  Regardless, this seems to be a very poor way to generate a random ordering.</p>
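<p>For comparison, a commonly used alternative that does not depend on any sort algorithm is the Fisher-Yates shuffle.  The sketch below is an illustration only (neither the post nor the ballot page uses it, and the function name is hypothetical); it produces every ordering with equal probability provided <code>Math.random()</code> is itself uniform.</p>

```javascript
// Fisher-Yates shuffle: returns a shuffled copy and leaves the input untouched.
function fisherYatesShuffle(items) {
  const a = items.slice();
  for (let i = a.length - 1; i > 0; i--) {
    // Pick j uniformly from 0..i (inclusive) and swap a[i] with a[j].
    const j = Math.floor(Math.random() * (i + 1));
    const tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
  return a;
}

console.log(fisherYatesShuffle([0, 1, 2, 3, 4]));
```

<p>Each element is swapped at most once per step and no comparison function is involved, so the outcome does not vary with the browser's sort implementation.</p>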
]]></content:encoded>
			<wfw:commentRss>http://2718.us/blog/2010/02/23/the-eu-browser-ballot-and-random-sorting/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

