Monday, June 15, 2009

Fast Sorting - was Why is my bubblesort in Python so slow?

Someone posted this question in StackOverflow:

http://stackoverflow.com/questions/997322/why-is-my-bubble-sort-in-python-so-slow

I'm not sure why, but radix sort seems to be overlooked a lot as a sorting algorithm. This is a little strange, as the space requirements are not bad and it is O(N) unless you have a strange distribution.

I wrote a quick implementation in a few minutes (it's also easy to implement) and the time was < 1 sec to execute for 100,000 elements, compared to the poster's 2 hrs. Granted, I was using VB instead of Python :)

Option Strict On
Option Explicit On

Module Module1

Private Const MAX_SIZE As Integer = 100000
Private m_input(MAX_SIZE) As Integer
Private m_table(MAX_SIZE) As List(Of Integer)
Private m_randomGen As New Random()
Private m_operations As Integer = 0

Private Sub generateData()
' fill with random numbers between 0 and MAX_SIZE - 1
For i = 0 To MAX_SIZE - 1
m_input(i) = m_randomGen.Next(0, MAX_SIZE - 1)
Next

End Sub

Private Sub sortData()
For i As Integer = 0 To MAX_SIZE - 1
Dim x = m_input(i)
If m_table(x) Is Nothing Then
m_table(x) = New List(Of Integer)
End If
m_table(x).Add(x)
' clearly this is simply going to be MAX_SIZE -1
m_operations = m_operations + 1
Next
End Sub

Private Sub printData(ByVal start As Integer, ByVal finish As Integer)
If start < 0 Or start > MAX_SIZE - 1 Then
Throw New Exception("printData - start out of range")
End If
If finish < 0 Or finish > MAX_SIZE - 1 Then
Throw New Exception("printData - finish out of range")
End If
For i As Integer = start To finish
If m_table(i) IsNot Nothing Then
For Each x In m_table(i)
Console.WriteLine(x)
Next
End If
Next
End Sub

' run the entire sort, but just print out the first 100 for verification purposes
Private Sub test()
m_operations = 0
generateData()
Console.WriteLine("Time started = " & Now.ToString())
sortData()
Console.WriteLine("Time finished = " & Now.ToString & " Number of operations = " & m_operations.ToString())
' print out a random 100 segment from the sorted array
Dim start As Integer = m_randomGen.Next(0, MAX_SIZE - 101)
printData(start, start + 100)
End Sub

Sub Main()
test()
Console.ReadLine()
End Sub

End Module
Time started = 6/15/2009 4:04:08 PM
Time finished = 6/15/2009 4:04:08 PM Number of operations = 100000
21429
21430
21430
21431
21431
21432
21433
21435
21435
21435
21436
21437
21437
21439
21441
21441
21441
21446
21447
21448
21449
21451
21451
21454
21456
21456
21458
21460
21461
21461
21461
21463
21463
21464
21465
21466
21468
21468
21468
21469
21469
21470
21471
21471
21471
21471
21472
21473
21474
21474
21477
21479
21481
21481
21481
21483
21483
21486
21487
21489
21490
21493
21494
21495
21496
21496
21499
21500
21501
21501
21501
21502
21502
21502
21505
21509
21510
21510
21511
21512
21514
21515
21516
21517
21519
21519
21522
21524
21526

No comments:

Post a Comment