★ APPLICATIONS ★ DIVERS ★ SELECTION SORT (COMPUTING WITH THE AMSTRAD) ★ |
LET'S SORT IT ALL OUT... (Computing with the Amstrad) | Bubble quickly or shell out? (Computing with the Amstrad) |
SIMON MONK and SUE BRADSHAW end their series on sorting data in Basic WE looked last month at two methods for sorting data into order. Now we'll investigate three more, and round off with some timings and guidelines for the various algorithms. Bubble sort This is a well-used and easy-to-implement sorting method. It is called bubble sort because the larger items are made to bubble up to the right. This can be a fast way of sorting if the data are in some form of order before you begin. 1: Set a counter i to 1. An example: 10 3 8 5 9 6 Swap 10 and 3 because they are out of order. 3 10 8 5 9 6 Swap 10 and 8. Notice how 10 is bubbling up to the right. 3 8 10 5 9 6 And so on. 3 8 5 10 9 6 3 8 5 9 10 6 3 8 5 9 6 10 Now the end of the data set has been reached. We have done some exchanging, so we are not finished. The whole process is therefore repeated. 3 8 5 9 6 10 3 and 8 are in order so do not swap them. 3 8 5 9 6 10 8 and 5 need swapping. 3 5 8 9 6 10 3 5 8 9 6 10 3 5 8 6 9 10 3 5 8 6 9 10 The end of the data has been reached again, but two swaps have taken place so we have not finished yet. 3 5 8 6 9 10 3 5 8 6 9 10 3 5 8 6 9 10 Swap 8 and 6. 3 5 6 8 9 10 Although the data is now sorted, the algorithm has to finish this pass and complete the next in which no items will need to be exchanged. To see the bubble sort in action merge the next program with last month's test program.
Table I lists the variables the sort routine uses. Quick sort As the name implies, quick sort's best feature is its speed. It is usually written as a recursive program, that is one which works by repeatedly calling itself. Unfortunately Locomotive Basic 1: Divide the data set into two parts, where all the numbers in the right hand side are greater than the numbers in the left hand side. This is done by moving two pointers from each edge towards the middle, exchanging the elements under the pointers if the element under the left hand one is greater than the element under the right hand one. 2: The same treatment is now applied to each half of the data set, and so on, until the data set is so divided that every individual element is sorted. Here's a worked example: 7 10 3 8 / 5 9 6 11 Swap 5 with 8, and 6 with 10 so all the right hand side are greater than all the left. 7 6/35/8 9/ 10 11 Swap 3 with 6 and 5 with 7. No swaps needed on the right. 5/3/6/7/8/9/10/11 Swap 3 with 5. No further swaps needed. The next program and Table II detail the appropriate subroutine.
Table II: Variables used in quick sort Shell sort We have already mentioned that some sorting methods work faster if the data is already partly in order. What is needed is some method of getting the smaller numbers down to the left hand side of the data set as quickly as possible. The final program, shell sort, is an attempt to do just this. Instead of just sorting adjacent numbers, it works on numbers separated by some distance. This distance is reduced until on the final pass the distance is one. The data is then sorted properly. This method is a modification of the insertion sort. The distance separating the items to be sorted is often derived by successively applying the formula: Distance - 3 x Distance + 1 Where Distance starts at one. The formula has been found by trial and error, and mathematicians have struggled in vain to find why it works so well. Using it we can calculate, for instance, the distances for sorting 50 items as 40,13,4,1. Here's how the algorithm works: 1: Find the largest distance that will just fit. 2: Do insertion sort at that distance. 3: Decrease the separation to the next step down. 4: Repeat steps two and three until the data set has been sorted at distance one. The data will now be in order. Here's an example: There are only nine elements so using the sequence 40 13 4 1, four is the first distance we can use. So: 7 10 3 8 5 9 6 11 2 First 7, 5 and 2 are sorted giving: 2 10 3 8 5 9 6 11 7 Now 10 and 9: 2 9 3 8 5 10 6 11 7 Next 3 and 6: 2 9 3 8 5 10 6 11 7 The first pass through the data is complete. The distance is now reduced to one, and the data is insertion sorted in the normal fashion. Notice how after the first pass the data is now partly sorted, allowing the insertion sort to run quickly.
Table III: Variables used in shell sort We have used the testing program to do some timings on each of the algorithms described. The results are shown in Table IV. Although these are for Basic, the conclusions apply to implementations in any language. The timings are in seconds and those in italics are predicted rather than measured. Note that insertion sort, which is the fastest algorithm apart from quick and shell sort, would take about 33 hours to process 10,000 items compared to 20 minutes for quick sort. Some rules of thumb based on the number of items needing to be sorted can be drawn from this table:
These rules of thumb are only intended as a rough guide and it is worth noting that quick sort works well in almost any situation. We hope that this introduction to data sorting has proved both interesting and painless. In the near future we'll publish a machine code sort routine which works on string arrays.
Table III: Variables used in shell sort CWTA |
|
|