Display context menus for TreeView items in C#

[context menus]

This example displays different context menus when you right-click on different kinds of nodes in a TreeView control. In this example, different nodes represent factories, groups, or individuals. Different node types display different context menus.

When it starts, the program adds data to its TreeView control just as the example Display tooltips for TreeView nodes in C# does. When you press the right mouse button down over a node, the following MouseDown event handler executes.

// Display the appropriate context menu.
private void trvOrg_MouseDown(object sender, MouseEventArgs e)
{
    // Make sure this is the right button.
    if (e.Button != MouseButtons.Right) return;

    // Select this node.
    TreeNode node_here = trvOrg.GetNodeAt(e.X, e.Y);
    trvOrg.SelectedNode = node_here;

    // See if we got a node.
    if (node_here == null) return;

    // See what kind of object this is and
    // display the appropriate popup menu.
    if (node_here.Tag is FactoryData)
        ctxFactory.Show(trvOrg, new Point(e.X, e.Y));
    else if (node_here.Tag is GroupData)
        ctxGroup.Show(trvOrg, new Point(e.X, e.Y));
    else if (node_here.Tag is PersonData)
        ctxPerson.Show(trvOrg, new Point(e.X, e.Y));
}

If the mouse button is not the right button, the code does nothing. If this is the right button, the code uses the TreeView control’s GetNodeAt method to find the node under the mouse. It sets the TreeView control’s SelectedNode property to that node. If the node is null, the event handler then exits.

If the code gets this far (it’s the right mouse button and there is a node under the mouse), then the program examines the type of object contained in the node’s Tag property and displays the appropriate context menu. The parameters to the calls to the context menus’ Show method tell which control is associated with the menu and where the menu should be positioned.

The context menus all have a Delete command labeled appropriately: Delete Factory, Delete Group, or Delete Person.

The following code shows the code behind each of the context menus’ Delete commands.

private void mnuFactoryDelete_Click(object sender, EventArgs e)
{
    trvOrg.SelectedNode.Remove();
}

private void mnuGroupDelete_Click(object sender, EventArgs e)
{
    trvOrg.SelectedNode.Remove();
}

private void mnuPersonDelete_Click(object sender, EventArgs e)
{
    trvOrg.SelectedNode.Remove();
}

Each of the commands simply deletes the TreeView control’s selected node, which was set by the control’s MouseDown event handler.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in controls, user interface | Tagged , , , , , , , , , , | Leave a comment

Enumerate TreeView nodes in C#

[enumerate TreeView nodes]

The previous two posts show two ways you can enumerate TreeView nodes that are checked. The idea is to recursively crawl over the tree’s nodes and pick out the checked ones.

Unfortunately that method isn’t easily extendable. For example, if you wanted to find the nodes that were expanded, you would need to write new code to do that. The new code would be almost the same as the old code but with a different test to see whether a node should be included in the result.

This example shows how you can enumerate TreeView nodes to examine every node. Then the main program can test to see which nodes it needs to use.

To make using the code easier, the example defines three extension methods for the TreeView class.

The first two are shown in the following code. They work together so, to make it easier to tell which one is which, I’ve color coded their calls to each other.

// Return the nodes in this collection and their descendants.
public static IEnumerable<TreeNode> Descendants(
    this TreeNodeCollection nodes)
{
    foreach (TreeNode node in nodes)
    {
        yield return node;
        foreach (TreeNode child in node.Nodes.Descendants())
        {
            yield return child;
        }
    }
}

// Return this node and its descendants.
public static IEnumerable<TreeNode> Descendants(this TreeNode node)
{
    yield return node;
    foreach (TreeNode child in node.Nodes.Descendants())
    {
        yield return child;
    }
}

The first (blue) method works on a TreeNodeCollection. It loops through the nodes in the collection and calls the second (red) method for each.

The first (blue) method works on a TreeNodeCollection. It loops through the nodes in the collection and yields each node for the enumeration. It then calls the second (red) method for each node’s Nodes collection to enumerate the node’s descendants. It loops through the returned nodes and yields them.

The second (red) method works on a single TreeNode. It first yields the node. Then it uses the first (blue) method to enumerate the node’s descendants. It loops through the descendants and yields them.

This highlights a common complaint about the yield statement. There’s no easy way to return the results of another IEnumerable method. Instead you need to iterate through it and return each of the values individually.

This example also defines the following extension method for the TreeView control.

// Return all of the TreeView's nodes.
public static IEnumerable<TreeNode> Descendants(this TreeView trv)
{
    foreach (TreeNode node in trv.Nodes.Descendants())
        yield return node;
}

This method simply invokes the earlier (blue) one to enumerate TreeView nodes in the control’s Nodes collection.

This is probably the most useful of the three extension methods. In fact you could make the other two private so they would not be usable outside of the class that defines them.

Now that the program has a way to enumerate TreeView nodes, it uses code similar to the following to find the nodes that are checked.

// List the checked TreeNodes.
private void btnShowChecked_Click(object sender, EventArgs e)
{
    string results = "";
    foreach (TreeNode node in trvMeals.Descendants())
    {
        if (node.Checked) results += node.Text + "\n";
    }
    MessageBox.Show(results);
}

This code uses the last (green) extension method to enumerate the control’s nodes. If a node is checked, it adds it to a string. When it’s finished, the program displays the string.

You might wonder whether this is as efficient as the methods used by the earlier posts that test each node to see if it is checked before adding it to the enumeration. In either case you need visit every node and test every node to see if it’s checked, so the number of steps is the same. The earlier versions return fewer nodes in the enumeration and there’s probably some overhead for yielding a node, so they may be a bit more efficient, but the difference shouldn’t be huge.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, controls, recursion, user interface | Tagged , , , , , , , , , , , , , , , , | Leave a comment

Yield checked TreeView nodes in C#

[checked TreeView nodes]

The previous example used recursion to traverse the nodes in a TreeView control and fill a list with the nodes that are checked. This example uses a slightly different approach. It traverses the TreeView control’s nodes and uses the yield return statement to return the checked TreeView nodes. The main program can then iterate over the returned nodes.

The following method recursively crawls through the nodes in a TreeNodeCollection and yields the checked TreeView nodes.

// Return a list of the TreeNodes that are checked.
private IEnumerable<TreeNode> CheckedNodes(TreeNodeCollection nodes)
{
    foreach (TreeNode node in nodes)
    {
        // Yield this node.
        if (node.Checked) yield return node;

        // Yield the checked descendants of this node.
        foreach (TreeNode checked_child in CheckedNodes(node.Nodes))
            yield return checked_child;
    }
}

The method loops through the TreeNodeCollection. If a node is checked, it uses yield return to return that node.

The method then calls itself recursively to find the checked descendants of that node that are in its Nodes collection. It iterates over those descendants and uses yield return to return them.

This is one of the more common complaints about the yield return syntax: it doesn’t allow you to return an IEnumerable in a single statement. For example, it might be nice to be able to do something like yield return CheckedNodes(nodes.Nodes).

Unfortunately you can’t do that. Instead you must iterate through another IEnumerable and return each item individually.

If you think about the way yield works, however, you’ll realize that it doesn’t make much difference in terms of performance. The yield statement returns a value to the loop that is iterating over its values and then suspends its own execution until the next time a value is needed. Being able to return an IEnumerable in a single statement might be syntactically convenient but it wouldn’t change the fact that the program needs to wait between each returned value.

But back to this example. To make it easier to find the control’s checked TreeView nodes, the program defines the following method.

// Return a list of the checked TreeView nodes.
private IEnumerable<TreeNode> CheckedNodes(TreeView trv)
{
    return CheckedNodes(trv.Nodes);
}

This method simply invokes the previous one to find the checked TreeView nodes in the control’s Nodes collection.

Notice that this method simply returns the result of the call to the other CheckedNodes method without using yield return. If a method uses yield return, then Visual Studio knows that the method is an iterator and it can only return values with yield return.
That’s not the case here, however. This method doesn’t use yield return so it’s allowed to use a normal return statement to return the IEnumerable result of the method call.

Finally the main program uses the following code to invoke this method.

// List the checked TreeNodes.
private void btnShowChecked_Click(object sender, EventArgs e)
{
    string results = "";
    foreach (TreeNode node in CheckedNodes(trvMeals))
        results += node.Text + "\n";
    MessageBox.Show(results);
}

This code iterates through the nodes returned by the preceding CheckedNodes method, adding their text to a string. When it’s done, the code displays the string.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, controls, recursion, user interface | Tagged , , , , , , , , , , , , , , | Leave a comment

Make a list of checked TreeView nodes in C#

[example]

If you set a TreeView control’s CheckBoxes property to true, then the control displays boxes that the user can check to select nodes. In that case you will probably need to find the checked TreeView nodes at some point.

Strangely the TreeView control doesn’t provide a simple method for finding the checked TreeView nodes. In fact, it doesn’t even have a simple way to enumerate the control’s nodes so you can see which ones are checked. Both of those omissions seem odd. After all, the ListBox control provides five properties that help you figure out which items are selected: SelectedIndex, SelectedIndices, SelectedItem, SelectedItems, and SelectedValue.

One way to find the checked TreeView nodes is to recursively crawl over the tree’s nodes and find them. The TreeView control has a Nodes property. That property has type TreeNodeCollection, and it contains the control’s top-level nodes.

Each node in the tree has a similar Nodes property that contains its child nodes.

To enumerate every node in the tree, you can write a method that enumerates the nodes within a TreeNodeCollection and their descendants in the tree. This example uses the following method to build a list of the checked TreeView nodes.

// Return a list of the TreeNodes that are checked.
private void FindCheckedNodes(
    List<TreeNode> checked_nodes, TreeNodeCollection nodes)
{
    foreach (TreeNode node in nodes)
    {
        // Add this node.
        if (node.Checked) checked_nodes.Add(node);

        // Check the node's descendants.
        FindCheckedNodes(checked_nodes, node.Nodes);
    }
}

The method takes as a parameter a List<TreeNode> where it will place the checked TreeView nodes. It also takes as a parameter a TreeNodeCollection.

The method loops through the nodes in the collection. It checks each node and then recursively calls itself to check the node’s children stored in its Nodes property.

The following method wraps the call to FindCheckedNodes for the TreeView control.

// Return a list of the checked TreeView nodes.
private List<TreeNode> CheckedNodes(TreeView trv)
{
    List<TreeNode> checked_nodes = new List();
    FindCheckedNodes(checked_nodes, trvMeals.Nodes);
    return checked_nodes;
}

This method simply creates a List<TreeNode> and then calls the FindCheckedNodes method for the TreeView control’s Nodes collection.

When you click the program’s Show Checked button, it uses the following code to get the list of checked TreeView nodes.

// Get the checked nodes.
List<TreeNode> checked_nodes = CheckedNodes(trvMeals);

The code that displays the list is straightforward so it isn’t shown here. Download the example to see the details.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, controls, recursion, user interface | Tagged , , , , , , , , , , , , , | Leave a comment

Display tooltips for TreeView nodes in C#

[example]

At design time, I added a TreeView control to the form. I also added an associated ImageList control to hold images for the TreeView control’s nodes, and I set the TreeView control’s ImageList property to the ImageList control. Finally I added a ToolTip component to the form and named it ttOrg.

When the program starts, it makes three types of TreeView nodes representing factories, groups, and persons. It makes an associated object for each node and assigns it to the node’s Tag property to give the node’s extra information about the objects they represent.

In this example the object classes FactoryData, GroupData, and PersonData really just hold names to display in the tooltips, but in a real program you could give them more interesting information and methods.

The following code shows how the program makes a new factory node.

factory = AddTreeViewNode(trvOrg.Nodes,
     "R & D", imFactory,
    new FactoryData("Factory: R & D"));

Here the parameters to the AddTreeViewNode method represent:

  • The TreeView control’s Nodes collection
  • The text to display on the node
  • A number giving the index of the node’s picture in the ImageList control
  • The object that should be stored in the new node’s Tag property

The following code shows the AddTreeViewNode method that the previous code invokes.

// Add a new node to the collection.
private TreeNode AddTreeViewNode(TreeNodeCollection parent_nodes,
    string text, int image_index, object tag_object)
{
    TreeNode new_node = parent_nodes.Add(text);
    new_node.ImageIndex = image_index;
    new_node.SelectedImageIndex = image_index;
    new_node.Tag = tag_object;
    return new_node;
}

The method adds a new TreeNode to the TreeNodeCollection passed in as the first parameter. It sets the node’s ImageIndex to indicate which ImageList image should be displayed for this node. Finally it sets the node’s Tag property to the associated object.

The following code shows the simple FactoryData class used by this example.

public class FactoryData
{
    public string Name = "";

    // Initializing constructor.
    public FactoryData(string new_name)
    {
        Name = new_name;
    }
}

While the program is running, the TreeView control’s trvOrg_MouseMove event handler shown in the following code displays the tooltips.

// Display the appropriate tooltip.
private TreeNode old_node = null;
private void trvOrg_MouseMove(object sender, MouseEventArgs e)
{
    // Find the node under the mouse.
    TreeNode node_here = trvOrg.GetNodeAt(e.X, e.Y);
    if (node_here == old_node) return;
    old_node = node_here;

    // See if we have a node.
    if (old_node == null)
    {
        ttOrg.SetToolTip(trvOrg, "");
    }
    else
    {
        // Get this node's object data.
        if (node_here.Tag is FactoryData)
        {
            FactoryData factory_data = node_here.Tag as FactoryData;
            ttOrg.SetToolTip(trvOrg, factory_data.Name);
        }
        else if (node_here.Tag is GroupData)
        {
            GroupData group_data = node_here.Tag as GroupData;
            ttOrg.SetToolTip(trvOrg, group_data.Name);
        }
        else if (node_here.Tag is PersonData)
        {
            PersonData person_data = node_here.Tag as PersonData;
            ttOrg.SetToolTip(trvOrg, person_data.Name);
        }
    }
}

This event handler uses the TreeView control’s GetNodeAt method to see which node is under the mouse. If this is the same as the previously found node, the event handler returns.

If the new node is null, the code sets the TreeView control’s ToolTip to an empty string to remove the previous tooltip.

If the new node is not null, the code examines the node’s type and sets the ToolTip appropriately.

The key is the GetNodeAt method, which you can use to find the node under the mouse.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in controls, user interface | Tagged , , , , , , , , , , , , , | 1 Comment

Compare sorting algorithms in C#, article 5 of 5

Countingsort

[sorting]

The discussion of Quicksort in my previous post mentions that the fastest possible sorting algorithms that use comparisons take on the order of N * log(N) steps. Countingsort does not use comparisons so it is not bound by that result. In fact Countingsort is so fast that it seems to sort using magic rather than comparisons.

Unfortunately Countingsort only works well very specific circumstances. First, the items you are sorting must be integers. You cannot use Countingsort to sort strings. Second, the range of values that the items have must be fairly limited. If your items range from 1 to 100, Countingsort will work extremely well. If the items range from 1 to 30 thousand, Countingsort still be fast but not quite as fast. If the items range from 1 to 10 billion, you may be better off with Array.Sort.

The basic idea is that the algorithm counts the number of items that have each value. For example, if the list contains 39 items that have value 17, then the algorithm needs to know that.

The algorithm starts by allocating a temporary counts array of integers to hold the count for each value. This array has bounds that cover the range of the items. If the items range in value from 0 to MaxValue, then the algorithm creates an array with indexes ranging from 0 to MaxValue like this:

int[] counts = new int[MaxValue + 1];

Because C# arrays always have 0 for a lower bound, working with a range of values that doesn’t start at 0 is trickier. For example, if values range from -100 to +100, the algorithm must take extra steps to translate those indexes into something that will fit in an array. I’ll leave that as an exercise for you to work on.

Next, the algorithm looks through the items in the list and increments the counts entry corresponding to that item. When this stage is finished, counts[i] holds a count of the number of items that have the value i.

Note that allocating the counts array initializes all of its entries to 0 so the algorithm doesn’t need to do that.

// Count the items.
for (int i = 0; i < NumItems; i++)
{
    counts[values[i]]++;
}

The algorithm then runs through the counts array copying the necessary number of items with a particular value into the sorted array. For example, if counts[7] is 12, then the program writes the value 7 into the sorted array 12 times. The following code shows this step.

// Place items in the sorted array.
int index = 0;
for (int i = 0; i <= MaxValue; i++)
{
    for (int j = 0; j < counts[i]; j++)
    {
        values[index++] = i;
    }
}

The following code shows the entire algorithm. It’s remarkably simple.

// Countingsort.
private void Countingsort(int[] values)
{
    // Create the Counts array.
    int[] counts = new int[MaxValue + 1];

    // Count the items.
    for (int i = 0; i < NumItems; i++)
    {
        counts[values[i]]++;
    }

    // Place items in the sorted array.
    int index = 0;
    for (int i = 0; i <= MaxValue; i++)
    {
        for (int j = 0; j < counts[i]; j++)
        {
            values[index++] = i;
        }
    }
}

To sort N numbers that have a range that spans M values, Countingsort executes roughly 2 * N + M steps.

First it reads the N items to fill in the counts array. That takes N steps.

Then the algorithm runs through the M values in the counts array and adds their values to the sorted array. It takes M steps to run through the counts array and the code drops N values into the sorted array, so this second step takes roughly N + M steps.

That makes the total run time on the order of N + (N + M) = 2 * N + M.

If N is large and M is relatively small, then 2 * N + M is much faster than the N * log(N) performance given by Quicksort. To sort 30,000 numbers that ranged from 1 to 1,000, for example, Quicksort might execute more than 300,000 steps. Countingsort would execute only about 61,000 steps and take about a fifth as long.

If you look at the picture above, you’ll see that Array.Sort took 2.38 seconds to sort 30,000 items 1,000 times with values between 0 and 1,000 while Countingsort took only 0.58 seconds.

This algorithm easily beats Array.Sort but only when the values have a limited range.

Conclusion

[Book: Essential Algorithms]

Which sorting algorithm you should choose depends on your data. The Array.Sort method provided by the .NET Framework works extremely well and is easy to use so it should be your first choice. If you find that Array.Sort isn’t giving you good enough performance and you have special information about the data you are sorting, you may be able to improve performance by using another algorithm. If the data is almost completely sorted already, you may get better performance from Bubblesort. If the list of items is very small, you may get better performance from Bubblesort or Selectionsort. Finally if the range of values used by the data is very small, Countingsort may be much faster than Array.Sort and the other methods.


Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, performance | Tagged , , , , , , , , , , , , , , | Leave a comment

Compare sorting algorithms in C#, part 4 of 5

Quicksort

[sorting]

Quicksort is a recursive algorithm that uses a divide-and-conquer technique. While the list of items to be sorted contains at least two items, Quicksort divides it into two sublists and recursively calls itself to sort the sublists.

The Quicksort method first checks to see if the list it is sorting contains fewer than two items. If so, it simply returns because a list containing zero or one items is already sorted.

if the list contains two or more items, the method picks an item from the array to use as a dividing item. It then moves all of the items that belong before this dividing item in the left part of the array, and it places all of the other items in right part of the array. The method then recursively calls itself to sort two smaller sublists.

There are several ways in which the Quicksort method might pick the dividing item. One of the easiest is to simply use the first item in the sublist being sorted. If the list is initially arranged randomly, that item will probably be a reasonable choice. Chances are good that the item will belong somewhere in the middle of the sorted list so the two sublists the algorithm creates will be reasonably equal in size.

However, if the numbers are initially sorted or almost sorted, or if they are initially sorted in reverse order, then this method fails miserably. In that case the first item in the list will divide the list into one sublist that contains almost every item and another that contains almost no items. Since the larger sublist does not shrink much, the algorithm makes little headway.

In this case the algorithm will require on the order of N2 steps. This is the same order of performance given by Selectionsort, but this algorithm is much more complicated so it’s even slower.

A better method for selecting the dividing item is to choose one randomly. Then no matter how the items in the list are initially arranged, chances are good that the selected item will belong somewhere in the middle of the sorted list and the sublists will be fairly evenly sized.

As long as the sublists are fairly equal in size, the algorithm will require on the order of N * log(N) steps. It can be proven that this is the fastest time possible for a sorting algorithm that sorts by using comparisons. By adding a little randomness, this algorithm avoids the possibility of its worst case N2 behavior and gives an expected case performance of N * log(N).

Quicksort is very fast in practice as well as theory, so it is the favorite sorting algorithm of many programmers. For all of these reasons, I’m also pretty sure it’s the algorithm used by the Array.Sort method.

The following code shows the example program’s method that starts the Quicksort algorithm. This is the method that the program passes into the RunAlgorithm method.

// Quicksort.
private void Quicksort(int[] values)
{
    Quicksort(values, 0, NumItems - 1);
}

This code simply calls an overloaded version of the Quicksort method that takes as parameters the array of values, and the indexes of the first and last items that the method should sort. Those indexes tell the method which part of the array represents the sublist that it should sort.

The following code shows the version of the Quicksort method that takes those indexes as parameters.

// Use Quicksort to recursively sort
// the items with indexes min <= i <= max.
private void Quicksort(int[] values, int min, int max)
{
    // If the list has no more than 1 element, it's sorted.
    if (min >= max) return;

    // Pick a dividing item.
    int i = Rand.Next(min, max + 1);
    int med_value = values[i];

    // Swap it to the front so we can find it easily.
    values[i] = values[min];

    // Move the items smaller than this into the left
    // half of the list. Move the others into the right.
    int lo = min;
    int hi = max;
    for (;;)
    {
        // Look down from hi for a value < med_value.
        while (values[hi] >= med_value)
        {
            hi--;
            if (hi <= lo) break;
        }
        if (hi <= lo)
        {
            values[lo] = med_value;
            break;
        }

        // Swap the lo and hi values.
        values[lo] = values[hi];
        
        // Look up from lo for a value >= med_value.
        lo++;
        while (values[lo] < med_value)
        {
            lo++;
            if (lo >= hi) break;
        }
        if (lo >= hi)
        {
            lo = hi;
            values[hi] = med_value;
            break;
        }
        
        // Swap the lo and hi values.
        values[hi] = values[lo];
    }
    
    // Sort the two sublists
    Quicksort(values, min, lo - 1);
    Quicksort(values, lo + 1, max);
}

[Book: Essential Algorithms]

This code follows the description shown earlier. It picks a dividing item, moves the items to either side of that item, and then calls the method recursively to sort the two sublists.

Although I think Array.Sort uses the Quicksort algorithm, its implementation is better than the one I can build in C# code so it is consistently faster.



Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, performance | Tagged , , , , , , , , , , , , , | Leave a comment

Compare sorting algorithms in C#, part 3 of 5

Selectionsort

[sorting]

Selectionsort is a very simple algorithm. First you search the list for the smallest item. Then you swap that item with the item at the top of the list. Next you find the second smallest item and swap it with the second item in the list. You continue finding the next smallest item and swapping it into its final position in the list until you have swapped all of the items into their final positions.


The following code shows the example program’s implementation of Selectionsort.

// Selectionsort.
private void Selectionsort(int[] values)
{
    for (int i = 0; i < NumItems - 1; i++)
    {
        int best_value = values[i];
        int best_j = i;
        for (int j = i + 1; j < NumItems; j++)
        {
            if (values[j] < best_value)
            {
                best_value = values[j];
                best_j = j;
            }
        }
        values[best_j] = values[i];
        values[i] = best_value;
    }
}

If the list contains N items, then when you look for the K-th smallest item you must examine each of the N – K items that you have not yet placed in their final positions. That means the total number of steps the algorithm needs is:

N + (N - 1) + (N - 2) + ... + 1 = N * (N + 1) / 2

This function is on the order of N2. That means if you increase the number of items in the list by a factor of 2, the run time of the algorithm increases by a factor of roughly 22 = 4.

[Book: Essential Algorithms]

Several other sorting algorithms require only about N * log(N) steps (Quicksort is one that I’ll describe later), so Selectionsort is not a very fast algorithm for large lists.

Selectionsort is fine for small lists, however. It is very simple so it’s easy to program, debug, and maintain over time. In fact it is so simple that it is actually faster than the more complicated algorithms if the list you are sorting is very small. If your list contains only a dozen or so items, Selectionsort is often competitive with other algorithms.



Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, performance | Tagged , , , , , , , , , , , , , , | Leave a comment

Compare sorting algorithms in C#, part 2 of 5

Bubblesort

[sorting]

Bubblesort is a specialized algorithm designed for sorting items that are already mostly in sorted order. If only one or two items in your list are out of order, Bubblesort is very fast. If the items in your list are initially arranged randomly, Bubblesort is extremely slow. For this reason you should be careful when you use Bubblesort.

The idea behind the algorithm is to scan through the list looking for two adjacent items that are out of order. When you find two such items, you swap them and continue down the list. You repeat this process until all of the items are in order.

Figure 1 shows a small list where the item 1 is out of order. During the first pass through the list, the algorithm finds that the items with values 4 and 1 are out of order so it swaps them. During the next pass it finds that the items with values 3 and 1 are out of order so it swaps them. During the third pass it swaps the items with values 2 and 1, and the list is sorted. The way in which the item with value 1 seems to bubble towards the top of the list is what gives Bubblesort its name.

Pass 1 Pass 2 Pass 3 Pass 4
2 2 2 1
3 3 1 2
4 1 3 3
1 4 4 4

Figure 1. In a Bubblesort, item 1 slowly “bubbles” to the top.

You can improve the algorithm if you alternate upward and downward passes through the list. During downward passes an item that is too far down in the list, like the item with value 1 in the previous example, can move up only one position. In contrast, an item that is too far up in the list might move many positions. If you add upward passes through the list, items can quickly move many positions both up and down through the list.

If you study the algorithm for a while, you’ll see that at least one new item reaches its final position during each pass through the list. If the items in your list begin mostly in sorted order, the algorithm will need only one or two passes through the list to finish the ordering. If you have a list of 1,000 items with only one out of order, the algorithm would require only 2 passes taking 2,000 steps to sort the list.

If the items begin arranged randomly, the algorithm may need one pass per item in the list. The algorithm would need up to 1 million steps to arrange a list of 1,000 items.

The following code shows the C# code for this Bubblesort algorithm.

// Bubblesort with:
//   - Alternating upward and downward passes
//   - Holding bubbled item in a temporary variable
//   - Updating min and max to narrow the search range
private void Bubblesort(int[] values)
{
    int min = 0;
    int max = NumItems - 1;
    while (min < max)
    {
        // Bubble up.
        int last_swap = min - 1;
        // For i = min + 1 To max
        int i = min + 1;
        while (i <= max)
        {
            // Find a bubble.
            if (values[i - 1] > values[i])
            {
                // See where to drop the bubble.
                int tmp = values[i - 1];
                int j = i;
                do
                {
                    values[j - 1] = values[j];
                    j++;
                    if (j > max) break;
                } while (values[j] < tmp);
                values[j - 1] = tmp;
                last_swap = j - 1;
                i = j + 1;
            } 
            else
            {
                i++;
            }
        }
        // Update max.
        max = last_swap - 1;

        // Bubble down.
        last_swap = max + 1;
        // For i = max - 1 To min Step -1
        i = max - 1;
        while (i >= min)
        {
            // Find a bubble.
            if (values[i + 1] < values[i])
            {
                // See where to drop the bubble.
                int tmp = values[i + 1];
                int j = i;
                do
                {
                    values[j + 1] = values[j];
                    j--;
                    if (j < min) break;
                } while (values[j] > tmp);
                values[j + 1] = tmp;
                last_swap = j + 1;
                i = j - 1;
            }
            else
            {
                i--;
            }
        }
        // Update min.
        min = last_swap + 1;
    }
}

[Book: Essential Algorithms]

Bubblesort works quite well when the list of items has very few items out of order. Array.Sort is so fast, however, that it beats Bubblesort unless the list holds only a couple of items that are out of order.

Bubblesort is also quite fast for very small lists (3 or 4 items) and it often beats Array.Sort in that case. However in that situation the total time for sorting is so small that it may be better to just use Array.Sort so you don’t have to worry about bugs in your Bubblesort code.



Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, performance | Tagged , , , , , , , , , , , , , , | Leave a comment

Compare sorting algorithms in C#, post 1 of 5

[sorting]

This post begins a series of articles about sorting algorithms. Much of the text originally appeared in the December 1995 issue of Visual Basic Developer magazine, a publication that seems to no longer exist. All of these posts include the same download: an example program that demonstrates all of the sorting algorithms described in the articles. You can use the example to compare the different algorithms under difference circumstances. Keep in mind that some of the algorithms may take a very long time depending on the kind of data you are sorting. For example, Bubblesort will take a very long time to sort 1 million randomly arranged items.

Why So Many Algorithms?

The .NET Framework’s Array class provides a Sort method that sorts an array very quickly, apparently by using Quicksort. In most cases, you can simply use Array.Sort to get very good performance. Other built-in sorting tools such as LINQ use the same method so they also provide good performance.

The Framework’s implementation is quite good, so you are unlikely to beat it’s performance for “random” data, so why should you bother to learn how these algorithms work?

[Essential Algorithms: A Practical Approach to Computer Algorithms]

Algorithm books spend a great deal of time on sorting algorithms. For example, my book Essential Algorithms: A Practical Approach to Computer Algorithms spends a full chapter describing different sorting algorithms. There are several reasons why sorting algorithms are so popular.

First, sorting is a common task in computer applications. Almost any data is more meaningful if you can sort it and display it in various ways.

Second, sorting algorithms demonstrate many important algorithmic techniques such as binary subdivision, recursion, and linear interpolation. Studying sorting algorithms will teach you techniques that you may be able to use in other parts of your applications.

Third, different sorting algorithms behave differently for different data so no single algorithm is best under all circumstances. Often Quicksort provides excellent performance, but under some circumstances Bubblesort or Countingsort can give you much better performance.

Finally complex algorithms are just plain fun! Okay, I admit I’m an algorithm nerd, but I think of these things as puzzles. It’s more fun for me to get a complex algorithm working correctly than it is to work a Sudoku puzzle.

This series of posts describes several important sorting algorithms, including a Quicksort algorithm that is similar to the one used by Array.Sort. Depending on the circumstances, you can pick the algorithm that best suits your data.

Array.Sort

The Array class’s Sort method sorts arrays very quickly. It’s easy to use and you don’t need to debug it, so I recommend that you start by using it and decide whether it is fast enough for your needs. If Array.Sort is fast enough, then you’re done and you don’t need to implement, debug, and maintain a more complicated algorithm. If you have performance problems and know something about your data that will let one of the other algorithms give you better performance, then you can implement a different algorithm.

The example program uses Array.Sort as a benchmark for the other algorithms. The following code shows how the program runs a test of the Array.Sort method.

// Sort with Array.Sort.
private void btnArraySort_Click(object sender, EventArgs e)
{
    RunAlgorithm(txtTimeArraySort, ArraySort);
}

// Run an algorithm.
private delegate void Algorithm(int[] values);
private void RunAlgorithm(TextBox time_textbox, Algorithm algorithm)
{
    time_textbox.Clear();
    Cursor = Cursors.WaitCursor;
    Refresh();
    int num_trials = int.Parse(txtNumTrials.Text);

    // Prepare the data.
    PrepareData();

    Stopwatch watch = new Stopwatch();
    for (int trial = 0; trial < num_trials; trial++)
    {
        // Copy the unsorted data into the SortedValues array.
        Array.Copy(Values, SortedValues, NumItems);

        // Run the algorithm.
        watch.Start();
        algorithm(SortedValues);
        watch.Stop();
    }

    // Display the total time.
    time_textbox.Text =
        watch.Elapsed.TotalSeconds.ToString("0.00") + " sec";

    // Verify that the algorithm worked.
    Verify();

    Cursor = Cursors.Default;
}

When you click the Array.Sort button, the button’s event handler calls the RunAlgorithm method passing it an algorithm and the TextBox where it should display the elapsed time.

The next piece of code declares a delegate named Algorithm that represents a method that takes an array of integers as a parameter.

The RunAlgorithm method takes a TextBox and an Algorithm as parameters. It performs some setup and calls PrepareData to make the data that will be sorted.

The PrepareData method is relatively straightforward so it isn’t shown here. It just generates random numbers as specified by the parameters you enter on the form. If the Sorted box is checked, the method sorts the data and then switches some items around to make “# Unsorted” of them be unsorted. Download the example and look at the code to see how it works.

Next the RunAlgorithm method performs the trials. For each trial the code copies the unsorted values into the SortedValues array and then calls the current algorithm to sort the SortedValues array.

The method finishes by displaying the total elapsed time.

Note that the timing is approximate. Many trials can even out small, random fluctuations, but for extremely fast tests such as sorting 5 items, the timing may not be very good. In most cases the results are reasonable, however, and you can use them to compare different algorithms so I’m not going to worry about more accurate timers.

The following code shows how the program responds when you click the Array.Sort button.

// Sort with Array.Sort.
private void btnArraySort_Click(object sender, EventArgs e)
{
    RunAlgorithm(txtTimeArraySort, ArraySort);
}

This code simply calls RunAlgorithm, passing it the txtTimeArraySort TextBox and the ArraySort method to be tested. The code for the other algorithms is similar. Their event handlers just pass different result TextBox controls and algorithm methods.

The following code shows the ArraySort method.

// Sort with Array.Sort.
private void ArraySort(int[] values)
{
    // Sort.
    Array.Sort(values);
}

This code simply calls the Array class’s Sort method. Nothing could be simpler!

If you look closely at the picture at the top of this post, you’ll see that Array.Sort beat Quicksort by a fair margin but was beaten by Countingsort. (Bubblesort and Selectionsort would have taken forever so I didn’t run them on this data.)

[Book: Essential Algorithms]

It’s important to note, however, that this test used 10,000 items with values between 0 and 10,000. Different algorithms perform differently with different kinds of data, so Array.Sort isn’t always fastest. It’s often fastest, however, and very easy to use, so it’s a good first choice.

My next few posts describe the other algorithms in greater detail so you can learn when to use which algorithm.



Download Example   Follow me on Twitter   RSS feed   Donate




Posted in algorithms, performance | Tagged , , , , , , , , , , , , , , | 6 Comments