Use branch and bound to solve the partition problem in C#

[partitioning problem]

The post Use exhaustive search to solve the partition problem in C# explains the partition problem and how you can use an exhaustive search to find solutions for it.

Unfortunately the number of possible solutions is enormous. If you are dividing N items, then there are 2N possible ways you can divide them into two groups. That means there are too many possible solutions to try them all unless N is quite small. For example, if you can examine 1 million possible solutions per second, it would take 18 minutes to examine every possible solution for 30 items, 12.7 days for 40 items, and 35.7 years for 50 items.

Branch and bound is a technique for reducing the number of branches that you need to visit while searching a tree. You can consider the partitioning problem as a tree search if you consider each decision whether to put an item in set 1 or set 2 as selecting a branch in the tree. If there are N items, then the tree is a full binary tree with N levels so it contains 2N leaf nodes corresponding to possible solutions.

In branch and bound, the program keeps track of the most improvement a branch could yield if you continued to follow it. Then if the improvement cannot possibly make the current test solution better than best solution found so far, the program stops exploring that branch.

For example, suppose you have already found a division of 30 items where the two sets’ values differ by 10. Now suppose you are considering a test solution that has already assigned 20 of the items so the sets in the test solution so far differ by 20 and the total value of the remaining 10 items is 5. Even if you add all of the remaining items to the smaller of the two sets, the best you could do would be to make the sets differ by 15. That’s not an improvement over the best solution found so far so there’s no point continuing to consider this test solution. That lets you skip the recursive calls that would visit the rest of the tree and that can potentially save a huge amount of time.

The following version of the PartitionValuesFromIndex method takes an additional unassigned_value parameter to keep track of the total amount of value that is not yet assigned in the test solution. It only calls itself recursively if that value can potentially improve the best solution.

// Partition the values with those before index start_index fixed.
// test_assignment is the test assignment so far.
// test_value = total value of the first set in test_assignment.
// Initially best assignment and its error are in
//     best_assignment and best_err.
// Update those to reflect any improved solution we find.
private void PartitionValuesFromIndex(int[] values,
    int start_index, int total_value, int unassigned_value,
    bool[] test_assignment, int test_value,
    ref bool[] best_assignment, ref int best_err)
    // If start_index is beyond the end of the array,
    // then all entries have been assigned.
    if (start_index >= values.Length)
        // We're done. See if this assignment is better than
        // what we have so far.
        int test_err = Math.Abs(2 * test_value - total_value);
        if (test_err < best_err)
            // This is an improvement. Save it.
            best_err = test_err;
            best_assignment = (bool[])test_assignment.Clone();

        // See if there's any way we can assign
        // the remaining items to improve the solution.
        int test_err = Math.Abs(2 * test_value - total_value);
        if (test_err - unassigned_value < best_err)
            // There's a chance we can make an improvement.
            // We will now assign the next item.
            unassigned_value -= values[start_index];

            // Try adding values[start_index] to set 1.
            test_assignment[start_index] = true;
            PartitionValuesFromIndex(values, start_index + 1,
                total_value, unassigned_value,
                test_assignment, test_value + values[start_index],
                ref best_assignment, ref best_err);

            // Try adding values[start_index] to set 2.
            test_assignment[start_index] = false;
            PartitionValuesFromIndex(values, start_index + 1,
                total_value, unassigned_value,
                test_assignment, test_value,
                ref best_assignment, ref best_err);

If you compare this example to the previous one, you’ll find that this one is much faster because it visits only a small fraction of the search tree.

It still visits some percentage of the tree, however, so the problem size that you can handle is still quite limited. In my next post, I’ll explain what options you have for solving bigger problems.

Download Example   Follow me on Twitter   RSS feed   Donate

This entry was posted in algorithms, mathematics and tagged , , , , , , , , , , , , , , , , , , , , . Bookmark the permalink.

6 Responses to Use branch and bound to solve the partition problem in C#

  1. Solve the partitioning problem by using an exhaustive search in C#

    In the partitioning problem, the goal is to divide a list of numbers into two sets that have the same total. My favorite story for the partition problem is the “booty division problem.” Suppose you and a friend find a buried treasure containing crowns, necklaces, goblets, and other goodies with different values. You want to divide the treasure exactly evenly into two piles, one for you and one for your friend. This is an NP-complete problem, which means (without getting technical) that it’s an extremely hard problem. Intuitively you can think of the problem as determining whether each number …

  2. jerry says:

    what initial values do you use when calling this?

    • RodStephens says:

      You can download the example program to see all of the details, but when the main program calls the method at the top level:

      values – The array of possible values to assign
      start_index – 0 to indicate that no values have yet been assigned
      total_value – The total of the values in the values array (the upper bound)
      test_assignment – Uninitialized
      test_value – 0 to indicate that the current (empty) solution has value 0
      best_assignment – Uninitialized
      best_err – The sum of all values

  3. John says:

    Is it possible to make these two subsets of equal size (half of the set size) ?
    Assuming that set size is always even.

    • RodStephens says:

      Obviously this only applies to a set with an even number of items. I can think of a couple of approaches.

      1. The more obvious approach is to just examine each solution when you find it and see if the two subsets have the same number of elements. Branch and bound will visit every even partitioning (if it doesn’t stop early) so it will find such a solution if there is one.

      2. A sneakier solution would be to add up all of the values. Say the total is 1000. Then add 1000 to every element’s value. In brief, the items are so large that the two subsets cannot have the same total value unless they have the same number of items.

      To see that, suppose you divide the elements into two groups of unequal size. Assume one group has K elements less than the other. Then the first group has at least 1000*K-1000 smaller total value than the larger group. (The “1000*K” is from the extra 1000s we added. The “-1000” assumes all of the items’ values are in that first group. For example, the items in the larger group could have value 0.)

      Rewriting this, the difference is at least 1000*(K-1). If K>1, then this value is at least 1000 so the two subsets cannot have the same total value. Furthermore, we assume that the number of items is even, so K must be even. (I’ll let you think about that one.) That means K is either 0 so the subsets have the same size, or K is greater than 1 and the subsets cannot have the same total value.

      What all this means is that the two subsets cannot have the same total value unless K=0 so the subsets contain the same number of items. The extra 1000s are just too big.

      Note that the second method will prefer subsets containing the same number of elements. The 1000s makes subsets with different numbers of values much more different than equal-sized subsets no matter how the items are divided, so it will pick out the same-sized subsets.

      You can make the final test in the first solution decide whether it should prefer sets with closer totals, more even divisions, or some combination of the two. For example, is it better to have two subsets with sizes that differ by 2 but with equal total values, or two subsets with the same number of items and total values that differ by 1?

      • RodStephens says:

        You could also modify the recursive step to see if it would be possible to end up with two sets of the same size. The sneaky solution would have the same result.

        It would be interesting to try all three and see which is fastest.

Comments are closed.