[Beginning Database Design Solutions, Second Edition]

Title: Use branch and bound to solve the partitioning problem in C#

[Use branch and bound to solve the partitioning problem in C#]

The post Use exhaustive search to solve the partition problem in C# explains the partition problem and how you can use an exhaustive search to find solutions for it.

Unfortunately the number of possible solutions is enormous. If you are dividing N items, then there are 2^N possible ways you can divide them into two groups. That means there are too many possible solutions to try them all unless N is quite small. For example, if you can examine 1 million possible solutions per second, it would take 18 minutes to examine every possible solution for 30 items, 12.7 days for 40 items, and 35.7 years for 50 items.

Branch and bound is a technique for reducing the number of branches that you need to visit while searching a tree. You can consider the partitioning problem as a tree search if you consider each decision whether to put an item in set 1 or set 2 as selecting a branch in the tree. If there are N items, then the tree is a full binary tree with N levels so it contains 2^N leaf nodes corresponding to possible solutions.

In branch and bound, the program keeps track of the most improvement a branch could yield if you continued to follow it. Then if the improvement cannot possibly make the current test solution better than best solution found so far, the program stops exploring that branch.

For example, suppose you have already found a division of 30 items where the two sets' values differ by 10. Now suppose you are considering a test solution that has already assigned 20 of the items so the sets in the test solution so far differ by 20 and the total value of the remaining 10 items is 5. Even if you add all of the remaining items to the smaller of the two sets, the best you could do would be to make the sets differ by 15. That's not an improvement over the best solution found so far so there's no point continuing to consider this test solution. That lets you skip the recursive calls that would visit the rest of the tree and that can potentially save a huge amount of time.

The following version of the PartitionValuesFromIndex method takes an additional unassigned_value parameter to keep track of the total amount of value that is not yet assigned in the test solution. It only calls itself recursively if that value can potentially improve the best solution.

// Partition the values with those before index start_index fixed. // test_assignment is the test assignment so far. // test_value = total value of the first set in test_assignment. // Initially best assignment and its error are in // best_assignment and best_err. // Update those to reflect any improved solution we find. private void PartitionValuesFromIndex(int[] values, int start_index, int total_value, int unassigned_value, bool[] test_assignment, int test_value, ref bool[] best_assignment, ref int best_err) { // If start_index is beyond the end of the array, // then all entries have been assigned. if (start_index >= values.Length) { // We're done. See if this assignment is better than // what we have so far. int test_err = Math.Abs(2 * test_value - total_value); if (test_err < best_err) { // This is an improvement. Save it. best_err = test_err; best_assignment = (bool[])test_assignment.Clone(); Console.WriteLine(best_err); } } else { // See if there's any way we can assign // the remaining items to improve the solution. int test_err = Math.Abs(2 * test_value - total_value); if (test_err - unassigned_value < best_err) { // There's a chance we can make an improvement. // We will now assign the next item. unassigned_value -= values[start_index]; // Try adding values[start_index] to set 1. test_assignment[start_index] = true; PartitionValuesFromIndex(values, start_index + 1, total_value, unassigned_value, test_assignment, test_value + values[start_index], ref best_assignment, ref best_err); // Try adding values[start_index] to set 2. test_assignment[start_index] = false; PartitionValuesFromIndex(values, start_index + 1, total_value, unassigned_value, test_assignment, test_value, ref best_assignment, ref best_err); } } }

If you compare this example to the previous one, you'll find that this one is much faster because it visits only a small fraction of the search tree.

It still visits some percentage of the tree, however, so the problem size that you can handle is still quite limited. In my next post, I'll explain what options you have for solving bigger problems.

Download the example to experiment with it and to see additional details.