[Beginning Database Design Solutions, Second Edition]

Title: Solution to puzzle: Zero rows and columns in an array in C#

This post gives four solutions to Puzzle: Zero rows and columns in an array in C#. If you want to try the puzzle for yourself, see that post before you read this one.

This is a fairly long post, but it's also fairly simple.

The obvious thing to try is to simply loop over the entries in the array and, when you find an entry that is zero, zero out the row an column containing that entry. Unfortunately that destroys information that you will need later. For example, suppose entry [3, 5] is 0. Then you need to zero out row 3 and column 5. Later when you consider entry [4, 5], that entry is 0 but you can't tell whether it is 0 because it was originally zero or because you set it to 0 when you visited entry [3, 5].

The most obvious solution is to make a second array. That array can either indicate where the 0s are in the original array or where they belong in the result. The first solution available for download uses the following code to demonstrate the first approach approach.

private void ZeroRowsAndColumns(int[,] values) { // Make an array to hold values // indicating where there are zeros. bool[,] is_zero = new bool[ values.GetUpperBound(0) + 1, values.GetUpperBound(1) + 1]; // Set is_zero for values that are 0. for (int row = 0; row <= values.GetUpperBound(0); row++) { for (int col = 0; col <= values.GetUpperBound(1); col++) { is_zero[row, col] = (values[row, col] == 0); } } // Zero out the appropriate rows and columns. for (int row = 0; row <= values.GetUpperBound(0); row++) { for (int col = 0; col <= values.GetUpperBound(1); col++) { // See if this entry in the original array is zero. if (is_zero[row, col]) { // Zero out this entry's row and column. for (int r = 0; r <= values.GetUpperBound(0); r++) { values[r, col] = 0; } for (int c = 0; c <= values.GetUpperBound(1); c++) { values[row, c] = 0; } } } } }

The code makes an array of Booleans named is_zero that is big enough to hold an entry for each of the values in the original array. (Note that the code assumes that Boolean arrays are initialized to hold false in every entry. This is true in C#.)

The program then loops through the original array and sets the corresponding Boolean value to true for the original entries that are 0. The code then loops over the Boolean array and, when it finds an entry that is true (indicating that the original array had a 0 in that position), the code zeros out the corresponding row and column.

There is a slight improvement you can make to this algorithm. Suppose you are looping through the Boolean array and you find that is_zero[3, 5] is true. In that case you next zero out row 3 and column 5. As you do that, this version checks the is_zero entry for the items that it is zeroing. While zeroing out column 5, suppose you find that entry [1, 5] is true. In that case, the program already zeroed out column 5 when it considered is_zero[1, 5] so it doesn't need to zero out that column again.

The second solution uses the following code to avoid re-zeroing a row or column if it has already done so. That saves some time if there are lots of zeros in the same row or column.

// Insert the algorithm here. private void ZeroRowsAndColumns(int[,] values) { // Make an array to hold values // indicating where there are zeros. bool[,] is_zero = new bool[ values.GetUpperBound(0) + 1, values.GetUpperBound(1) + 1]; // Set is_zero for values that are 0. for (int row = 0; row <= values.GetUpperBound(0); row++) { for (int col = 0; col <= values.GetUpperBound(1); col++) { is_zero[row, col] = (values[row, col] == 0); } } // Zero out the appropriate rows and columns. for (int row = 0; row <= values.GetUpperBound(0); row++) { for (int col = 0; col <= values.GetUpperBound(1); col++) { // See if this entry in the original array is zero. if (is_zero[row, col]) { // Zero out the column. // If this is the first row or this column has // not already been zeroed, zero it out. if ((row == 0) || (!is_zero[0, col])) { for (int r = 0; r <= values.GetUpperBound(0); r++) { values[r, col] = 0; } } else { Console.WriteLine("Skipping column " + col.ToString() + "."); } // Zero out the row. // If this is the first column or this row has // not already been zeroed, zero it out. if ((col == 0) || (!is_zero[row, 0])) { for (int c = 0; c <= values.GetUpperBound(1); c++) { values[row, c] = 0; } } else { Console.WriteLine("Skipping row " + row.ToString() + "."); } // Set is_zero to mark the row and column as zeroed. is_zero[row, 0] = true; is_zero[0, col] = true; } } } Console.WriteLine("Done"); }

This version uses the first row and column in is_zero to remember if the program has already zeroed out a row or column. After it zeros out row r and column c, it sets is_zero[r, 0] and is_zero[0, c] to true. Later, when the program must zero out column C, it checks is_zero[0, C]. If that entry is true, then this column has already been zeroed out so the program doesn't need to do it again. Similarly when the program must zero out row R, it checks is_zero[R, 0] and, if that value is true, the code doesn't need to zero out row R again.

The trickiest part here is that the code must always zero out a column if the program is considering row 0 and vice versa because at that point is_zero holds information about whether the original value is 0 and not information about whether the row/column has been zeroed.

This optimization saves some time if there are lots of 0s in the same row or column, but it still requires that you allocate and initialize the is_zero array. If the original array has M rows and N columns, then the new array has M * N entries. Unless the array is enormous, this extra space won't be a big deal, but you can do better.

The book Cracking the Coding Interview by Gayle Laakmann McDowell points out that you can use two one-dimensional arrays to indicate which rows and columns should be zeroed. You pass through the values array once to initialize the new arrays. Then you pass through the new arrays zeroing out the appropriate rows and columns.

The third solution available for download uses the following code to demonstrate this approach.

private void ZeroRowsAndColumns(int[,] values) { // Make arrays to hold values // indicating where there are zeros. bool[] row_is_zero = new bool[values.GetUpperBound(0) + 1]; bool[] col_is_zero = new bool[values.GetUpperBound(1) + 1]; // Set is_zero for values that are 0. for (int row = 0; row <= values.GetUpperBound(0); row++) { for (int col = 0; col <= values.GetUpperBound(1); col++) { if (values[row, col] == 0) { row_is_zero[row] = true; col_is_zero[col] = true; } } } // Zero out the appropriate rows. for (int row = 0; row <= values.GetUpperBound(0); row++) { if (row_is_zero[row]) { for (int c = 0; c <= values.GetUpperBound(1); c++) { values[row, c] = 0; } } } // Zero out the appropriate columns. for (int col = 0; col <= values.GetUpperBound(1); col++) { if (col_is_zero[col]) { for (int r = 0; r <= values.GetUpperBound(0); r++) { values[r, col] = 0; } } } }

The code creates two arrays row_is_zero and col_is_zero to indicate whether a particular row or column should be zeroed. (Note that this code again assumes that Boolean arrays are initialized to hold false in every entry.) The code then loops through the values array, setting the row_is_zero and col_is_zero values to true for rows and columns that should be zeroed. Finally the code uses the two arrays to decide which rows and columns should be zeroed and zeros them.

If the values array has M rows and N columns, then this solution requires only M + N additional space. Like the previous solution, it also avoids zeroing the same row or column twice.

After a bit of thought, I realized you could do even better by using the first row and column of the values array to store the information in the row_is_zero and col_is_zero arrays. The only catch is what happens if there is a 0 in the first row or column. In that case, you would set values[0, 0] to 0. That would indicate that both the first row and column should be zero and that may not be the case.

For example, consider the following values:

5 7 9 0 8 4 7 3 9 1 2 3 2 5 1

The entry values[0, 3] so the code should set values[0, 0] to 0 to indicate that the first row should be zeroed, but that would also indicate that the first column should be zeroed and that's incorrect.

To avoid losing information about the first row and column, the program must create and initialize two new Boolean variables. After that, it can use the first row and column to remember which columns and rows should be zeroed

The following code demonstrates this approach.

private void ZeroRowsAndColumns(int[,] values) { // See if the first row should be zero. bool row0_is_zero = false; for (int col = 0; col <= values.GetUpperBound(1); col++) { if (values[0, col] == 0) { row0_is_zero = true; break; } } // See if the first column should be zero. bool col0_is_zero = false; for (int row = 0; row <= values.GetUpperBound(0); row++) { if (values[row, 0] == 0) { col0_is_zero = true; break; } } // Copy zeros into the first row and column. for (int row = 1; row <= values.GetUpperBound(0); row++) { for (int col = 1; col <= values.GetUpperBound(1); col++) { if (values[row, col] == 0) { values[row, 0] = 0; values[0, col] = 0; } } } // Zero out the required rows. for (int row = 1; row <= values.GetUpperBound(0); row++) { if (values[row, 0] == 0) { // Zero out this row. for (int c = 1; c <= values.GetUpperBound(1); c++) { values[row, c] = 0; } } } // Zero out the required columns. for (int col = 1; col <= values.GetUpperBound(1); col++) { if (values[0, col] == 0) { // Zero out this column. for (int r = 1; r <= values.GetUpperBound(0); r++) { values[r, col] = 0; } } } // If the first row should be zero, zero it out. if (row0_is_zero) { for (int c = 0; c <= values.GetUpperBound(1); c++) { values[0, c] = 0; } } // If the first column should be zero, zero it out. if (col0_is_zero) { for (int r = 0; r <= values.GetUpperBound(0); r++) { values[r, 0] = 0; } } }

This version also uses only 2 extra Boolean variables instead of M + N or M * N extra pieces of memory so it uses the least extra space.

However, this code is longer than the previous solutions because it needs to handle the first row and column as special cases. The code is still fairly simple so it's not too hard to read.

Download the example to experiment with it and to see additional details.