# Find a polynomial least squares fit for a set of points in C# This example shows how to make a polynomial least squares fit to a set of data points. It’s kind of confusing, but you can get through it if you take it one step at a time.

The example Find a linear least squares fit for a set of points in C# explains how to find a line that best fits a set of data points. If you haven’t read that example yet, you should do so now because this example follows the same basic strategy.

With a degree d polynomial least squares fit, you need to find the coefficients A0, A1, … Ad to make the following equation fit the data points as closely as possible:

`    A0 * x0 + A1 * x1 + A2 * x2 + ... +  + Ad * xd`

The goal is to minimize the sum of the squares of the vertical distances between the curve and the points.

Keep in mind that you know all of the points so, for given coefficients, you can easily loop through all of the points and calculate the error.

If you store the coefficients in a List<double>, then the following function calculates the value of the function F(x) at the point x.

```// The function.
public static double F(List<double> coeffs, double x)
{
double total = 0;
double x_factor = 1;
for (int i = 0; i < coeffs.Count; i++)
{
total += x_factor * coeffs[i];
x_factor *= x;
}
}```

The following function uses the function F to calculate the total error squared between the data points and the polynomial curve.

```// Return the error squared.
public static double ErrorSquared(List<PointF> points,
List<double> coeffs)
{
double total = 0;
foreach (PointF pt in points)
{
double dy = pt.Y - F(coeffs, pt.X);
total += dy * dy;
}
}```

The following equation shows the error function: To simplify this, let Ei be the error in the ith term so: The steps for finding the best solution are the same as those for finding the best linear least squares solution:

• Take the partial derivatives of the error function with respect to the variables, in this case A0, A1, … Ad.
• Set the partial derivatives equal to 0 to get N + 1 equations and N + 1 unknowns A0, A1, … Ad.
• Solve the equations for A0, A1, … Ad.

As was the case in the previous example, this may sound like an intimidating problem. Fortunately the partial derivatives of the error function are simpler than you might think because that function only involves simple powers of the As. For example, the partial derivative with respect to Ak is: The partial derivative of Ei with respect to Ak contains lots of terms involving powers of xi and different As, but with respect to Ak all of those are constants except the single term Ak * xik. All of the other terms drop out leaving: If you substitute the value of Ei, multiply the -xik term through, and add the As separately you get: As usual, this looks pretty messy, but if you look closely you’ll see that most of the terms are values that you can calculate using the xi and yi values. For example, the first term is simply the sum of the products of the yi values and the xi values raised to the kth power. The next term is A0 times the sum of the xi values raised to the kth power. Because the yi and xi values are all known, this equation is the same as the following for a particular set of constants S: This is a relatively simple equation with d + 1 unknowns A0, A1, …, Ad.

When you take the partial derivatives for the other values of k as k varies from 0 to d, you get d + 1 equations with d + 1 unknowns, and you can solve for the unknowns.

Linear least squares is a specific case where d = 1 and it’s easy to solve the equations. For the more general case, you need to use a more general method such as Gaussian elimination. For an explanation of Gaussian elimination, see Solve a system of equations with Gaussian elimination in C#.

The following code shows how the example program finds polynomial least squares coefficients.

```// Find the least squares linear fit.
public static List<double> FindPolynomialLeastSquaresFit(
List<PointF> points, int degree)
{
// Allocate space for (degree + 1) equations with
// (degree + 2) terms each (including the constant term).
double[,] coeffs = new double[degree + 1, degree + 2];

// Calculate the coefficients for the equations.
for (int j = 0; j <= degree; j++)
{
// Calculate the coefficients for the jth equation.

// Calculate the constant term for this equation.
coeffs[j, degree + 1] = 0;
foreach (PointF pt in points)
{
coeffs[j, degree + 1] -= Math.Pow(pt.X, j) * pt.Y;
}

// Calculate the other coefficients.
for (int a_sub = 0; a_sub <= degree; a_sub++)
{
// Calculate the dth coefficient.
coeffs[j, a_sub] = 0;
foreach (PointF pt in points)
{
coeffs[j, a_sub] -= Math.Pow(pt.X, a_sub + j);
}
}
}

// Solve the equations.

// Return the result converted into a List<double>.
}``` The code simply builds an array holding the coefficients (the Ss in the previous equation) and then calls the GaussianElimination method to find the As. It converts the result from an array into a List<double> for convenience and returns it.

Give the program a try. It’s pretty cool!

Tip: Use the smallest degree that makes sense for your problem. If you use a very high degree, the curve will fit the points very closely but it will probably emphasize structure that isn’t really there. For example, the previous picture on the right fits a degree 4 polynomial to points that really should be fit with a degree 2 polynomial.

For a slightly different explanation, see my DevX article Predicting Your Firm’s Future with Least Squares, Part II (free registration required).    This entry was posted in algorithms, curve fitting, graphics, mathematics and tagged , , , , , , , , , , . Bookmark the permalink.

### 7 Responses to Find a polynomial least squares fit for a set of points in C#

1. Any Gupta says:

GaussianElimination function will work for c++????

• RodStephens says:

C# is not exactly the same as C++ so you may have to do a little translation, but they are very similar so you shouldn’t have too much trouble.

2. ecko says:

Nice tool, works fine for me! Thank you a lot.

3. Louis Mezei says:

Hi, I have been trying to get the code to work correctly for nearly a week.

I have attached the code below:

I believe that the issue is with the LoadArray array function. Your web site wasn’t very clear on what the textboxes actually contained in them. I was hoping to feed a list of classes (each class contains an X and Y datapair) into the function and use that.

I was hoping you could take a look and let me know what I am missing.

Here is the code.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows.Forms;

namespace MeziLuMathExtensions
{
public class clsPolynomialRegression : CommonMathCode
{
#region Data Structures …
public List Coefficients
{
internal set;
get;
}
#endregion
#region Calculators …
///
/// Find the least squares linear fit for an NthDegree polynomial.
///
///
///
///
public Tuple<List, bool> Calculate(List DataPairs, int NthDegree)
{ //http://csharphelper.com/blog/2014/10/find-a-polynomial-least-squares-fit-for-a-set-of-points-in-c/
try
{
#region Allocate space for (degree + 1) equations with (degree + 2) terms each (including the constant term) …
double[,] coeffs = new double[NthDegree + 1, NthDegree + 2];
#endregion
#region Calculate the coefficients for the equations …
for (int j = 0; j <= NthDegree; j++)
{
#region Calculate the coefficients for the jth equation …
#region Calculate the constant term for this equation …
coeffs[j, NthDegree + 1] = 0;
foreach(DataPair Pair in DataPairs)
{
coeffs[j, NthDegree + 1] -= Math.Pow(Pair.X, j) * Pair.Y;
}
#endregion
#region Calculate the other coefficients …
for (int a_sub = 0; a_sub <= NthDegree; a_sub++)
{
#region Calculate the dth coefficient.
coeffs[j, a_sub] = 0;
foreach(DataPair Pair in DataPairs)
{
coeffs[j, a_sub] -= Math.Pow(Pair.X, a_sub + j);
}
#endregion
}
#endregion
#endregion
}
#endregion
#region Solve the equations and return …
return GaussianElimination(DataPairs, coeffs);
#endregion
}
catch(Exception ex)
{
errHandler(ex);
return new Tuple<List, bool> (new List(), false);
}
}
#endregion
#region Helper Functions …
///
/// Loads the augmented array for Gaussian Elimination.
/// Tuple Definitions: Item1 = The Array; Item2 = num_rows; Item3 = num_cols; Item4 = Success.
/// Column num_cols holds the result values.
/// Column num_cols + 1 will hold the variables’ final values after backsolving.
///
///
///
///
private Tuple LoadArray(List DataPairs, double[,] coeffs)
{
try
{
int num_rows = coeffs.GetUpperBound(0) + 1;
int num_cols = coeffs.GetUpperBound(1) + 1;
#region Obsolete Code (Original Rod Stevens code.) …
//TextBox txtValues = new TextBox();
//TextBox txtCoefficients = new TextBox();
// Build the augmented matrix.
//string[] value_rows = txtValues.Text.Split(
// new string[] {“\r\n”},
// StringSplitOptions.RemoveEmptyEntries);
//string[] coef_rows = txtCoefficients.Text.Split(
// new string[] { “\r\n” },
// StringSplitOptions.RemoveEmptyEntries);
//string[] one_row = coef_rows.Split(
// new string[] { ” ” },
// StringSplitOptions.RemoveEmptyEntries);
//num_rows = coef_rows.GetUpperBound(0) + 1;
//num_cols = one_row.GetUpperBound(0) + 1;
//num_rows = coeffs.GetLength(0);
//num_cols = coeffs.GetLength(1);
#endregion
double[,] arr = new double[num_rows, num_cols + 2];
for (int r = 0; r < num_rows; r++)
{ //Original Rod Stevens Code:
//one_row = coef_rows[r].Split(
// new char[] {' '},
// StringSplitOptions.RemoveEmptyEntries);
for (int c = 0; c < num_cols; c++)
{
arr[r, c] = coeffs[r, c];//double.Parse(one_row[c]);
}
arr[r, num_cols] = DataPairs[r].X;// double.Parse(value_rows[r]);
}
return new Tuple (arr, num_rows, num_cols, true);
}
catch(Exception ex)
{
errHandler(ex);
return new Tuple (null, 0, 0, false);
}
}
///
/// Solve the system of equations using Gaussian Elimination.
///
///
///
///
private Tuple<List, bool> GaussianElimination(List DataPairs, double[,] coeffs)
{ //http://csharphelper.com/blog/2014/10/solve-a-system-of-equations-with-gaussian-elimination-in-c/
try
{
#region Setup …
Coefficients = new List();
const double tiny = 0.00000000001;
#endregion
#region Build the augmented matrix …
// The values num_rows and num_cols are the number of rows
// and columns in the matrix, not the augmented matrix.
//int num_rows, num_cols;
// double[,] arr = LoadArray(DataPairs, coeffs, out num_rows, out num_cols);
if(!Results.Item4)
{
return new Tuple<List, bool>(new List(), false);
}
double[,] arr = Results.Item1;
int num_rows = Results.Item2;
int num_cols = Results.Item3;
#endregion
// This is probably not used any more:
double[,] orig_arr = (double[,]) arr.Clone();
#region Start solving …
for (int r = 0; r < num_rows – 1; r++)
{
#region
// Zero out all entries in column r after this row.
// See if this row has a non-zero entry in column r.
if (Math.Abs(arr[r, r]) < tiny)
{
#region Too close to zero. Try to swap with a later row …
for (int r2 = r + 1; r2 tiny)
{
#region This row will work. Swap them …
for (int c = 0; c tiny)
{
#region Zero out this column in later rows …
for (int r2 = r + 1; r2 < num_rows; r2++)
{
#region
double factor = -arr[r2, r] / arr[r, r];
for (int c = r; c <= num_cols; c++)
{
arr[r2, c] = arr[r2, c] + factor * arr[r, c];
}
#endregion
}
#endregion
}
#endregion
#endregion
}
#endregion
#region See if there is a solution and process …
if (arr[num_rows – 1, num_cols – 1] == 0)
{
#region There is no solution …
// See if all of the entries in this row are 0.
bool all_zeros = true;
for (int c = 0; c = 0; r–)
{
#region
double tmp = arr[r, num_cols];
for (int r2 = r + 1; r2 < num_rows; r2++)
{
tmp -= arr[r, r2] * arr[r2, num_cols + 1];
}
arr[r, num_cols + 1] = tmp / arr[r, r];
#endregion
}
#region Transfer the Coefficients to the List …
for (int r = 0; r < num_rows; r++)
{
//Original Rod Stevens Code:
//txt += "\r\nx" + r.ToString() + " = " +
// arr[r, num_cols + 1].ToString();
}
#endregion
#region Verify (Disabled) …
//string txt = "Check:" + Environment.NewLine;
//for(int r = 0; r < num_rows; r++)
//{
// double tmp = 0;
// for(int c = 0; c < num_cols; c++)
// {
// tmp += orig_arr[r, c] * arr[c, num_cols + 1];
// }
// txt += "\r\n" + tmp.ToString();
//}
//txt = txt.Substring("\r\n".Length + 1);
//MessageBox.Show(txt);
#endregion
#endregion
}
#endregion
return new Tuple<List, bool> (Coefficients, true);
}
catch(Exception ex)
{
errHandler(ex);
return new Tuple<List, bool> (new List(), false);
}
}
#endregion
#region Error Analysis …
public double ErrorSquared(List points)
{
double total = 0;
foreach (DataPair pt in points)
{
double dy = pt.Y – PredictY(pt.X);
total += dy * dy;
}
}
#endregion
#region Predictors …
public double PredictY(double XValue)
{
try
{
double RetVal = 0.0;
int CurrentPower = 0;
foreach(double dbl in Coefficients)
{
RetVal += dbl * Math.Pow(XValue, CurrentPower);
CurrentPower += 1;
}
return RetVal;
}
catch (Exception ex)
{
errHandler(ex);
return InvalidReturn;
}
}
#endregion
} // End of Class.
} // End of Namespace.

• RodStephens says:

The text boxes are for output. You click on the graph to select points. When you click Fit, the program finds the least squares fit and displays the coefficients in the As text box.

You can download the example program to see the details that are not covered in the post.

You might also want to look at the earlier post Find a linear least squares fit for a set of points in C#. It uses the same approach but with a line so it’s a bit easier to understand.

If that doesn’t help, let me know and I’ll take a look at your code.

4. Marcos says:

I am trying to run this software in Visual Studio 2019 and I am getting several error messages:
Error CS0518 Predefined type ‘System.Object’ is not defined or imported
Error CS0246 The type or namespace name ‘STAThreadAttribute’ could not be found (are you missing a using directive or an assembly reference?)
Error CS0518 Predefined type ‘System.Void’ is not defined or imported
Error The reference assemblies for .NETFramework,Version=v3.5 were not found. To resolve this, install the Developer Pack (SDK/Targeting Pack) for this framework version or retarget your application. You can download .NET Framework Developer Packs at https://aka.ms/msbuild/developerpacks
Error .NET Framework v3.5 Service Pack 1 was not found. In order to target “.NETFramework,Version=v3.5”, .NET Framework v3.5 Service Pack 1 or later must be installed
Error CS0103 The name ‘Application’ does not exist in the current context

I have several other similar errors (15 in total). I have the .NET developer framework v3.5 installed.

Any ideas on how to resolve this issue?

• RodStephens says:

Did you download the example program rather than just copying and pasting code form the post into your program? Often there are many details that don’t fit in the text of the post so you need to download the post to see them all.

With this post, however, I don’t know why you would see those errors. For example, there’s no excuse for System.Object to be undefined.

If you didn’t download the example, try that. VS 2019 should be able to upgrade it with no problems. If that doesn’t work, email me and I’ll take a closer look.