Okay, so here I am sharing a code for fitting a polynomial to a given set of data-points using the Least Squares Approximation Method(Wikipedia).

Let’s say we have data-point pairs and we are trying to fit them using a polynomial of degree . If N=n+1 then the polynomial will pass exactly through each point and it will correspond to the interpolating polynomial that I wrote about earlier.

Let’s say the polynomial we are using is given as:

with errors given by

Here, we are using to represent the observed data-points corresponding to . We now minimize the following quantity

At the minimum all the partial derivatives with respect to the coefficients will vanish. This will give us the following equations:

.

.

.

Dividing each by -2 and rearranging gives the normal equations to be solved simultaneously:

where and are the data-points entered by the user and which are the required coefficients.

So we just need to build up the above system of equations and then solve it using Gaussian elimination to get the coefficients.

The following program illustrates the process.

### CODE:

/****************************************************** *************Chi-square fitting************** Polynomial Fitting ******************************************************/ #include<stdio.h> #include<math.h> /******* Function that performs Gauss-Elimination and returns the Upper triangular matrix and solution of equations: There are two options to do this in C. 1. Pass the augmented matrix (a) as the parameter, and calculate and store the upperTriangular(Gauss-Eliminated Matrix) in it. 2. Use malloc and make the function of pointer type and return the pointer. This program uses the first option. ********/ void gaussEliminationLS(int m, int n, double a[m][n], double x[n-1]){ int i,j,k; for(i=0;i<m-1;i++){ //Partial Pivoting for(k=i+1;k<m;k++){ //If diagonal element(absolute vallue) is smaller than any of the terms below it if(fabs(a[i][i])<fabs(a[k][i])){ //Swap the rows for(j=0;j<n;j++){ double temp; temp=a[i][j]; a[i][j]=a[k][j]; a[k][j]=temp; } } } //Begin Gauss Elimination for(k=i+1;k<m;k++){ double term=a[k][i]/ a[i][i]; for(j=0;j<n;j++){ a[k][j]=a[k][j]-term*a[i][j]; } } } //Begin Back-substitution for(i=m-1;i>=0;i--){ x[i]=a[i][n-1]; for(j=i+1;j<n-1;j++){ x[i]=x[i]-a[i][j]*x[j]; } x[i]=x[i]/a[i][i]; } } /******* Function that prints the elements of a matrix row-wise Parameters: rows(m),columns(n),matrix[m][n] *******/ void printMatrix(int m, int n, double matrix[m][n]){ int i,j; for(i=0;i<m;i++){ for(j=0;j<n;j++){ printf("%lf\t",matrix[i][j]); } printf("\n"); } } main(){ //no. of data-points int N; //degree of polynomial int n; printf("Enter the no. of data-points:\n"); scanf("%d",&N); //arrays to store the c and y-axis data-points double x[N], y[N]; printf("Enter the x-axis values:\n"); int i,j; for(i=0;i<N;i++){ scanf("%lf",&x[i]); } printf("Enter the y-axis values:\n"); for(i=0;i<N;i++){ scanf("%lf",&y[i]); } printf("Enter the degree of polynomial to be used:\n"); scanf("%d",&n); // an array of size 2*n+1 for storing N, Sig xi, Sig xi^2, ...., etc. which are the independent components of the normal matrix double X[2*n+1]; for(i=0;i<=2*n;i++){ X[i]=0; for(j=0;j<N;j++){ X[i]=X[i]+pow(x[j],i); } } //the normal augmented matrix double B[n+1][n+2]; // rhs double Y[n+1]; for(i=0;i<=n;i++){ Y[i]=0; for(j=0;j<N;j++){ Y[i]=Y[i]+pow(x[j],i)*y[j]; } } for(i=0;i<=n;i++){ for(j=0;j<=n;j++){ B[i][j]=X[i+j]; } } for(i=0;i<=n;i++){ B[i][n+1]=Y[i]; } double A[n+1]; printf("The polynomial fit is given by the equation:\n"); printMatrix(n+1,n+2,B); gaussEliminationLS(n+1,n+2,B,A); for(i=0;i<=n;i++){ printf("%lfx^%d+",A[i],i); } }

### OUTPUT:

So that’s it! That’s how you perform a polynomial fit to a given set of data.

I had written a C++ code for this a long time ago, and coincidentally it got very popular for some reason. But then I felt the need to make an Android app that does the same.

So I ported my code to JAVA so that it works in my Android App.

So if you want you can check out those posts too.

Hope you guys find it useful!

If you have any questions/doubts, hit me up in the comments section below.

You can refer to the following links for more info:

Linear Fitting – Lab Write-Up

Linear Fitting – C++ Program

Linear Fitting – Scilab Code

Curve Fit Tools – Android App (using the above code)

Curve Fit Tools – Documentation

Curve Fit Tools – Play Store

Curve Fit Tools – GitHub Repository

Curve Fitters – Scilab Toolbox

Well done, this is a bit cleaner implementation. Also, this balances out to a database implementation that I have developed.

I’m glad you found it useful!

I already have a couple of .dat files, and do not want to type the X and Y values, because it would take too long. Basically, let a single program give the polynomial fit equations for different sets of data.

That’s extremely easy.

Just add a few lines.

If you’re looking for some code go through some codes on this blog, for ex: plotting exercises

Thanks for leading me here, this is quite handy! So I am still trying to incorporating weights to perform a weighted least squares fit. Considering the theoretical equation I would incorporate the weight in the following position as ‘w’:

Y[i]=Y[i]+ w * pow(x[j],i)*y[j];

Since their it should be reside after the derivations.

Thank you in advance for you input!

Greetings

Eric

Hi Eric,

I’m looking at doing the same thing, weighted LS fit. Did you ever try this? Do you have hints?

Thanks,

Randy

Wonderful program! It just worked without any problems.

And the explanation too is really easy to understand.

Greetings,

Avinash

Thanks!

Hello Manas,

can you share a surface fit polynomial c/c++ code or the logic to write one?

Thanks,

Rajeev

Hello Manas,

I am trying to write a linear surface fit algorithm f(x,y)=A+Bx+Cy+Dxy+Ex^2……..; having the order of 3, 4; 4, 4 and/or 4, 3; Can you suggest me how does the augmented matrix looks like? what is the best approach to get PseudoInverse? what changes to the above code, I should be doing to address a surface fit on z; Any pointers would be much appreciated.

I do not think you even compiled it 🙂

a) gaussEliminationLS(int m, int n, double a[m][n], double x[n-1])

<== cannot have variable array sizez

b) Y[i]=Y[i]+pow(x[j],i)*y[j];<== 'y' is not even declared

c) indecis i,j are not declared anywhere …

These do work on certain compilers. For example if you use Dev C++ it will run guaranteed.

No it doesnt….I compiled it with Dev C++ which I installed last month and it fails just like Hook says….plus you did not answer Hooks questions about how to fix this problem.

You’ve got to be blind if you cannot see the indices i and j being declared. Secondly, not only is y[N] declared, it is also explained what it is with a comment. Y[n+1] is the RHS of the ‘normal’ system of equations.

I didn’t explain because there was nothing to explain. I will make a video of me running it in DevC++ if you can give me a proper error that you get. You just say that it fails. Can’t you give the error you get? How can I help with you just the information you give.

It does not even compile

a) cannot declare the variable array size like function (m,n, a[m][n] ..)

b) indecis i,j are never declared

c) “y” is not declared ( should be capital Y

Hi Manas

Thanks for the code I have adapted this and have it running VS 2019 as a C++ program. I made the some changes to make it run properly they are:

//arrays to store the c and y-axis data-points

//double x[N], y[N];

double* x = new double[N];

double* y = new double[N];

this needs to be done for all single dimensioned arrays.

he following change is needed for 2 dimensional arrays:

//the normal augmented matrix

// double B[n + 1][n + 2];

double** B = new double* [n + 1];

for (int i = 0; i < n + 1; i++)

B[i] = new double[n + 2];

Then when passing the arrays as parameters to the functions the following changes are required.

void gaussEliminationLS(int m, int n, double **a, double *x) {

void printMatrix(int m, int n, double **matrix) {

Also the compiler doesn't like scanf this needs to be replaced with scanf_s

Then it will build with out errors.

Note many compilers do not handle dynamic arrays, but most will allow the changes I have made. It seem that some of the above commenters don't understand how to create dynamic arrays in C++.

Hi, Thanks a lot for your efforts. I am sure these would be helpful for others. What you say is correct about arrays. But I still think that a lot of compilers can run this code contrary to what the others have pointed out above.

For example you can even run it online:

https://onlinegdb.com/yd6DgVP98

Hi Manas

You are correct I have just tried this on my personal PC using the gcc compiler version 8.3.0 an d it compiles and runs exactly as you show. This shows that the open source is often ahead of proprietary compiler.

Regards

Ron

This is amazing! I’ve been looking for a way to use polynomial regression for my capstone project. Would it be okay with you if I use this code to analyze a dataset based on Google PlayStore apps? Thank you!

Hi, I’m from south korea.

I was impressed with your research and it helped me a lot.

In case of overfitting, what part of the code above can be modified?