Okay, so here I am sharing a code for fitting a polynomial to a given set of data-points using the Least Squares Approximation Method(Wikipedia).
Let’s say we have data-point pairs and we are trying to fit them using a polynomial of degree . If N=n+1 then the polynomial will pass exactly through each point and it will correspond to the interpolating polynomial that I wrote about earlier.
Let’s say the polynomial we are using is given as:
with errors given by
Here, we are using to represent the observed data-points corresponding to . We now minimize the following quantity
At the minimum all the partial derivatives with respect to the coefficients will vanish. This will give us the following equations:
.
.
.
Dividing each by -2 and rearranging gives the normal equations to be solved simultaneously:
where and are the data-points entered by the user and which are the required coefficients.
So we just need to build up the above system of equations and then solve it using Gaussian elimination to get the coefficients.
The following program illustrates the process.
CODE:
/****************************************************** *************Chi-square fitting************** Polynomial Fitting ******************************************************/ #include<stdio.h> #include<math.h> /******* Function that performs Gauss-Elimination and returns the Upper triangular matrix and solution of equations: There are two options to do this in C. 1. Pass the augmented matrix (a) as the parameter, and calculate and store the upperTriangular(Gauss-Eliminated Matrix) in it. 2. Use malloc and make the function of pointer type and return the pointer. This program uses the first option. ********/ void gaussEliminationLS(int m, int n, double a[m][n], double x[n-1]){ int i,j,k; for(i=0;i<m-1;i++){ //Partial Pivoting for(k=i+1;k<m;k++){ //If diagonal element(absolute vallue) is smaller than any of the terms below it if(fabs(a[i][i])<fabs(a[k][i])){ //Swap the rows for(j=0;j<n;j++){ double temp; temp=a[i][j]; a[i][j]=a[k][j]; a[k][j]=temp; } } } //Begin Gauss Elimination for(k=i+1;k<m;k++){ double term=a[k][i]/ a[i][i]; for(j=0;j<n;j++){ a[k][j]=a[k][j]-term*a[i][j]; } } } //Begin Back-substitution for(i=m-1;i>=0;i--){ x[i]=a[i][n-1]; for(j=i+1;j<n-1;j++){ x[i]=x[i]-a[i][j]*x[j]; } x[i]=x[i]/a[i][i]; } } /******* Function that prints the elements of a matrix row-wise Parameters: rows(m),columns(n),matrix[m][n] *******/ void printMatrix(int m, int n, double matrix[m][n]){ int i,j; for(i=0;i<m;i++){ for(j=0;j<n;j++){ printf("%lf\t",matrix[i][j]); } printf("\n"); } } main(){ //no. of data-points int N; //degree of polynomial int n; printf("Enter the no. of data-points:\n"); scanf("%d",&N); //arrays to store the c and y-axis data-points double x[N], y[N]; printf("Enter the x-axis values:\n"); int i,j; for(i=0;i<N;i++){ scanf("%lf",&x[i]); } printf("Enter the y-axis values:\n"); for(i=0;i<N;i++){ scanf("%lf",&y[i]); } printf("Enter the degree of polynomial to be used:\n"); scanf("%d",&n); // an array of size 2*n+1 for storing N, Sig xi, Sig xi^2, ...., etc. which are the independent components of the normal matrix double X[2*n+1]; for(i=0;i<=2*n;i++){ X[i]=0; for(j=0;j<N;j++){ X[i]=X[i]+pow(x[j],i); } } //the normal augmented matrix double B[n+1][n+2]; // rhs double Y[n+1]; for(i=0;i<=n;i++){ Y[i]=0; for(j=0;j<N;j++){ Y[i]=Y[i]+pow(x[j],i)*y[j]; } } for(i=0;i<=n;i++){ for(j=0;j<=n;j++){ B[i][j]=X[i+j]; } } for(i=0;i<=n;i++){ B[i][n+1]=Y[i]; } double A[n+1]; printf("The polynomial fit is given by the equation:\n"); printMatrix(n+1,n+2,B); gaussEliminationLS(n+1,n+2,B,A); for(i=0;i<=n;i++){ printf("%lfx^%d+",A[i],i); } }
OUTPUT:
So that’s it! That’s how you perform a polynomial fit to a given set of data.
Note: You can run the program yourself and verify the results here: https://www.onlinegdb.com/yd6DgVP98
I had written a C++ code for this a long time ago, and coincidentally it got very popular for some reason. But then I felt the need to make an Android app that does the same.
So I ported my code to JAVA so that it works in my Android App.
So if you want you can check out those posts too.
Hope you guys find it useful!
If you have any questions/doubts, hit me up in the comments section below.
You can refer to the following links for more info:
Linear Fitting – Lab Write-Up
Linear Fitting – C++ Program
Linear Fitting – Scilab Code
Curve Fit Tools – Android App (using the above code)
Curve Fit Tools – Documentation
Curve Fit Tools – Play Store
Curve Fit Tools – GitHub Repository
Curve Fitters – Scilab Toolbox
Ph.D. researcher at Friedrich-Schiller University Jena, Germany. I’m a physicist specializing in computational material science. I write efficient codes for simulating light-matter interactions at atomic scales. I like to develop Physics, DFT, and Machine Learning related apps and software from time to time. Can code in most of the popular languages. I like to share my knowledge in Physics and applications using this Blog and a YouTube channel.
Well done, this is a bit cleaner implementation. Also, this balances out to a database implementation that I have developed.
I’m glad you found it useful!
I already have a couple of .dat files, and do not want to type the X and Y values, because it would take too long. Basically, let a single program give the polynomial fit equations for different sets of data.
That’s extremely easy.
Just add a few lines.
If you’re looking for some code go through some codes on this blog, for ex: plotting exercises
Thanks for leading me here, this is quite handy! So I am still trying to incorporating weights to perform a weighted least squares fit. Considering the theoretical equation I would incorporate the weight in the following position as ‘w’:
Y[i]=Y[i]+ w * pow(x[j],i)*y[j];
Since their it should be reside after the derivations.
Thank you in advance for you input!
Greetings
Eric
Hi Eric,
I’m looking at doing the same thing, weighted LS fit. Did you ever try this? Do you have hints?
Thanks,
Randy
Wonderful program! It just worked without any problems.
And the explanation too is really easy to understand.
Greetings,
Avinash
Thanks!
Hello Manas,
can you share a surface fit polynomial c/c++ code or the logic to write one?
Thanks,
Rajeev
Hello Manas,
I am trying to write a linear surface fit algorithm f(x,y)=A+Bx+Cy+Dxy+Ex^2……..; having the order of 3, 4; 4, 4 and/or 4, 3; Can you suggest me how does the augmented matrix looks like? what is the best approach to get PseudoInverse? what changes to the above code, I should be doing to address a surface fit on z; Any pointers would be much appreciated.
I do not think you even compiled it 🙂
a) gaussEliminationLS(int m, int n, double a[m][n], double x[n-1])
<== cannot have variable array sizez
b) Y[i]=Y[i]+pow(x[j],i)*y[j];<== 'y' is not even declared
c) indecis i,j are not declared anywhere …
These do work on certain compilers. For example if you use Dev C++ it will run guaranteed.
No it doesnt….I compiled it with Dev C++ which I installed last month and it fails just like Hook says….plus you did not answer Hooks questions about how to fix this problem.
You’ve got to be blind if you cannot see the indices i and j being declared. Secondly, not only is y[N] declared, it is also explained what it is with a comment. Y[n+1] is the RHS of the ‘normal’ system of equations.
I didn’t explain because there was nothing to explain. I will make a video of me running it in DevC++ if you can give me a proper error that you get. You just say that it fails. Can’t you give the error you get? How can I help with you just the information you give.
It does not even compile
a) cannot declare the variable array size like function (m,n, a[m][n] ..)
b) indecis i,j are never declared
c) “y” is not declared ( should be capital Y
Hi Manas
Thanks for the code I have adapted this and have it running VS 2019 as a C++ program. I made the some changes to make it run properly they are:
//arrays to store the c and y-axis data-points
//double x[N], y[N];
double* x = new double[N];
double* y = new double[N];
this needs to be done for all single dimensioned arrays.
he following change is needed for 2 dimensional arrays:
//the normal augmented matrix
// double B[n + 1][n + 2];
double** B = new double* [n + 1];
for (int i = 0; i < n + 1; i++)
B[i] = new double[n + 2];
Then when passing the arrays as parameters to the functions the following changes are required.
void gaussEliminationLS(int m, int n, double **a, double *x) {
void printMatrix(int m, int n, double **matrix) {
Also the compiler doesn't like scanf this needs to be replaced with scanf_s
Then it will build with out errors.
Note many compilers do not handle dynamic arrays, but most will allow the changes I have made. It seem that some of the above commenters don't understand how to create dynamic arrays in C++.
Hi, Thanks a lot for your efforts. I am sure these would be helpful for others. What you say is correct about arrays. But I still think that a lot of compilers can run this code contrary to what the others have pointed out above.
For example you can even run it online:
https://onlinegdb.com/yd6DgVP98
Hi Manas
You are correct I have just tried this on my personal PC using the gcc compiler version 8.3.0 an d it compiles and runs exactly as you show. This shows that the open source is often ahead of proprietary compiler.
Regards
Ron
This is amazing! I’ve been looking for a way to use polynomial regression for my capstone project. Would it be okay with you if I use this code to analyze a dataset based on Google PlayStore apps? Thank you!
Hi, I’m from south korea.
I was impressed with your research and it helped me a lot.
In case of overfitting, what part of the code above can be modified?
Hi Manas, This code is extremely useful for me. Can you write a code on curve fitting using any equation?
Thanks a lot
Tamal Pal
Hi and greetings from Finland, ;>)
I’m old engineer with some programming background and I remember having
this solution using HP 9816 technical workstation running basic.
I used it for solving many given tables to formulas in my work. Now I have a
a new table to be solved, but the code has vanished.
Here I found this, which you have written in C++. I know some of C, but
this is new to me. Have you written it in VisualBasic, so that I
could port it to my Excell app+ ??
BR Matti
Unfortunately, I have no experience with Visual Basic. Fortunately, ChatGPT exists, so you can use that to port the code. Here is what I got:
Please note: It is prone to making mistakes, so you would need to spend some time debugging. I haven’t checked the code and I don’t even know how to run it.
‘******************************************************
‘*************Chi-square fitting**************
‘Polynomial Fitting
‘******************************************************
Module Module1
End Sub
Here is a version generated completely by ChatGPT without using my input
Public Class Form1
Private Sub btnFit_Click(sender As Object, e As EventArgs) Handles btnFit.Click
' Input data
Dim xVals() As Double = {1, 2, 3, 4, 5, 6}
Dim yVals() As Double = {1.2, 1.5, 1.8, 2.1, 2.4, 2.7}
End Class
Thanks for the straightforward code. I tested it against my routines for base lining spectra and it works well. I ran it against 200 spectral points with sharp peaks at O(6). I find round off error is problematic with the coefficients spanning around 10**12. Adding the “.15” designator on the printf output gives enough decimal places to do the job.
printf( “%+.15lf*x^%d “, A[i],i);
thanks again
Fritz
Thanks a lot for the helpful suggestion!
This was never a production-quality code so there might be such oversights that would need to be sorted by the users.
I had written these codes during my bachelor’s and master’s for assignments, but somehow they got a lot of views.
Maybe when I find time in the future I will try to improve them.