C++ Program for Polynomial Fit (Least Squares)

//Polynomial Fit
#include<iostream>
#include<iomanip>
#include<cmath>
using namespace std;
int main()
{
    int i,j,k,n,N;
    cout.precision(4);                        //set precision
    cout.setf(ios::fixed);
    cout<<"\nEnter the no. of data pairs to be entered:\n";        //To find the size of arrays that will store x,y, and z values
    cin>>N;
    double x[N],y[N];
    cout<<"\nEnter the x-axis values:\n";                //Input x-values
    for (i=0;i<N;i++)
        cin>>x[i];
    cout<<"\nEnter the y-axis values:\n";                //Input y-values
    for (i=0;i<N;i++)
        cin>>y[i];
    cout<<"\nWhat degree of Polynomial do you want to use for the fit?\n";
    cin>>n;                                // n is the degree of Polynomial 
    double X[2*n+1];                        //Array that will store the values of sigma(xi),sigma(xi^2),sigma(xi^3)....sigma(xi^2n)
    for (i=0;i<2*n+1;i++)
    {
        X[i]=0;
        for (j=0;j<N;j++)
            X[i]=X[i]+pow(x[j],i);        //consecutive positions of the array will store N,sigma(xi),sigma(xi^2),sigma(xi^3)....sigma(xi^2n)
    }
    double B[n+1][n+2],a[n+1];            //B is the Normal matrix(augmented) that will store the equations, 'a' is for value of the final coefficients
    for (i=0;i<=n;i++)
        for (j=0;j<=n;j++)
            B[i][j]=X[i+j];            //Build the Normal matrix by storing the corresponding coefficients at the right positions except the last column of the matrix
    double Y[n+1];                    //Array to store the values of sigma(yi),sigma(xi*yi),sigma(xi^2*yi)...sigma(xi^n*yi)
    for (i=0;i<n+1;i++)
    {    
        Y[i]=0;
        for (j=0;j<N;j++)
        Y[i]=Y[i]+pow(x[j],i)*y[j];        //consecutive positions will store sigma(yi),sigma(xi*yi),sigma(xi^2*yi)...sigma(xi^n*yi)
    }
    for (i=0;i<=n;i++)
        B[i][n+1]=Y[i];                //load the values of Y as the last column of B(Normal Matrix but augmented)
    n=n+1;                //n is made n+1 because the Gaussian Elimination part below was for n equations, but here n is the degree of polynomial and for n degree we get n+1 equations
    cout<<"\nThe Normal(Augmented Matrix) is as follows:\n";    
    for (i=0;i<n;i++)            //print the Normal-augmented matrix
    {
        for (j=0;j<=n;j++)
            cout<<B[i][j]<<setw(16);
        cout<<"\n";
    }    
    for (i=0;i<n;i++)                    //From now Gaussian Elimination starts(can be ignored) to solve the set of linear equations (Pivotisation)
        for (k=i+1;k<n;k++)
            if (B[i][i]<B[k][i])
                for (j=0;j<=n;j++)
                {
                    double temp=B[i][j];
                    B[i][j]=B[k][j];
                    B[k][j]=temp;
                }
    
    for (i=0;i<n-1;i++)            //loop to perform the gauss elimination
        for (k=i+1;k<n;k++)
            {
                double t=B[k][i]/B[i][i];
                for (j=0;j<=n;j++)
                    B[k][j]=B[k][j]-t*B[i][j];    //make the elements below the pivot elements equal to zero or elimnate the variables
            }
    for (i=n-1;i>=0;i--)                //back-substitution
    {                        //x is an array whose values correspond to the values of x,y,z..
        a[i]=B[i][n];                //make the variable to be calculated equal to the rhs of the last equation
        for (j=0;j<n;j++)
            if (j!=i)            //then subtract all the lhs values except the coefficient of the variable whose value                                   is being calculated
                a[i]=a[i]-B[i][j]*a[j];
        a[i]=a[i]/B[i][i];            //now finally divide the rhs by the coefficient of the variable to be calculated
    }
    cout<<"\nThe values of the coefficients are as follows:\n";
    for (i=0;i<n;i++)
        cout<<"x^"<<i<<"="<<a[i]<<endl;            // Print the values of x^0,x^1,x^2,x^3,....    
    cout<<"\nHence the fitted Polynomial is given by:\ny=";
    for (i=0;i<n;i++)
        cout<<" + ("<<a[i]<<")"<<"x^"<<i;
    cout<<"\n";
    return 0;
}//output attached as .jpg

Verification using Excel
Verification using Excel
Sample Output
Sample Output

Explanation of the code:

I'm a physicist specializing in theoretical, computational and experimental condensed matter physics. I like to develop Physics related apps and softwares from time to time. Can code in most of the popular languages. Like to share my knowledge in Physics and applications using this Blog and a YouTube channel.



38 thoughts on “C++ Program for Polynomial Fit (Least Squares)

  1. Nice. Thanks for the write up. I’m going to steal this for my own math library. 🙂
    Do you know how to fit a 3D polynomial? I was reading the wiki page about guassian elimination and they claim that there is no straightforward way of generalizing to higher dimensions

    1. 3d polynomial??? you mean like polynomial of order 3?? Ihave no idea what a 3d polynomial is sorry. however if you explain it to me then i can try to code it.

      1. Sorry, ‘3D polynomial’ probably isn’t the right term. I mean a polynomial in a three dimensional domain, so instead of fitting f(x) = y I’d like to fit f(x, y) = z, where f(x, y) would probably be something along the lines of c0 + c1 * x + c2 * y + c3 * x^2 + c4 * y^2 + c5 * x * y. I need it for approximating images using a couple of polynomials.

        1. Oh..okay.. I understand what you mean. I haven’t ever worked on these before… but i’ll look into some literature on it.. And get back to you if I find something interesting.

          1. Brilliant! Thank you.

            Meanwhile I’ll take a look at spherical guassians and see if they can solve my problem. 🙂

            I also think I’ve found a couple of bugs that you might want to get rid of.

            * Your A array is one element larger than it has to be.
            * When you perform back substitution you expect the uninitalized members of A to be 0. I’m not entirely sure that you can make this assumption without explicitly initializing all entries yourself. It’s going to be true 99% of the time, but damn it’ll be a tough bug to catch when it breaks. You can avoid this assumption and the check for j != i if you simply initialize j to i+1 in the loop. Overall that’ll improve performance ever so slightly as well.
            * I don’t know if you care or not, but the code is not cross platform, as C++ does not specify allocating non-constant sized arrays. C does, which is probably why it works. Instead you need to use new or a std::vector. Basically this means that the code won’t compile in VS.

            Cheers

          2. I really appreciate your feedback.
            I love your suggestions.
            Actually I am just a Physics Major and just a novice programmer and we had to create several programs for various Numerical Methods. And Gaussian Elimination was really the one that had me confused for days. So i was just happy that I had managed to create something that worked for me.

            I would really love it if you could share your improvised code here. As it will be of great help to others.
            And you’re right about it not being cross-platform. I noticed that when I tried to run it on Visual Studio. Again it’s not my specialization.
            There is one more problem with it that I would like to work on when I get the time. Which would never show up in a polynomial fitting usage so it’s alright.

  2. Manas ,

    I have made a variation of you C program. If you furnish a email address, then I will transmit a copy for your use.

    1. Thanks for your comment John. I would love a copy of your code. To avoid sharing my email here I would rather email you as I have your email from the comment.

  3. Hi Manas,

    Thanks for the clear explanation on Youtube.
    I would like to use your code base but modified a bit to fit my project.

    However, the code download link seems no longer exist in this post.
    If possible, could you email me the source code for the “polynomial_fit.cc”

    Thanks for the time.

    1. Hi Hao,

      Thanks for your comment.
      I noticed that there was an issue with the code not being displayed properly in the above post.
      I have fixed it now.

      You can just copy the above code and paste it in a text editor.

      Have a good day!

  4. Hello fro Switzerland!

    Many many thanks for your work.
    I have experienced some strange behaviour though.
    Sometimes it calculates normal values for a 5th grade polynomial:
    x^0=0.285714
    x^1=26.9197
    x^2=-8.70833
    x^3=1.67803
    x^4=-0.185606
    x^5=0.00833333
    then another time it calculates very weird values (both for the identical x/y values):
    x^0=4.89839e+199
    x^1=-9.44052e+199
    x^2=6.17726e+199
    x^3=-1.79157e+199
    x^4=2.36964e+198
    x^5=-1.16597e+197

    x: 1,2,3,4,5,6,7
    y: 30,29,27,24,20,16,9

    1. Hi Thomas!
      I just ran the code with your given set of data, and the program returns the same result everytime.
      x^0=34.1429
      x^1=-8.2258
      x^2=5.8750
      x^3=-2.0758
      x^4=0.3068
      x^5=-0.0167

      I ran the code a few times and always got this result which is expected from every computer program.
      A program cannot return different results for different times, unless it uses some dynamic data.

      Also, I would recommend that you check out this app:https://play.google.com/store/apps/details?id=com.bragitoff.curvefit_leastsquares

      I made this app using the exact above code. You can use it to save your data and run it many times. Or even cross check your answers easily.

      Thanks!
      Manas

    2. I found that this code sometimes messes data up, ending in a[] containing +-NANs. Seems like the problem is that a[] is being used without proper prior initialization here:

      for (i=n-1;i>=0;i–) //back-substitution
      { //x is an array whose values correspond to the values of x,y,z..
      a[i]=B[i][n]; //make the variable to be calculated equal to the rhs of the last equation
      for (j=0;j=0;i–)
      a[i]=B[i][n];
      should be done first and then that back-subsititution should be performed. What’s the correct way?

    1. Hi Thomas!
      Please specify the data-set for which the code crashes.
      The code is known to work fairly well for a lot of test data. I’ve never seen a data-set that didn’t work.

      Thanks!
      Manas

  5. Hello Manas

    Thanks a lot for your quick response. I packed your code into a function:

    QVector MainWindow::werte_ermitteln(int number_of_coordinates, QVector x, QVector y, int degree){
        int i,j,k,N,n;
        N = number_of_coordinates;
        n = degree;
        cout &lt;&lt; N &lt;&lt; &quot; / &quot; &lt;&lt; n &lt;&lt; &quot; / &quot; &lt;&lt; x[0] &lt;&lt; &quot; / &quot; &lt;&lt; y[0] &lt;&lt; &quot;\n&quot;;
        double X[2*n+1];                        //Array that will store the values of sigma(xi),sigma(xi^2),sigma(xi^3)....sigma(xi^2n)
        double B[n+1][n+2],a[n+1];            //B is the Normal matrix(augmented) that will store the equations, &#039;a&#039; is for value of the final coefficients
        for (i=0;i&lt;2*n+1;i++)
        {
            X[i]=0;
            for (j=0;j&lt;N;j++)
                X[i]=X[i]+pow(x[j],i);        //consecutive positions of the array will store N,sigma(xi),sigma(xi^2),sigma(xi^3)....sigma(xi^2n)
        }
        for (i=0;i&lt;=n;i++)
            for (j=0;j&lt;=n;j++)
                B[i][j]=X[i+j];            //Build the Normal matrix by storing the corresponding coefficients at the right positions except the last column of the matrix
        double Y[n+1];                    //Array to store the values of sigma(yi),sigma(xi*yi),sigma(xi^2*yi)...sigma(xi^n*yi)
        for (i=0;i&lt;n+1;i++)
        {
            Y[i]=0;
            for (j=0;j&lt;N;j++)
            Y[i]=Y[i]+pow(x[j],i)*y[j];        //consecutive positions will store sigma(yi),sigma(xi*yi),sigma(xi^2*yi)...sigma(xi^n*yi)
        }
        for (i=0;i&lt;=n;i++)
            B[i][n+1]=Y[i];                //load the values of Y as the last column of B(Normal Matrix but augmented)
        n=n+1;                //n is made n+1 because the Gaussian Elimination part below was for n equations, but here n is the degree of polynomial and for n degree we get n+1 equations
        cout&lt;&lt;&quot;\nThe Normal(Augmented Matrix) is as follows:\n&quot;;
        for (i=0;i&lt;n;i++)            //print the Normal-augmented matrix
        {
            for (j=0;j&lt;=n;j++)
                cout&lt;&lt;B[i][j]&lt;&lt;setw(16);
            cout&lt;&lt;&quot;\n&quot;;
        }
        for (i=0;i&lt;n;i++)                    //From now Gaussian Elimination starts(can be ignored) to solve the set of linear equations (Pivotisation)
            for (k=i+1;k&lt;n;k++)
                if (B[i][i]&lt;B[k][i])
                    for (j=0;j&lt;=n;j++)
                    {
                        double temp=B[i][j];
                        B[i][j]=B[k][j];
                        B[k][j]=temp;
                    }
    
        for (i=0;i&lt;n-1;i++)            //loop to perform the gauss elimination
            for (k=i+1;k&lt;n;k++)
                {
                    double t=B[k][i]/B[i][i];
                    for (j=0;j=0;i--)                //back-substitution
        {                        //x is an array whose values correspond to the values of x,y,z..
            a[i]=B[i][n];                //make the variable to be calculated equal to the rhs of the last equation
            for (j=0;j&lt;n;j++)
                if (j!=i)            //then subtract all the lhs values except the coefficient of the variable whose value                                   is being calculated
                    a[i]=a[i]-B[i][j]*a[j];
            a[i]=a[i]/B[i][i];            //now finally divide the rhs by the coefficient of the variable to be calculated
        }
        cout&lt;&lt;&quot;\nThe values of the coefficients are as follows:\n&quot;;
        for (i=0;i&lt;n;i++)
            cout&lt;&lt;&quot;x^&quot;&lt;&lt;i&lt;&lt;&quot;=&quot;&lt;&lt;a[i]&lt;&lt;endl;            // Print the values of x^0,x^1,x^2,x^3,....
        cout&lt;&lt;&quot;\nHence the fitted Polynomial is given by:\ny=&quot;;
        for (i=0;i&lt;n;i++)
            cout&lt;&lt;&quot; + (&quot;&lt;&lt;a[i]&lt;&lt;&quot;)&quot;&lt;&lt;&quot;x^&quot;&lt;&lt;i;
        cout&lt;&lt;&quot;\n&quot;;
        QVector transwerte;
        for (i=0;i&lt;n;i++){
            transwerte.append(a[i]);
        }
        return transwerte;
    }
    

    Strange is that if i comment the first line "cout <<N…..", the function gives all values as "nan". If I uncomment the line again, the following values are given:

    7 / 5 / 1 / 30

    The Normal(Augmented Matrix) is as follows:
    7 28 140 784 4676 29008 154
    28 140 784 4676 29008 184820 518
    140 784 4676 29008 184820 1.2003e+006 2254
    784 4676 29008 184820 1.2003e+006 7.9074e+006 11354
    4676 29008 184820 1.2003e+006 7.9074e+006 5.26668e+007 62374
    29008 184820 1.2003e+006 7.9074e+006 5.26668e+007 3.53816e+008 362498

    The values of the coefficients are as follows:
    x^0=3.36106e+251
    x^1=-6.36989e+251
    x^2=4.11283e+251
    x^3=-1.1805e+251
    x^4=1.54876e+250
    x^5=-7.57174e+248

    Hence the fitted Polynomial is given by:
    y= + (3.36106e+251)x^0 + (-6.36989e+251)x^1 + (4.11283e+251)x^2 + (-1.1805e+251)x^3 + (1.54876e+250)x^4 + (-7.57174e+248)x^5

    The coordinates for x are: 1,2,3,4,5,6,7
    The coordinates for y are: 30,29,27,25,20,15,9

    Thanks!

    Thomas

    1. Hi Thomas!
      I am really busy so I can’t really go through your code and pin-point the problem.
      BUT!
      I can give you some advice or hints as to what may be the problem.

      You see the problem is related to the way the arguments are being passed in the function.
      My code changes the value of ‘n’ to ‘n+1’ and maybe some more changes are there.
      And whenever one is changing the variables then you should use pass by reference and not pass by value.
      http://courses.washington.edu/css342/zander/css332/passby.html
      I suspect the problem has something to do with this.

      Also, tell me that do you pass the ‘degree’ parameter as a hard value like 3,4,5… or is it stored in some int variable and you pass that variable?
      If you do the latter, then please try the former method once.

      Good Day!

      1. Hello Manas

        Thank you really much for your kind reply.

        I Pass the degree as an integer from the calling routine. It works with value 5, but not with value 4 or 3.
        How can I pass it as a “hard” value?

        The program gives me the same code now all the time, but i dont think its correct. The values you showed me are reasonable, but not the ….e+259 values i received.
        Did you pass the same coordinates as i did?

        The coordinates for x are: 1,2,3,4,5,6,7
        The coordinates for y are: 30,29,27,25,20,15,9

        1. Yes I passed the exact same values. You can check out the app that I mentioned above. Even that is using the same code.

          Try running the above code without wrapping it up in a function.

          I checked by copy pasting the above code exactly and it gave the correct results.

    1. Also try this, run the code once, note down the results. Close your IDE. Then re-open your IDE and run the code again. Note down the results.

      Are the results same? (They should be)

      Now try running the code twice. If the results are different it tells you that some value was changed at run-time when you ran the ode last time. So make sure any changes that were made were reverted back.

      Thanks!

    2. Thomas,

      If you can describe a method to transmit a C/C++ header file, then I will furnish some code that has worked for 5th order line fit in a production tester. Higher order might be possible. I never tried any polynomials higher than a 5th order. The code originated with Mr. Sharma’s original code. I remedied some minor flaws, otherwise Mr. Sharma’s original code works as well as the line fit in Microsoft Excel.

  6. Crashing error:
    ASSERT failure in QVector::operator[]: “index out of range”, file D:\Qt\5.8\mingw53_32\include/QtCore/qvector.h, line 437

    1. try initializing the array ‘a[n]’ to zero when you declare it. It helped some people

  7. Hello,

    I run into some problems when I try to use this algorithm to fit a quadratic (n=2) into a very large data sets (87047 points).

    In my case I can average every 15 points and get the data set down to 5804 points and then the algorithm works. Here is my output:

    Large Data set (87047 points):
    The values of the coefficients are as follows:
    x^0=3.3591850981237339e+47
    x^1=-2.8915118623859456e+46
    x^2=4.6502491137785113e+44

    Averaged Data Set (5804 points):
    The values of the coefficients are as follows:
    x^0=-2.97164351167442
    x^1=-0.012392634053185785
    x^2=-0.0001107962989747693

    For now this works for me, but I just want to understand why this is happening? Is it possible that I am hitting a limit for the size of a double? Is this a floating point arithmetic thing or some other problem?

    I’d gladly send you the data set, but I’m not sure how to since its quite large.

    Let me know if you can help,
    Thanks,
    Markos

    1. Hi Markos,
      If it’s possible, email the data-set file to me at [email protected].
      And I’ll try to see and compare the results with other softwares.
      Although I gotta admit, that this code does have some mistakes in the Gaussian Elimination part, but the nature of those was such that they would never create a problem for the fitting application, so I never rectified those.
      I have recently written a way better program on Gaussian Elimination, you can check that out here: https://www.bragitoff.com/2018/02/gauss-elimination-c-program/
      Also, if you only need quadratic fit, then this program is an overkill for that. You can probably write a shorter and much simpler code for that. This method relies on matrices and their manipulations, which might introduce problems as the sizes of the matrices grows large due to the propagation of errors. Moreover the Gaussian Elimination process isn’t a 100% efficient either. For some pathological cases the errors pile up very fast.

      Hope this helps,
      Thanks,
      Manas Sharma

  8. Hey Manas,

    Thanks for the code! Here is a update I made to your code to make it compatible with C/C++ with dynamic sized arrays and I also made it a modular function so you don’t need to rely on user input. I also renamed your variables, sorry about that!

    void polyFit(int x[],int y[], int degree, int numOfPoints) {

    int i, j, k, polyFitDegree, numOfPointsGiven;
    polyFitDegree = degree;
    numOfPointsGiven = numOfPoints;
    // polyFitDegree is the degree of Polynomial
    double * sigmaX=(double *)malloc((2 * polyFitDegree + 1) * sizeof(double)); //Array that will store the values of sigma(xi),sigma(xi^2),sigma(xi^3)….sigma(xi^2n)
    for (i = 0; i < 2 * polyFitDegree + 1; i++) { sigmaX[i] = 0; for (j = 0; j < numOfPointsGiven; j++) sigmaX[i] = sigmaX[i] + pow(x[j], i); //consecutive positions of the array will store numOfPointsGiven,sigma(xi),sigma(xi^2),sigma(xi^3)….sigma(xi^2n) } double ** normalMatrix=(double **)malloc((polyFitDegree+2) * (polyFitDegree+1)* sizeof(double)); double * finalCoeff=(double *)malloc( (polyFitDegree + 1) * sizeof(double)); //normalMatrix is the Normal matrix(augmented) that will store the equations, finalCoeff is for value of the final coefficients for (i = 0; i

Leave a Reply

Your email address will not be published. Required fields are marked *