How to get the slope of a linear regression line using C++?

I need to obtain the slope of a linear regression, similar to the way the Excel function described at the link below is implemented:

http://office.microsoft.com/en-gb/excel-help/slope-function-HP010342903.aspx

Is there a library in C++ or a simple coded solution someone has created which can do this?

I have implemented code according to the following formula (taken from http://easycalculation.com/statistics/learn-regression.php), but it does not always give me the correct results:

Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX^2 - (ΣX)^2)
         = ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)^2)
         = (5798.5 - 5784.6) / (96795 - 96721)
         = 13.9 / 74
         = 0.19

When I try it against the following vectors, I get the wrong result (I expect 0.305556): x = 6,5,11,7,5,4,4 y = 2,3,9,1,8,7,5
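
For reference, a hand calculation with the same formula on these vectors does give the expected value, which suggests the formula itself is sound and that the discrepancy comes from the implementation. The sums below were computed from the two vectors above:

```latex
\begin{aligned}
N &= 7, \quad \sum X = 42, \quad \sum Y = 35, \quad \sum XY = 221, \quad \sum X^2 = 288 \\
b &= \frac{N \sum XY - (\sum X)(\sum Y)}{N \sum X^2 - (\sum X)^2}
   = \frac{7 \cdot 221 - 42 \cdot 35}{7 \cdot 288 - 42^2}
   = \frac{1547 - 1470}{2016 - 1764}
   = \frac{77}{252} \approx 0.305556
\end{aligned}
```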

Thanks in advance.



Solution 1:[1]

Why not just write some simple code like this (certainly not the best solution, just an example based on the help article):

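#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y){
    if(x.size() != y.size()){
        throw std::invalid_argument("...");
    }
    std::size_t n = x.size();

    // Means of x and y; the 0.0 makes accumulate sum in double
    double avgX = std::accumulate(x.begin(), x.end(), 0.0) / n;
    double avgY = std::accumulate(y.begin(), y.end(), 0.0) / n;

    double numerator = 0.0;
    double denominator = 0.0;

    // Sum of cross-deviations and sum of squared x deviations
    for(std::size_t i=0; i<n; ++i){
        numerator += (x[i] - avgX) * (y[i] - avgY);
        denominator += (x[i] - avgX) * (x[i] - avgX);
    }

    // All x values are equal (or there is no data): the slope is undefined
    if(denominator == 0.0){
        throw std::invalid_argument("...");
    }

    return numerator / denominator;
}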

Note that the third argument of accumulate must be 0.0 rather than 0. Otherwise the accumulator type is deduced as int, every partial sum is truncated to an integer, and the result of the accumulate calls will very likely be wrong (it is actually wrong with MSVC2010 and mingw-w64 when passing 0 as the third parameter).
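
To illustrate the point about the third argument, here is a minimal sketch. The two function names are made up for this example; the only difference between them is the type of the initial value passed to std::accumulate.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: the initial value 0 is an int, so the running total
// is an int and each partial sum loses its fractional part.
double sum_with_int_init(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Hypothetical helper: the initial value 0.0 is a double, so the running
// total keeps its fractional part.
double sum_with_double_init(const std::vector<double>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0);
}
```

For a vector holding four values of 0.5, the first function returns 0 and the second returns 2.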

Solution 2:[2]

Here is a C++11 implementation:

#include <algorithm>
#include <iostream>
#include <numeric>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y) {
    const auto n    = x.size();
    const auto s_x  = std::accumulate(x.begin(), x.end(), 0.0);
    const auto s_y  = std::accumulate(y.begin(), y.end(), 0.0);
    const auto s_xx = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    const auto s_xy = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    const auto a    = (n * s_xy - s_x * s_y) / (n * s_xx - s_x * s_x);
    return a;
}

int main() {
    std::vector<double> x{6, 5, 11, 7, 5, 4, 4};
    std::vector<double> y{2, 3, 9, 1, 8, 7, 5};
    std::cout << slope(x, y) << '\n';  // outputs 0.305556
}

You can add a test for the mathematical requirements (x.size() == y.size() and x is not constant) or, as the code above does, assume that the caller will take care of that.
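
If you do want those checks, one possible sketch is shown here. The name checked_slope, the choice of std::invalid_argument, and the exact comparison of the denominator against zero are assumptions made for this example, not part of the answer's code.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper: same formula as slope(), plus input validation.
double checked_slope(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same length");
    }
    if (x.size() < 2) {
        throw std::invalid_argument("at least two points are required");
    }
    const double n     = static_cast<double>(x.size());
    const double s_x   = std::accumulate(x.begin(), x.end(), 0.0);
    const double s_y   = std::accumulate(y.begin(), y.end(), 0.0);
    const double s_xx  = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    const double s_xy  = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    const double denom = n * s_xx - s_x * s_x;
    // Assumption: an exact zero test is good enough to detect constant x;
    // data with nearly constant x may need a tolerance instead.
    if (denom == 0.0) {
        throw std::invalid_argument("x must not be constant");
    }
    return (n * s_xy - s_x * s_y) / denom;
}
```

Throwing is only one option; returning NaN or an error code would work equally well, depending on how the caller wants to handle bad input.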

Solution 3:[3]

The following is a templated function I use for linear regression (fitting). It takes a std::vector of y values; the x values are taken to be the indices 0, 1, 2, ...

#include <cstddef>
#include <vector>

template <typename T>
std::vector<T> GetLinearFit(const std::vector<T>& data)
{
    // The x values are the indices 0, 1, 2, ...; T should be a floating-point type.
    T xSum = 0, ySum = 0, xxSum = 0, xySum = 0, slope, intercept;
    const T n = static_cast<T>(data.size());
    for (std::size_t i = 0; i < data.size(); i++)
    {
        const T x = static_cast<T>(i);
        xSum += x;
        ySum += data[i];
        xxSum += x * x;
        xySum += x * data[i];
    }
    slope = (n * xySum - xSum * ySum) / (n * xxSum - xSum * xSum);
    intercept = (ySum - slope * xSum) / n;
    std::vector<T> res;
    res.push_back(slope);
    res.push_back(intercept);
    return res;
}

The function returns a vector with the first element being the slope, and the second element being the intercept of your linear regression.

Example usage:

std::vector<double> myData;
myData.push_back(1);
myData.push_back(3);
myData.push_back(4);
myData.push_back(2);
myData.push_back(5);

std::vector<double> linearReg = GetLinearFit(myData);
double slope = linearReg[0];
double intercept = linearReg[1];

Note that the function assumes the x values are simply the indices 0, 1, 2, ... of the data (which is what I needed). You can change that in the function if you wish.
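
As a rough sketch of that change, the variant below takes explicit x values. The names LinearFit and GetLinearFitXY, the struct return type, and the error handling are all assumptions made for this example; T is assumed to be a floating-point type.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical result type; the original returns a two-element std::vector.
template <typename T>
struct LinearFit {
    T slope;
    T intercept;
};

// Hypothetical variant of GetLinearFit that uses caller-supplied x values.
template <typename T>
LinearFit<T> GetLinearFitXY(const std::vector<T>& x, const std::vector<T>& y)
{
    if (x.size() != y.size() || x.empty()) {
        throw std::invalid_argument("x and y must be non-empty and the same length");
    }
    const T n = static_cast<T>(x.size());
    T xSum = 0, ySum = 0, xxSum = 0, xySum = 0;
    for (std::size_t i = 0; i < x.size(); i++)
    {
        xSum += x[i];
        ySum += y[i];
        xxSum += x[i] * x[i];
        xySum += x[i] * y[i];
    }
    const T denom = n * xxSum - xSum * xSum;
    if (denom == 0) {
        throw std::invalid_argument("x must not be constant");
    }
    const T slope = (n * xySum - xSum * ySum) / denom;
    const T intercept = (ySum - slope * xSum) / n;
    return {slope, intercept};
}
```

Called with x = 0, 1, 2, 3, 4 and the y values from the usage example (1, 3, 4, 2, 5), it gives a slope of 0.7 and an intercept of 1.6.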

Solution 4:[4]

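I had to create a similar function, but I needed it to handle a number of near-vertical lines. I started with Cassio Neri's code and modified it so that, when the slope appears to be steeper than 1, it recalculates the slope after mirroring each point across the line x=y (which is done simply by swapping the x and y values). It then mirrors the result back by inverting it and returns a more accurate slope. Note that the inverted value is the reciprocal of the least-squares slope of x on y, so it matches the ordinary slope of y on x exactly only when the points are collinear.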

#include <algorithm>
#include <cmath>  // std::abs for double
#include <iostream>
#include <numeric>
#include <vector>

double slope(const std::vector<double>& x, const std::vector<double>& y) {

    const double n     = x.size();
    const double s_x   = std::accumulate(x.begin(), x.end(), 0.0);
    const double s_y   = std::accumulate(y.begin(), y.end(), 0.0);
    const double s_xx  = std::inner_product(x.begin(), x.end(), x.begin(), 0.0);
    const double s_xy  = std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
    const double numer = n * s_xy - s_x * s_y;  // Symmetric in x and y, so unchanged by the inversion
    const double denom = n * s_xx - s_x * s_x;  // Will change if inverted; use this for now
    double       a;

    if (denom == 0) a = 2;  // If slope is vertical, force variable inversion calculation
    else a = numer / denom;

    if (std::abs(a) > 1) {  // Redo with variable inversion if slope is steeper than 1
        const double s_yy = std::inner_product(y.begin(), y.end(), y.begin(), 0.0);
        const double new_denom = n * s_yy - s_y * s_y;
        a = new_denom / numer;  // Invert the fraction because we've mirrored it around x=y
    }
    return a;
}

int main() {
    std::vector<double> x{6, 5, 11, 7, 5, 4, 4};
    std::vector<double> y{2, 3, 9, 1, 8, 7, 5};
    std::cout << slope(x, y) << '\n';
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 radical7
Solution 2 Cassio Neri
Solution 3
Solution 4 3000farad