11

I am working with linear regression and I would like to know the Time complexity in big-O notation. The cost function of linear regression without an optimisation algorithm (such as Gradient descent) needs to be computed over iterations of the weight combinations (as a brute force approach). This makes computation time dependent on the number of weights and obviously on the number of training data.

If $n$ is the number of training data, $W$ is the number of weights and each resolution of the weight space is set to $m$ meaning that each weight will iterate through $m$ number of possible values. Then the time complexity of this linear regression is

$O(m^Wn)$

Is this correct?

user134132523
  • 169
  • 1
  • 4
  • 13

3 Answers3

7

In linear regression you have to solve $$ (X'X)^{-1}X'Y, $$ where $X$ is a $n\times p$ matrix. Now, in general the complexity of the matrix product $AB$ is O(abc) whenever $A$ is $a\times b$ and $B$ is $b\times c$. Therefore we can evaluate the following complexities:

a) the matrix product $X'X$ with complexity $O(p^2n)$.

b) the matrix-vector product $X'Y$ with complexity $O(pn)$.

c) the inverse $(X'X)^{-1}$ with complecity $O(p^3)$,

Therefore the complexity is $O(np^2 + p^3)$.

Car Loz
  • 171
  • 1
  • 2
3

It highly depends on the "solver" you are using.

Calling $n$ the number of observations and $p$ the number of weights, the overall complexity should be $n^2p+p^3$.

Indeed, when performing a linear regression you are doing matrices multiplication whose complexity is $n^2p$ (when evaluating $X'X$) and inverting the resulting matrix. It is now a square matrix with $p$ rows, the complexity for matrix inversion usually is $p^3$ (though it can be lowered).

Hence a theoretical complexity : $n^2p+p^3$.

Side notes

However, numerical simulations (using python's scikit library) seem to have a time complexity close to $n^{0.72} p^{1.3}$

This may be due to the fact that no implementation actually perform the full inversion (instead, the system can be solved using gradient descents) or due to the fact that there are other ways to calibrate the weights of a linear regression.

Source

An article from my blog : computational complexity of machine learning algorithms

RUser4512
  • 167
  • 4
0

Most of the answers above calculate the time complexity of an algorithm, not the complexity of the problem itself. Buyer beware.