8

When calculating how runtime depends on the input, which operations are counted? For instance, I think I learned that array indexing and assignment statements don't get counted; why is that?

Raphael

2 Answers

6

When doing complexity calculations we generally use the RAM model. In this model we assume that array indexing takes $O(1)$ time, and that an assignment statement is just like writing a value to some variable in an array of variables (the machine's memory). This is just for convenience: it simplifies the analysis of the algorithm. On a real machine, array indexing takes roughly $O(\log I)$ time, where $I$ is the number of things being indexed.

We generally count the things whose number of executions depends on the size of the input, e.g. loops. Even if an operation inside a loop takes only $O(1)$ time, if it is executed $n$ times the algorithm runs in $O(n)$ time.

An $O(1)$ operation outside of a loop, on the other hand, takes only constant time, and $O(n) + O(1) = O(n)$.
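To make the counting explicit, here is a small sketch (my own illustration, with hypothetical step charges) of a linear scan under the RAM model; array indexing and assignment are each charged one unit, and the total is still $O(n)$ because the loop dominates.

```python
def max_with_step_count(arr):
    """Find the maximum of arr while counting unit-cost RAM operations.

    In the RAM model, each array access, comparison and assignment
    is charged one step, regardless of the value of the index.
    """
    steps = 0

    best = arr[0]              # one indexing + one assignment
    steps += 2

    for i in range(1, len(arr)):
        steps += 1             # loop counter update and bound test
        steps += 2             # indexing arr[i] + comparison with best
        if arr[i] > best:
            best = arr[i]      # indexing + assignment
            steps += 2

    return best, steps


# The step count grows linearly with len(arr), so the constant-cost
# setup before the loop is absorbed: O(n) + O(1) = O(n).
print(max_with_step_count([3, 1, 4, 1, 5, 9, 2, 6]))
```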

Read about the radix sort algorithm in CLRS.

Pratik Deoghare
5

The ultimate goal is "execution time in seconds" or, more generally (disregarding modern CPU features), "number of clock cycles". As it turns out, this is hard to analyse, and it is also machine- or at least instruction-set-specific. Therefore, it is usually not done.

The next level of abstraction is to precisely count all operations (of some assembly-style pseudocode), keeping the individual operation costs (in clock cycles) as parameters. Such analyses are found in Knuth's "The Art of Computer Programming", among others, so there certainly is a place for this kind of approach, even though it is hard and tends to become harder in the presence of memory hierarchies.
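For a rough illustration (my own toy example, not taken from Knuth or CLRS), consider a single loop that sums an array of $n$ numbers. Charging each kind of pseudocode operation its own symbolic cost gives

$$T(n) = c_{\text{init}} + (n+1)\,c_{\text{test}} + n\,(c_{\text{index}} + c_{\text{add}} + c_{\text{assign}} + c_{\text{inc}}),$$

where the loop test runs $n+1$ times and every other operation runs once per iteration. Substituting machine-specific cycle counts for the $c$'s yields an exact (model-level) cycle count for that machine.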

Last but not least -- and certainly most prominent -- is the analysis of dominant operations, ignoring constant factors and asymptotically vanishing contributions ("$O$-analysis"). The reasoning is that the runtime is bound to behave asymptotically like the number of most frequently executed operations, times some factor depending on the actual implementation and machine. This kind of analysis yields general results that apply to all machines (covered by the RAM model) and is easier to perform than the above. It can, however, lack specificity.
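Continuing the toy sum-of-$n$-numbers example from above: whatever the constants, the exact count has the form $T(n) = a\,n + b$ with machine-dependent $a, b > 0$, so $T(n) \in \Theta(n)$. The $O$-analysis keeps only this last statement and discards $a$ and $b$ -- which is exactly why it carries over to every RAM-like machine, but also why it cannot distinguish an implementation that is, say, twice as fast.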

Leaving "classic" frameworks behind, many algorithms are dominated by memory and/or communication cost, so in that case counting the number and volume of memory acceses resp. network transmissions is reasonable (and maybe sufficient).

Furthermore, keep in mind that we are often not that interested in the absolute performance of an algorithm but in comparing it to others. This, too, may inform the choice of the analysed parameter.

So you see, there is no one definite answer. Depending on the goals of the analysis at hand, different answers can be given.

See also my answer here for some related thoughts, and Sebastian's answer on Quicksort for an example.

Raphael