I have implemented two data structures and want to compare their performance within a given algorithm; specifically, I am running Dijkstra's algorithm with binary and Fibonacci heaps. If the tested sparse digraphs have between 100 and 1000 vertices, how many times should I execute my program for a single value of n? And how do I know that the empirical performance differences I observe between the data structures are not due to chance?
1 Answer
You can use the program ministat to get statistical information about runtime differences. This blog post gives an example of its usage.
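If you would rather do the statistical check in code, a permutation test over the two runtime samples is a simple way to estimate how likely the observed difference is under the null hypothesis that both data structures perform the same. The sketch below is a minimal illustration, assuming you have already collected the runtimes of repeated runs into two lists; the sample data is made up.

```python
import random

def permutation_test(a, b, trials=10000, seed=0):
    """Estimate the probability that a difference in mean runtime at least
    as large as the observed one arises by chance alone."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = a + b
    count = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        if abs(sum(left) / len(left) - sum(right) / len(right)) >= observed:
            count += 1
    return count / trials  # approximate p-value

# Hypothetical runtimes (seconds) from repeated runs on the same instance.
binary_heap_runs = [0.52, 0.50, 0.53, 0.51, 0.54]
fibonacci_heap_runs = [0.61, 0.58, 0.60, 0.62, 0.59]
print(permutation_test(binary_heap_runs, fibonacci_heap_runs))
```

A small p-value suggests the runtime gap is unlikely to be noise; a large one means you should collect more runs before drawing conclusions.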
But in general, comparing runtimes is difficult and very machine dependent. Ideally you would find a performance measure that does not vary so much between processors, for example counting characteristic operations (key comparisons, insertions, extract-min or decrease-key calls) rather than wall-clock time.
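As one possible way to obtain such a machine-independent measure, the sketch below instruments a plain binary-heap Dijkstra (using Python's heapq with lazy deletion in place of decrease-key) and counts pushes and pops; the graph representation and counter names are my own choices, not anything prescribed in the question.

```python
import heapq

def dijkstra_with_counts(graph, source):
    """graph: dict mapping vertex -> list of (neighbour, weight) pairs.
    Returns (distances, operation counts), so runs on different machines
    can be compared by the amount of work done rather than wall-clock time."""
    dist = {source: 0}
    counts = {"push": 0, "pop": 0}
    heap = [(0, source)]
    counts["push"] += 1
    while heap:
        d, u = heapq.heappop(heap)
        counts["pop"] += 1
        if d > dist.get(u, float("inf")):
            continue  # stale entry (lazy deletion instead of decrease-key)
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
                counts["push"] += 1
    return dist, counts

# Tiny example graph.
g = {0: [(1, 2), (2, 5)], 1: [(2, 1)], 2: []}
print(dijkstra_with_counts(g, 0))
```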
You should also take care that your input instances are well chosen. Random instances can behave very differently from real-world instances, but the latter are typically hard to come by. Try to model the "interesting" properties of real-world instances and generate inputs that share those properties. You can also scrutinise the data structures and try to construct instances that trigger worst-case behaviour, but that is usually hard as well.
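For completeness, here is one way to generate sparse random digraphs of a given size; it is only a starting point since, as noted above, purely random instances may not reflect the structure of real-world inputs. The average out-degree and weight range are assumptions of mine.

```python
import random

def random_sparse_digraph(n, avg_out_degree=4, max_weight=100, seed=0):
    """Generate a sparse digraph as an adjacency dict:
    vertex -> list of (neighbour, weight) pairs."""
    rng = random.Random(seed)
    graph = {u: [] for u in range(n)}
    m = n * avg_out_degree  # total number of arcs
    edges = set()
    while len(edges) < m:
        u = rng.randrange(n)
        v = rng.randrange(n)
        if u != v and (u, v) not in edges:
            edges.add((u, v))
            graph[u].append((v, rng.randint(1, max_weight)))
    return graph

# Instances for n = 100 .. 1000, as in the question.
instances = {n: random_sparse_digraph(n) for n in range(100, 1001, 100)}
```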