Consider the following two implementations of dijkstra's algorithm:
Using set:
// V - total number of vertices
// S - source node
void dijkstra(int V, vector<vector<int>> edge[], int S)
{
vector<int> dist(V, 1e9);
dist[S] = 0;
set<pair<int, int>> s;
for (int i=0; i<V; i++)
s.insert({dist[i], i}); // log(V) time
while (!s.empty()) // exactly V times
{
auto top = *(s.begin()); // constant time
int dis = top.first;
int node = top.second;
s.erase(top); // amortised constant time
for (auto it: edge[node]) // For all the iterations of outer while loop this will sum up to E, where E is the total number of edges in the graph
{
int nb = it[0];
int edge_weight = it[1];
if (dist[nb] > dis + edge_weight)
{
s.erase({dist[nb], nb}); // log(V) time
dist[nb] = dis + edge_weight;
s.insert({dist[nb], nb}); // log(V) time
}
}
}
}
Using priority queue:
// V - total number of vertices
// S - source node
void dijkstra(int V, vector<vector<int>> edge[], int S)
{
vector<int> dist(V, 1e9);
dist[S] = 0;
priority_queue<pair<int, int> , vector<pair<int, int>>, greater<pair<int, int>>> pq;
pq.push({dist[S], S});
while (!pq.empty()) // Can be more than V times, let's call this heap_size times
{
int node = pq.top().second; // O(1) time
pq.pop(); // log(heap_size) time
for (int i=0; i<edge[node].size(); i++) // Can be approximated to (E/V), where E is the total number of edges in the graph
{
int nb = edge[node][i][0];
int edge_weight = edge[node][i][1];
if (dist[nb] > dist[node] + edge_weight) {
dist[nb] = dist[node] + edge_weight;
pq.push({dist[nb], nb}); // log(heap_size) time
}
}
}
return dist;
}
Finding the time complexity using set based approach is easy as the number of elements in set is exactly V(number of vertices) and the inner for loop runs for every edge, so it's time complexity is O(V*log(V) + V + E*log(V)) which is equivalent to O(E*log(V)) (reason: What's the time complexity of Dijkstra's Algorithm)
But I have trouble in understanding the time complexity of the priority_queue approach. Here the same node can be present multiple times in the priority_queue with different distances. How do I calculate the upper bound on the number of nodes that are added to the heap?
I also want to decide which implementation to use based on the nature of the graph(sparse vs dense), are both these implementations equivalent for any type of graph?