Unit 4: Graph 2 (Shortest Path, Minimum Spanning Tree)

 
Prog
Disc
LJ
 
📌
Algorithms Part II - Princeton University (Coursera)

DFS & BFS Recap

# goal: systemaltically search through a graph # idea: mimic maze exploration DFS (to visit a vertex v): Mark vertex v as visited Recursively visit all unmarked vertices adjacent to v
# compute shortest paths (fewest number of edges) from s to all other N vertices # time complexity: O(E+V) BFS (from source vertex s): Put s onto a FIFO queue and mark s as visited Repeat until the queue is empty: Remove the least recently added vertex v Add each of v's unvisited neighbors to the queue Mark them as visisted
 

Connected Components

  • A connected component is a maximal set of connected vertices
Connected Components: Initialize all vertices v as unmarked For each unmarked vertices: Run DFS (to identify all vertices discovered as part of the same components): Mark vertex v as visited Mark vertex v as belonged to the current component (Components_Num) Recursively visit all unmarked vertices adjacent to v Components_Num++ # number of components
 

Digraph

  • DFS enable direct solution of simple digraph problems
    • Reachability
    • Path finding
    • Topological sort
    • Directed cycle detection
 

Reachability

  • Program control-flow analysis: there are many usages in program compilation and runtime
    • notion image
  • Mark-sweep algorithm
    • Mark: mark all reachable objects
    • Sweep: if object is unmarked, it is garbage (so add to free list)
      • notion image
 
  • BFS is used to find the shortest paths.
    • It is useful in web crawling
    • notion image
      notion image
       
       

Minimum Spanning Tree

notion image
  • MST size (# of edges) is
📌
MST is fundamental problem with diverse applications.
・Dithering. ・Cluster analysis. ・Max bottleneck paths. ・Real-time face verification. ・LDPC codes for error correction. ・Image registration with Renyi entropy. ・Find road networks in satellite and aerial imagery. ・Reducing data storage in sequencing amino acids in a protein. ・Model locality of particle interactions in turbulent fluid flows. ・Autoconfig protocol for Ethernet bridging to avoid cycles in a network. ・Approximation algorithms for NP-hard problems (e.g., TSP, Steiner tree). ・Network design (communication, electrical, hydraulic, computer, road).
 
 
 
  • There are many algorithms to compute the MST

Greedy MST Algorithm

  • Greedy algorithms seek to find the best local solution (in the short term)
  • Cut edges and choose the edge with the minimum weight
 

Kruskal’s Algorithm

  • It is a special case of greedy algorithm
Kruskal Algo (weighted graph G): Sort edges in G ascending order of weight For each edge in the ascending order: Add next edge to tree T unless doing so would create a cycle Return T
  • How to find if adding an edge e would create a cycle? The efficient solution is to use union-find data structure
    • Maintain a set for each connected component in
    • If and are in the same set, then adding would create a cycle
    • To add to , merge sets containing and
    • notion image
  • Java implementation
    • Queue for holding the MST edges
    • Priority Queue (Minimum Queue) for sorting the edges
    • Union-Find for maintaining the connected components (prevent cycles)
    • notion image
 
  • Running time is in the worst case
    • notion image
 

Prim’s Algorithm

  • It is a special case of greedy algorithm
Prim (weighted edge graph G): Start with vertex 0 and greedily grow tree T Add to T the min weight edge with exactly one endpoint in T Repeat until V-1 edges Return T
  • How to find if adding an edge e has the min weight with exactly one endpoint in T? There are two solutions/implementations for that:
    • Lazy implementation
    • Eager implementation
 

Lazy Implementation

Lazy Prim (weighted edge graph G): Start with vertex 0 and greedily grow tree T Create a Priority Queue (PQ) sorted by the edges weights Add all edges adjacent to the vertex v to the queue From the PQ, add to T the min weight edge with exactly one endpoint in T If e has more than one endpoint in T then delete it Repeat until V-1 edges Return T
 
  • Java Implementation
    • Queue for holding the MST edges
    • Priority Queue (Minimum Queue) for sorting the edges
    • Boolean Matrix for marking visited vertices
notion image
notion image
 
  • Running Time: it computes the MST in time proportional to
  • Space Cost: it needs extra space proportional to
notion image
 
 

Eager Implementation

  • In this implementation we need to create an indexed PQ (special implementation of PQ)
notion image
notion image
notion image
 
📌
Prim’s algo can be used to find the Euclidean MST in runtime
📌
MST is used in clustering (K-Clustering & Decision Tree)
 

Shortest Paths

  • Shortest Paths algorithms work in Directed Weighted Graphs
  • Variants:
    • Source-sink: from one vertex to another
    • Single source: from vertex to every other
    • All pairs: between all pairs
  • Restrictions on edge weights:
    • Non-negative weights
    • Arbitrary weights
    • Euclidean weights
 

Single-source shortest paths API

  • Goal: find the shortest path from s to every other vertex
public class SP { SP(EdgeWeightedDigraph G, int s) {...} double distanceTo(int v) {...} Iterable<DirectedEdge> pathTo(int v) {...} }
  • Basic SSSP algorithms:
    • Bellman-Ford:
      • Slow and steady
      • Works on any graph, any weight
      • Detects negative weight cycles
    • Dijkstra
      • Faster
      • No negative edge weights
    • Directed Acyclic Graph (DAG)
      • Negative edge weights is okay

Edge Relaxation

For start vertex s:
  • If you find a path from s → u of total weight x, the shortest path from s → u has no more than x
  • If there is a path from s → u of total weight x, and an edge form u → v of weight y, there is a path from s → v of total weight x+y
This is called Relaxing the Edge (meaning: relax, there are no edges shorter than that)
Relax(e): u = e.from() v = e.to() If (u.d + weight(e) < v.d): v.d = u.d + weight(e) v.parent = u
 

Generic Single-Source Shortest Path Algorithm

  • The following is a pseudocode represents a generalization of SSSP algorithm
Generic SSSP Algorithm(G, s): SSSP Initialize(G, s) For each edge e in G (possibly repeating edges) in same order: Relax(e) // find shorter valid path to each vertex Relax(e): u = e.from() v = e.to() // don't change the edge endpoints' d and parent values if // the new path is longer or the same distance If (u.d + weight(e) < v.d): v.d = u.d + weight(e) v.parent = u SSSP Initialize(G, s): For all v in G: v.d = infinity // d stores an upperbound distance estimate to reach the vertex v v.parent = None // parent stores the parent node used on the path of length d s.d = 0 Find Path(s, v): If v == s: return new List().add(v) if v.parent == None: return None return Find Path(s, v.parent).add(v)
 
  • The generic algorithm builds a data structure called short path tree

Bellman-Ford Algorithm

  • An SSSP algorithm that:
    • Is slow and steady
      • It has a time complexity of O(EV) (worst than Dijkstra)
      • This makes it not ideal for most SSSP problems
    • Works on any graph, any edge weight
    • Detects reachable negative weight cycles (which Dijkstra’s algorithm cannot). In the following figure S→G→H is a reachable negative cycle and the others are not.
      • notion image
       
  • This algorithm runs through 2 iterations (for each vertex in the graph):
      1. First iteration: apply relaxation function on each edge
      1. Second iteration: detect negative cycles and mark the nodes that are part of these cycles
  • Let’s implement the Bellman-Ford algorithm based on the generic SSSP algorithm
SSSP Initialize(G, s): For all v in G: v.d = infinity // d stores an upperbound distance estimate to reach the vertex v v.parent = None // parent stores the parent node used on the path of length d s.d = 0 Relax(e, NegativeCycle): u = e.from() v = e.to() // don't change the edge endpoints' d and parent values if // the new path is longer or the same distance If (u.d + weight(e) < v.d): // if relaxation is used to detect nodes in negative cycles If NegativeCycle == True: v.d = Negative Infinity Else: v.d = u.d + weight(e) v.parent = u Bellman-Ford Algorithm(Digraph G, s): SSSP Initialize(G, s) For each vertex v in G: For each edge e in G: Relax(e, False) // find shorter valid path to each vertex // run algorithm a second time to detect which nodes // are part of a negative cycle. A negative cycle has occured if we // can find a better path beyond the optimal solution For each vertex v in G: For each edge e in G: Relax(e, True) // build an array containing the shortest distances to every node distances = [] For each vertex v in G: distances.add(v.d) return distances
  • Runtime analysis:
     

    Resources