Unit 3: Graphs 1 (BFS, DFS, Greedy)

Graphs provide the ultimate in data structure flexibility. Therefore, Graphs can model both real-world systems and abstract problems

Terminology and Representations

A graph consists of vertices and edges

Edges:

is a connection between a pair of vertices in
|E| is the number of edges
(directed or undirected)
if the graph is simple
Special case: Some graph applications require that a given pair of vertices can have multiple or parallel edges connecting them, or that a vertex can have an edge to itself

Vertices:

Two vertices are adjacent or neighbors if they are joined by an edge
An edge connecting Vertices and is written . Such an edge is said to be incident on Vertices and

Paths: A sequence of vertices forms a path of length if there exist edges from to for .

Simple path: A path is simple if all vertices on the path are distinct
The length of a path is the number of edges it contains

Cycle: is a path of length three or more that connects some vertex to itself

Simple cycle: A cycle is simple if the path is simple, except for the first and last vertices being the same

Subgraph is formed from graph by selecting a subset s of ’s vertices and a subset s of ’s edges such that for every edge in s, both of ’s vertices are in s.

Types of graphs:

Sparse: A graph with relatively few edges is called
Dense: A graph with many edges
Complete: A graph containing all possible edges

Directed graph (Digraph) - figure b: A graph with edges directed from one vertex to another
Undirected graph - figure a: A graph whose edges are not directed

Connected undirected graph: An undirected graph is connected if there is at least one path from any vertex to any other

Labeled graph (or Weighted graph) - figure c: A graph with labels or weights associated with its vertices
Acyclic graph: A graph w/o cycles
Directed acyclic graph (DAG): A directed graph w/o cycles

Free Tree: a connected, undirected graph with no simple cycles

or, it is a connected and has edges

Graph Representation

There are two commonly used methods for representing graphs:

Adjacency Matrix

Adjacency List

Adjacency Matrix

It is illustrated by figure (b) below

It is a array

Column in row is marked if there is an edge from to and is not marked otherwise

In labeled or weighted graphs, we put the weight or the distance between the two vertices instead of 1

Space cost:

The space requirements for the adjacency matrix are
The adjacency matrix requires space for each potential edge, whether it exists or not
We need only 1 bit to represent an edge
As the graph becomes denser, the adjacency matrix becomes relatively more space efficient

Asymptotic cost:

The adjacency matrix requires no overhead for pointers, which can be a substantial cost
The adjacency matrix often requires a higher asymptotic cost for an algorithm than would result if the adjacency list were used.

The reason is that it is common for a graph algorithm to visit each neighbor of each vertex.
The adjacency matrix must look at each of its potential edges, yielding a total cost of time
This is a considerable disadvantage when the graph is sparse, but not when the graph is closer to full

Time complexity:

Check if there is edge
List all neighbors of node
Traversing the whole graph

Adjacency List

It is illustrated by figure (c) below

It is an array of linked lists and it is a items long array

The position in the array stores a pointer to a linked list of edges for vertex

This linked list represents the edges by the vertices that are adjacent to vertex

The adjacency list is therefore a generalization of the “list of children” representation for trees

Space cost:

The space requirements depend on both the number of edges and the number of vertices in the graph
The cost is
We need bit to represent an edge
The adjacency list stores information only for those edges that actually appear in the graph
They are likely more space efficient in sparse graphs

Handshaking Lemma (undirected graph)

The degree of each vertex in HS Lemma is

Because we record in and out degrees

Undirected Graph vs. Directed Graph Representation

Both the adjacency matrix and the adjacency list can be used to store directed or undirected graphs.

Directed Graph:

The edge is recorded only once

Undirected Graph

The edge is recorded in both directions (twice).
From to and from to

This create symmetry in the matrix as shown in figure (b)

Implementation

ADT Specifications:

Vertices are defined by an integer index value.
Vertex can store any additional information (e.g. name)
ADT is not implemented using a template, because it is the Graph class users’ responsibility to maintain information related to the vertices themselves.
The Graph class need have no knowledge of the type or content of the information associated with a vertex, only the index number for that vertex

Methods:

n()returns the number of vertices
e()returns the number of edges
weight(int v0, int v1)returns the weight of a given edge identified by its two incident vertices
setEdge(unsigned int w)sets the weight of an edge

w should be larger than zero

delEdge()removes the edge from the graph
setMark(int v, int val)sets the mark value for a vertex (Mark array is used for traversal)
getMark(int v, int val)gets the mark value for a vertex

Code (from Shafer’s textbook)

Graph ADT (C++)


// Graph abstract class. This ADT assumes that the number
// of vertices is fixed when the graph is created.
class Graph {
	private:
		void operator =(const Graph&) {} // Protect assignment
		Graph(const Graph&) {} // Protect copy constructor
	public:
		Graph() {} // Default constructor
		virtual ˜Graph() {} // Base destructor
		// Initialize a graph of n vertices
		virtual void Init(int n) =0;
		// Return: the number of vertices and edges
		virtual int n() =0;
		virtual int e() =0;
		// Return v’s first neighbor
		virtual int first(int v) =0;
		// Return v’s next neighbor
		virtual int next(int v, int w) =0;
		// Set the weight for an edge
		// i, j: The vertices
		// wgt: Edge weight
		virtual void setEdge(int v1, int v2, int wght) =0;
		// Delete an edge
		// i, j: The vertices
		virtual void delEdge(int v1, int v2) =0;
		// Determine if an edge is in the graph
		// i, j: The vertices
		// Return: true if edge i,j has non-zero weight
		virtual bool isEdge(int i, int j) =0;
		// Return an edge’s weight
		// i, j: The vertices
		// Return: The weight of edge i,j, or zero
		virtual int weight(int v1, int v2) =0;
		// Get and Set the mark value for a vertex
		// v: The vertex
		// val: The value to set
		virtual int getMark(int v) =0;
		virtual void setMark(int v, int val) =0;
};

Graph Implementation using Adjacency Matrix (C++)


// Implementation for the adjacency matrix representation
class Graphm : public Graph {
	private:
		int numVertex, numEdge; // Store number of vertices, edges
		int **matrix; // Pointer to adjacency matrix
		int *mark; // Pointer to mark array

	public:
		Graphm(int numVert) // Constructor
			{ Init(numVert); }
		˜Graphm() { // Destructor
			delete [] mark; // Return dynamically allocated memory
			for (int i=0; i<numVertex; i++)
			delete [] matrix[i];
			delete [] matrix;
		}
		void Init(int n) { // Initialize the graph
			int i;
			numVertex = n;
			numEdge = 0;
			mark = new int[n]; // Initialize mark array
			for (i=0; i<numVertex; i++)
				mark[i] = UNVISITED;
			matrix = (int**) new int*[numVertex]; // Make matrix
			for (i=0; i<numVertex; i++)
				matrix[i] = new int[numVertex];
			for (i=0; i< numVertex; i++) // Initialize to 0 weights
				for (int j=0; j<numVertex; j++)
					matrix[i][j] = 0;
		}
		int n() { return numVertex; } // Number of vertices
		int e() { return numEdge; } // Number of edges
		
		// Return first neighbor of "v"
		int first(int v) {
			for (int i=0; i<numVertex; i++)
				if (matrix[v][i] != 0) return i;
			return numVertex; // Return n if none
		}
		// Return v’s next neighbor after w
		int next(int v, int w) {
			for(int i=w+1; i<numVertex; i++)
				if (matrix[v][i] != 0)
				return i;
			return numVertex; // Return n if none
		}
		// Set edge (v1, v2) to "wt"
		void setEdge(int v1, int v2, int wt) {
			Assert(wt>0, "Illegal weight value");
			if (matrix[v1][v2] == 0) numEdge++;
			matrix[v1][v2] = wt;
		}

		void delEdge(int v1, int v2) { // Delete edge (v1, v2)
			if (matrix[v1][v2] != 0) numEdge--;
			matrix[v1][v2] = 0;
		}
		bool isEdge(int i, int j) // Is (i, j) an edge?
		{ return matrix[i][j] != 0; }
		int weight(int v1, int v2) { return matrix[v1][v2]; }
		int getMark(int v) { return mark[v]; }
		void setMark(int v, int val) { mark[v] = val; }
};

Graph Implementation using Adjacency List (C++)


// Edge class for Adjacency List graph representation
class Edge {
	int vert, wt;
	public:
		Edge() { vert = -1; wt = -1; }
		Edge(int v, int w) { vert = v; wt = w; }
		int vertex() { return vert; }
		int weight() { return wt; }
};


class Graphl : public Graph {
	private:
		List<Edge>** vertex; // List headers
		int numVertex, numEdge; // Number of vertices, edges
		int *mark; // Pointer to mark array
	
	public:
		Graphl(int numVert)
			{ Init(numVert); }
		˜Graphl() { // Destructor
			delete [] mark; // Return dynamically allocated memory
			for (int i=0; i<numVertex; i++) delete [] vertex[i];
			delete [] vertex;
		}
		void Init(int n) {
			int i;
			numVertex = n;
			numEdge = 0;
			mark = new int[n]; // Initialize mark array
			for (i=0; i<numVertex; i++) mark[i] = UNVISITED;
				// Create and initialize adjacency lists
			vertex = (List<Edge>**) new List<Edge>*[numVertex];
			for (i=0; i<numVertex; i++)
				vertex[i] = new LList<Edge>();
		}
		int n() { return numVertex; } // Number of vertices
		int e() { return numEdge; } // Number of edges

		int first(int v) { // Return first neighbor of "v"
			if (vertex[v]->length() == 0)
				return numVertex; // No neighbor
			vertex[v]->moveToStart();
			Edge it = vertex[v]->getValue();
			return it.vertex();
		}

		// Get v’s next neighbor after w
		int next(int v, int w) {
			Edge it;
			if (isEdge(v, w)) {
				if ((vertex[v]->currPos()+1) < vertex[v]->length()) {
					vertex[v]->next();
					it = vertex[v]->getValue();
					return it.vertex();
				}
			}
			return n(); // No neighbor
		}

		// Set edge (i, j) to "weight"
		void setEdge(int i, int j, int weight) {
			Assert(weight>0, "May not set weight to 0");
			Edge currEdge(j, weight);
			if (isEdge(i, j)) { // Edge already exists in graph
				vertex[i]->remove();
				vertex[i]->insert(currEdge);
			}
			else { // Keep neighbors sorted by vertex index
				numEdge++;
				for (vertex[i]->moveToStart();
						vertex[i]->currPos() < vertex[i]->length();
						vertex[i]->next()) {
					Edge temp = vertex[i]->getValue();
					if (temp.vertex() > j) break;
				}
				vertex[i]->insert(currEdge);
			}
		}

		void delEdge(int i, int j) { // Delete edge (i, j)
			if (isEdge(i,j)) {
				vertex[i]->remove();
				numEdge--;
			}
		}

		bool isEdge(int i, int j) { // Is (i,j) an edge?
			Edge it;
			for (vertex[i]->moveToStart();
					vertex[i]->currPos() < vertex[i]->length();
					vertex[i]->next()) { // Check whole list
				Edge temp = vertex[i]->getValue();
				if (temp.vertex() == j) return true;
			}
			return false;
		}

		int weight(int i, int j) { // Return weight of (i, j)
			Edge curr;
			if (isEdge(i, j)) {
				curr = vertex[i]->getValue();
				return curr.weight();
			}
			else return 0;
		}

		int getMark(int v) { return mark[v]; }
		void setMark(int v, int val) { mark[v] = val; }
};

Graph Traversal

The most popular algorithms:

Depth First Search (DFS)

A recursive technique (using Stack)
It focuses on going deeper

Breadth First Search (BFS)

An iterative technique (using Queue)
It focuses on going parallel

These algorithms work with both, directed and undirected graphs

Usually, graph traversal algorithms have linear time complexity in terms of their nodes and edges

Depth First Search (DFS)

It is a recursive algorithm

We can implement it using Stack as an iterative algorithm instead

Goal: Systematically search through a graph

Idea: Mimic maze exploration

It is not all that useful by itself, but it shines when augmented to perform such tasks:

Count connected components
Determine connectivity
Find bridges/articulations points

It starts from a given node DFS(node) and continue recursively. This is a pseudo code:


DFS(node):
	if (node.neighbor == null AND node.mark == VISTED):
		return
	edge = node.pick_edge()
	edge.neighbor.mark = VISITED
	DFS(edge.neighbor)

C++ Code for Reachability Algorithm (by Ibrahim Moustafa)

Finding the reachable nodes starting from a node


void dfs(GRAPH &graph, int node, vector<bool> &visited) {
	visited[node] = true;

	for (int neighbour : graph[node]) {
		if (!visited[neighbour]) { // Avoid cycling
			cout << "\tWe can reach " << neighbour << "\n";
			dfs(graph, neighbour, visited); // Recursion
		}
	}
}

void reachability(GRAPH &graph) {
	int nodes = graph.size();
	for (int i = 0; i < nodes; ++i) {
		vector<bool> visited(nodes);	// RESET
		cout << "Reachable set of node " << i << "\n";
		dfs(graph, i, visited);
	}
}

Time Complexity

In Adjacency Matrix Implementation where is the number of vertices/nodes

Because we have to iterate in for all possible edges

In Adjacency List Implementation where is the number of edges and is the number of vertices/nodes.

This is more efficient than in matrix implementation, because the vertex is visited once at most

Space Complexity

We create a boolean array to store the visit marks for each node:
The recursion stack will have at most calls, so the total is

The code and complexity is the same for directed and undirected graphs

Breadth First Search (BFS)

Breadth First Search Algorithm | Shortest Path | Graph Theory

Breadth First Search (BFS) algorithm explanation video with shortest path code Algorithms repository: https://github.com/williamfiset/algorithms#graph-theory Video Slides: https://github.com/williamfiset/Algorithms/tree/master/slides/graphtheory ===================================== Practicing for interviews? I have used, and recommend `Cracking the Coding Interview` which got me a job at Google.

https://y.com.sb/watch?v=oDqjPvD54Ss

It is particularly useful for one thing: finding the shortest path on unweighted graphs

Time complexity:

Worst case

For a random graph, the time complexity is O(V+E): Breadth-first search
As stated in the link, according to the topology of your graph, O(E) may vary from O(V) (if your graph is acyclic) to O(V^2) (if all vertices are connected with each other).
Therefore the time complexity varies fromO(V + V) = O(V) to O(V + V^2) = O(V^2) according to the topology of your graph.
Besides, since |V| <= 2 |E|, then O(3E) = O(E) is also correct, but the bound is looser.

https://stackoverflow.com/a/18615593

It is implemented with the aid of a queue to record and trace all unvisited neighbors, and to track which node to visit next.

Current vertex is put in the head/front of the queue (ready to dequeue)
We enqueue unvisited neighbors of the current vertex

If one vertex is already existed in the queue, skip that (keep it)

After that, we dequeue the current vertex

This process runs until we run out of nodes in the queue.

Pseudocode


# global variables
n = number of nodes in the graph
g = adjacency list representing unweighted graph


# s = start node, e = end node
# 0 =< e
# s < n
function bfs(s, e):
	# do a BFS starting at node s
	prev = solve(s)

	# return reconstruct path from s -> e
	return reconstructPath(s, e, prev)


function solve(s):
	q = queue with support for enqueue and dequeue
	q.enqueue(s)
	
	# if the node is visited, then:
		# it has either been visited (dequeued)
		# or is being visited and on the queue (enqueued)
	visited = [false, ..., false] # size n
	visited[s] = true

	prev = [null, ..., null] # size n
	while !q.isEmpty():
		node = q.dequeue()
		neighbours = g.get(node)

		for(next : neighbours):
			if !visited[next]:
				q.enqueue(next)
				visited[next] = true
				prev[next] = node
	return prev


function reconstructPath(s, e, prev):
	# reconstruct path going backwards from e
	# prev was initialized with null values to indicate where the for loop should stop
	path = []
	for(at = e; at != null; at = prev[at]):
		path.add(at)

	path.reverse()
	
	# if s and e are connected return the path
	if path[0] == s:
		return path
	return []

Greedy

It is a strategy used to solve optimization problems

Optimization Problem: is the problem that requires either minimum result or maximum result

It aims to find the best solution at every stage based on the constraint defined in the problem

The constraint may be a condition

In this case, there will be multiple Feasible Solutions that satisfy the given condition
Example: 0-1 Knapsack

The constraint may be a statement asking for solving the problem at the minimum/maximum cost

In this case, there will be only one Optimal Solution that gives the best result
Example: Fractional Knapsack

Minimum Spanning Tree (MST)

12. Greedy Algorithms: Minimum Spanning Tree

MIT 6.046J Design and Analysis of Algorithms, Spring 2015 View the complete course: http://ocw.mit.edu/6-046JS15 Instructor: Erik Demaine In this lecture, Professor Demaine introduces greedy algorithms, which make locally-best choices without regards to the future. License: Creative Commons BY-NC-SA More information at http://ocw.mit.edu/terms More courses at http://ocw.mit.edu

https://y.com.sb/watch?v=tKwnms5iRBU&listen=false

Tree: Connected Acyclic Graph

Spanning Tree: A subset of edges of that form a tree that hit all vertices of

Minimum Spanning Tree:

Input: Given graph and edge weights ,
Output: A spanning tree that connect all the vertices of minimum weight (for simplicity, assuming that all weights are distinct)

This algorithm is important in distributed systems
Finding the MST in the Naïve method (testing all possible spanning trees) has an exponential time complexity