Graph Algorithms Explained: Traversal, Shortest Paths, and Network Analysis
What Is a Graph?
A graph G is defined as a pair (V, E) where V is a set of vertices and E is a set of edges connecting pairs of vertices. Graphs can be directed, where edges have a direction (like one-way streets), or undirected, where edges go both ways (like friendship relationships). Edges can be weighted, carrying numerical values representing costs, distances, or capacities, or unweighted, where all connections are treated equally.
Graphs are stored in computers using two main representations. An adjacency matrix is a two-dimensional array where entry (i, j) indicates whether an edge exists between vertices i and j. This representation uses O(V squared) space and allows O(1) edge lookup but wastes memory on sparse graphs. An adjacency list stores, for each vertex, a list of its neighbors. This uses O(V + E) space and is more efficient for sparse graphs, which are the norm in most real-world applications. Social networks, road maps, and the internet are all sparse graphs where each node connects to a tiny fraction of all other nodes.
Graph Traversal
Breadth-First Search (BFS)
BFS explores a graph layer by layer, visiting all vertices at distance 1 from the starting vertex before visiting vertices at distance 2, and so on. It uses a queue data structure to track which vertices to visit next. BFS finds the shortest path (in terms of number of edges) between any two vertices in an unweighted graph. It runs in O(V + E) time, visiting each vertex and edge exactly once.
BFS has many applications beyond simple traversal. It determines whether a graph is connected, finds the shortest path in unweighted graphs, detects bipartiteness (whether a graph can be two-colored so no adjacent vertices share a color), and computes the diameter of a graph. Web crawlers use BFS to discover pages, and social network analysis uses BFS to compute degrees of separation between users.
Depth-First Search (DFS)
DFS explores as far as possible along each branch before backtracking, using a stack (or recursion) to track the path. Starting from a vertex, DFS visits an unvisited neighbor, then visits one of that neighbor's unvisited neighbors, continuing until it reaches a dead end, at which point it backtracks to the most recent vertex with unvisited neighbors. DFS also runs in O(V + E) time.
DFS is the foundation for many important graph algorithms. It detects cycles in directed and undirected graphs. It performs topological sorting of directed acyclic graphs (DAGs), which is essential for scheduling tasks with dependencies, build systems, and course prerequisite planning. DFS identifies strongly connected components in directed graphs using Tarjan algorithm or Kosaraju algorithm. It also solves maze problems, generates mazes, and checks graph connectivity.
Shortest Path Algorithms
Dijkstra Algorithm
Dijkstra algorithm finds the shortest path from a single source vertex to all other vertices in a graph with non-negative edge weights. It maintains a set of vertices whose shortest distances are known and repeatedly selects the nearest unvisited vertex, updating the distances of its neighbors. Using a binary heap priority queue, Dijkstra algorithm runs in O((V + E) log V) time. With a Fibonacci heap, it achieves O(E + V log V), which is faster for dense graphs.
Dijkstra algorithm powers GPS navigation systems, network routing protocols like OSPF, and any application requiring shortest paths in networks with non-negative costs. Its limitation is that it cannot handle negative edge weights, where the Bellman-Ford algorithm is needed instead.
Bellman-Ford Algorithm
The Bellman-Ford algorithm handles graphs with negative edge weights by relaxing all edges V-1 times, where V is the number of vertices. It runs in O(VE) time, which is slower than Dijkstra, but it correctly computes shortest paths even with negative weights and can detect negative-weight cycles (cycles whose total weight is negative, making shortest paths undefined). Currency arbitrage detection and certain network routing scenarios use Bellman-Ford for this capability.
Floyd-Warshall Algorithm
Floyd-Warshall computes shortest paths between all pairs of vertices simultaneously using dynamic programming. It runs in O(V cubed) time and O(V squared) space. While slower than running Dijkstra from each vertex for sparse graphs, Floyd-Warshall is simpler to implement and handles negative weights (though not negative cycles). It is used when shortest paths between all pairs are needed, such as computing the center of a network or precomputing a distance matrix for clustering.
Minimum Spanning Trees
A minimum spanning tree (MST) connects all vertices in a weighted undirected graph with the minimum total edge weight and no cycles. MSTs have applications in network design, clustering, and approximation algorithms for other graph problems.
Kruskal algorithm builds the MST by sorting all edges by weight and adding them one by one, skipping any edge that would create a cycle. Cycle detection uses a union-find (disjoint set) data structure, making the algorithm run in O(E log E) time, dominated by the sorting step.
Prim algorithm builds the MST by starting from an arbitrary vertex and repeatedly adding the cheapest edge that connects a vertex in the tree to a vertex outside it. Using a priority queue, Prim algorithm runs in O(E log V) time. Prim tends to be faster on dense graphs, while Kruskal is simpler and often preferred for sparse graphs.
Advanced Graph Algorithms
Topological sorting orders the vertices of a directed acyclic graph so that for every edge from vertex u to vertex v, u appears before v in the ordering. This ordering is essential for dependency resolution: compiling source code files in the right order, scheduling tasks with prerequisites, and resolving package dependencies in software systems. Both DFS-based and BFS-based (Kahn algorithm) approaches compute topological orderings in O(V + E) time.
Network flow algorithms compute the maximum amount of material that can flow through a network from a source to a sink, respecting capacity constraints on edges. The Ford-Fulkerson method repeatedly finds augmenting paths from source to sink and pushes flow along them. The Edmonds-Karp variant uses BFS to find augmenting paths, guaranteeing O(VE squared) time. Network flow has applications in transportation logistics, telecommunication network design, image segmentation, and bipartite matching problems like assigning workers to tasks.
Strongly connected components are maximal subsets of vertices in a directed graph where every vertex is reachable from every other vertex in the subset. Tarjan algorithm finds all strongly connected components in O(V + E) time using a single DFS traversal. This decomposition simplifies analysis of directed graphs and is used in compiler optimization, model checking, and social network analysis.
Graph algorithms provide the mathematical tools to analyze networks and relationships. From BFS and DFS for traversal to Dijkstra and Bellman-Ford for shortest paths to Kruskal and Prim for spanning trees, these algorithms solve problems that arise wherever connected structures exist, in transportation, communication, social systems, and computing infrastructure itself.