Design pattern for graph processing.
Since we consider a large number of graph-processing algorithms, our initial design goal is to decouple our implementations from the graph representation. To do so, we develop, for each given task, a task-specific class so that clients can create objects to perform the task. Generally, the constructor does some preprocessing to build data structures so as to efficiently respond to client queries. A typical client program builds a graph, passes that graph to an algorithm implementation class (as argument to a constructor), and then calls client query methods to learn various properties of the graph. We use the term source to distinguish the vertex provided as argument to the constructor from the other vertices in the graph.
Depth-first search
To search a graph, invoke a recursive method that visits vertices. To visit a vertex:
■ Mark it as having been visited.
■ Visit (recursively) all the vertices that are adjacent to it and that have not yet been marked.
This method is called depth-first search (DFS). DFS marks all the vertices connected to a given source in time proportional to the sum of their degrees.
public class DepthFirstSearch { private boolean[] marked; // marked[v] = is there an s-v path? private int count; // number of vertices connected to s // single source public DepthFirstSearch(Graph G, int s) { marked = new boolean[G.V()]; dfs(G, s); } // depth first search from v private void dfs(Graph G, int v) { count++; marked[v] = true; for (int w : G.adj(v)) { if (!marked[w]) { dfs(G, w); } } } // is there an s-v path? public boolean marked(int v) { return marked[v]; } // number of vertices connected to s public int count() { return count; } }
Finding paths
The single-source paths problem is fundamental to graph processing, below is a DFS-based implementation of Paths that extends the DepthFirstSearch.
public class DepthFirstPaths { private boolean[] marked; // marked[v] = is there an s-v path? private int[] edgeTo; // edgeTo[v] = last edge on s-v path private final int s; // source vertex public DepthFirstPaths(Graph G, int s) { this.s = s; edgeTo = new int[G.V()]; marked = new boolean[G.V()]; dfs(G, s); } // depth first search from v private void dfs(Graph G, int v) { marked[v] = true; for (int w : G.adj(v)) { if (!marked[w]) { edgeTo[w] = v; dfs(G, w); } } } // is there a path between s and v? public boolean hasPathTo(int v) { return marked[v]; } // return a path between s to v; null if no such path public Iterable<Integer> pathTo(int v) { if (!hasPathTo(v)) return null; Stack<Integer> path = new Stack<Integer>(); for (int x = v; x != s; x = edgeTo[x]) path.push(x); path.push(s); return path; } }
Breadth-first search
The paths discovered by depth-first search depend not just on the graph, but also on the representation and the nature of the recursion. Naturally, we are often interested in solving the following problem:
Single-source shortest paths. Given a graph and a source vertex s, support queries of the form Is there a path from s to a given target vertex v? If so, find a shortest such path (one with a minimal number of edges).
The classical method for accomplishing this task, called breadth-first search (BFS ), is also the basis of numerous algorithms for processing graphs.
public class BreadthFirstPaths { private static final int INFINITY = Integer.MAX_VALUE; private boolean[] marked; // marked[v] = is there an s-v path private int[] edgeTo; // edgeTo[v] = previous edge on shortest s-v path private int[] distTo; // distTo[v] = number of edges shortest s-v path // single source public BreadthFirstPaths(Graph G, int s) { marked = new boolean[G.V()]; distTo = new int[G.V()]; edgeTo = new int[G.V()]; bfs(G, s); assert check(G, s); } // multiple sources public BreadthFirstPaths(Graph G, Iterable<Integer> sources) { marked = new boolean[G.V()]; distTo = new int[G.V()]; edgeTo = new int[G.V()]; for (int v = 0; v < G.V(); v++) distTo[v] = INFINITY; bfs(G, sources); } // BFS from single source private void bfs(Graph G, int s) { Queue<Integer> q = new Queue<Integer>(); for (int v = 0; v < G.V(); v++) distTo[v] = INFINITY; distTo[s] = 0; marked[s] = true; q.enqueue(s); while (!q.isEmpty()) { int v = q.dequeue(); for (int w : G.adj(v)) { if (!marked[w]) { edgeTo[w] = v; distTo[w] = distTo[v] + 1; marked[w] = true; q.enqueue(w); } } } } // BFS from multiple sources private void bfs(Graph G, Iterable<Integer> sources) { Queue<Integer> q = new Queue<Integer>(); for (int s : sources) { marked[s] = true; distTo[s] = 0; q.enqueue(s); } while (!q.isEmpty()) { int v = q.dequeue(); for (int w : G.adj(v)) { if (!marked[w]) { edgeTo[w] = v; distTo[w] = distTo[v] + 1; marked[w] = true; q.enqueue(w); } } } } // is there a path between s (or sources) and v? public boolean hasPathTo(int v) { return marked[v]; } // length of shortest path between s (or sources) and v public int distTo(int v) { return distTo[v]; } // shortest path between s (or sources) and v; null if no such path public Iterable<Integer> pathTo(int v) { if (!hasPathTo(v)) return null; Stack<Integer> path = new Stack<Integer>(); int x; for (x = v; distTo[x] != 0; x = edgeTo[x]) path.push(x); path.push(x); return path; } // check optimality conditions for single source private boolean check(Graph G, int s) { // check that the distance of s = 0 if (distTo[s] != 0) { StdOut.println("distance of source " + s + " to itself = " + distTo[s]); return false; } // check that for each edge v-w dist[w] <= dist[v] + 1 // provided v is reachable from s for (int v = 0; v < G.V(); v++) { for (int w : G.adj(v)) { if (hasPathTo(v) != hasPathTo(w)) { StdOut.println("edge " + v + "-" + w); StdOut.println("hasPathTo(" + v + ") = " + hasPathTo(v)); StdOut.println("hasPathTo(" + w + ") = " + hasPathTo(w)); return false; } if (hasPathTo(v) && (distTo[w] > distTo[v] + 1)) { StdOut.println("edge " + v + "-" + w); StdOut.println("distTo[" + v + "] = " + distTo[v]); StdOut.println("distTo[" + w + "] = " + distTo[w]); return false; } } } // check that v = edgeTo[w] satisfies distTo[w] + distTo[v] + 1 // provided v is reachable from s for (int w = 0; w < G.V(); w++) { if (!hasPathTo(w) || w == s) continue; int v = edgeTo[w]; if (distTo[w] != distTo[v] + 1) { StdOut.println("shortest path edge " + v + "-" + w); StdOut.println("distTo[" + v + "] = " + distTo[v]); StdOut.println("distTo[" + w + "] = " + distTo[w]); return false; } } return true; } }
For any vertex v reachable from s, BFS computes a shortest path from s to v (no path from s to v has fewer edges), also BFS takes time proportional to V-E in the worst case. Note that we can also use BFS to implement the Search API that we implemented with DFS, since the solution depends on only the ability of the search to examine every vertex and edge connected to the source.
As implied at the outset, DFS and BFS are the first of several instances that we will examine of a general approach to searching graphs. We put the source vertex on the data structure, then perform the following steps until the data structure is empty:
■ Take the next vertex v from the data structure and mark it.
■ Put onto the data structure all unmarked vertices that are adjacent to v. The algorithms differ only in the rule used to take the next vertex from the data structure (least recently added for BFS, most recently added for DFS). This difference leads to completely different views of the graph, even though all the vertices and edges connected to the source are examined no matter what rule is used.
Connected components
Our next direct application of depth-first search is to find the connected components of a graph.
public class CC { private boolean[] marked; // marked[v] = has vertex v been marked? private int[] id; // id[v] = id of connected component containing v private int[] size; // size[id] = number of vertices in given component private int count; // number of connected components public CC(Graph G) { marked = new boolean[G.V()]; id = new int[G.V()]; size = new int[G.V()]; for (int v = 0; v < G.V(); v++) { if (!marked[v]) { dfs(G, v); count++; } } } // depth first search private void dfs(Graph G, int v) { marked[v] = true; id[v] = count; size[count]++; for (int w : G.adj(v)) { if (!marked[w]) { dfs(G, w); } } } // id of connected component containing v public int id(int v) { return id[v]; } // size of connected component containing v public int size(int v) { return size[id[v]]; } // number of connected components public int count() { return count; } // are v and w in the same connected component? public boolean areConnected(int v, int w) { return id(v) == id(w); } }
How does the DFS-based solution for graph connectivity in CC compare with the union-find approach? In theory, DFS is faster than union-find because it provides a constant-time guarantee, which union-find does not; in practice, this difference is negligible, and union-find is faster because it does not have to build a full representation of the graph. More important, union-find is an online algorithm (we can check whether two vertices are connected in near-constant time at any point, even while adding edges), whereas the DFS solution must first preprocess the graph. Therefore, for example, we prefer union-find when determining connectivity is our only task or when we have a large number of queries intermixed with edge insertions but may find the DFS solution more appropriate for use in a graph ADT because it makes efficient use of existing infrastructure.
The problems that we have solved with DFS are fundamental. It is a simple approach, and recursion provides us a way to reason about the computation and develop compact solutions to graph-processing problems. Two additional examples, for solving the following problems, are given in the following.
Cycledetection. Support this query: Is a given graph acylic?
Two-colorability. Support this query:Can the vertices of a given graph be assigned one of two colors in such a way that no edge connects vertices of the same color? which is equivalent to this question: Is the graph bipartite?
public class Cycle { private boolean[] marked; private int[] edgeTo; private Stack<Integer> cycle; public Cycle(Graph G) { if (hasSelfLoop(G)) return; if (hasParallelEdges(G)) return; marked = new boolean[G.V()]; edgeTo = new int[G.V()]; for (int v = 0; v < G.V(); v++) if (!marked[v]) dfs(G, -1, v); } // does this graph have a self loop? // side effect: initialize cycle to be self loop private boolean hasSelfLoop(Graph G) { for (int v = 0; v < G.V(); v++) { for (int w : G.adj(v)) { if (v == w) { cycle = new Stack<Integer>(); cycle.push(v); cycle.push(v); return true; } } } return false; } // does this graph have two parallel edges? // side effect: initialize cycle to be two parallel edges private boolean hasParallelEdges(Graph G) { marked = new boolean[G.V()]; for (int v = 0; v < G.V(); v++) { // check for parallel edges incident to v for (int w : G.adj(v)) { if (marked[w]) { cycle = new Stack<Integer>(); cycle.push(v); cycle.push(w); cycle.push(v); return true; } marked[w] = true; } // reset so marked[v] = false for all v for (int w : G.adj(v)) { marked[w] = false; } } return false; } public boolean hasCycle() { return cycle != null; } public Iterable<Integer> cycle() { return cycle; } private void dfs(Graph G, int u, int v) { marked[v] = true; for (int w : G.adj(v)) { // short circuit if cycle already found if (cycle != null) return; if (!marked[w]) { edgeTo[w] = v; dfs(G, v, w); } // check for cycle (but disregard reverse of edge leading to v) else if (w != u) { cycle = new Stack<Integer>(); for (int x = v; x != w; x = edgeTo[x]) { cycle.push(x); } cycle.push(w); cycle.push(v); } } } }
public class TwoColor { private boolean[] marked; private boolean[] color; private boolean isTwoColorable = true; public TwoColor(Graph G) { marked = new boolean[G.V()]; color = new boolean[G.V()]; for (int s = 0; s < G.V(); s++) if (!marked[s]) dfs(G, s); } private void dfs(Graph G, int v) { marked[v] = true; for (int w : G.adj(v)) if (!marked[w]) { color[w] = !color[v]; dfs(G, w); } else if (color[w] == color[v]) isTwoColorable = false; } public boolean isBipartite() { return isTwoColorable; } }
Symbol graphs
Typical applications involve processing graphs defined in files or on web pages, using strings, not integer indices, to define and refer to vertices. To accommodate such applications, we define an input format with the following properties:
■ Vertex names are strings.
■ A specified delimiter separates vertex names (to allow for the possibility of spaces in names).
■ Each line represents a set of edges, connecting the first vertex name on the line to each of the other vertices named on the line.
■ The number of vertices V and the number of edges E are both implicitly defined.
public class SymbolGraph { private ST<String, Integer> st; // string -> index private String[] keys; // index -> string private Graph G; public SymbolGraph(String filename, String delimiter) { st = new ST<String, Integer>(); // First pass builds the index by reading strings to associate // distinct strings with an index In in = new In(filename); while (in.hasNextLine()) { String[] a = in.readLine().split(delimiter); for (int i = 0; i < a.length; i++) { if (!st.contains(a[i])) st.put(a[i], st.size()); } } // inverted index to get string keys in an array keys = new String[st.size()]; for (String name : st.keys()) { keys[st.get(name)] = name; } // second pass builds the graph by connecting first vertex on each // line to all others G = new Graph(st.size()); in = new In(filename); while (in.hasNextLine()) { String[] a = in.readLine().split(delimiter); int v = st.get(a[0]); for (int i = 1; i < a.length; i++) { int w = st.get(a[i]); G.addEdge(v, w); } } } public boolean contains(String s) { return st.contains(s); } public int index(String s) { return st.get(s); } public String name(int v) { return keys[v]; } public Graph G() { return G; } }
The above implementation builds three data structures:
■ A symbol table st with String keys (vertex names) and int values (indices)
■ An array keys[] that serves as an inverted index, giving the vertex name associated with each integer index
■ A Graph G built using the indices to refer to vertices
SymbolGraph uses two passes through the data to build these data structures, primarily because the number of vertices V is needed to build the Graph. In typical real world applications, keeping the value of V and E in the graph definition file (as in our Graph constructor at the beginning of this section) is somewhat inconvenient—with SymbolGraph, we can maintain files such as routes.txt or movies.txt by adding or deleting entries without regard to the number of different names involved.
相关推荐
无向图(Undirected Graphs)是一种基本的数据结构,在计算机科学和离散数学中占有重要地位。图由一组顶点(vertices)和连接这些顶点的边(edges)组成。在无向图中,边没有方向性,即如果一条边连接了顶点A和顶点B...
无向图是由顶点(节点)和连接这些顶点的无向边组成的图形数据结构。与有向图不同,无向图中的边不指明方向。在计算机科学和数学中,无向图被广泛用于表示各种结构,例如社交网络、分子结构或任何其他需要通过图形来...
6. **图的矩阵表示**:邻接矩阵和邻接表是图的两种常见数据结构,MATLAB中可以方便地处理这两种结构,进行图的构建和操作。 7. **图的剪枝和生成**:如生成树的剪枝操作,以及Euler图和Hamilton图的生成。 8. **...
根据给定文件的信息,我们可以将主要内容分为以下几个部分:数学与公式、大数处理与字符读入、数论...这些知识点涵盖了从数学公式到复杂数据结构和算法的广泛领域,对于深入理解计算机科学中的算法设计和分析至关重要。
### 《算法导论》完整课件下载:深入解析图算法与贪心算法 #### 图的概念及表示方法 在《算法导论》这门课程中,图...对于学习计算机科学与算法的学生来说,这些内容至关重要,有助于理解更复杂的算法和数据结构。
在计算机科学中,图是一种非常重要的数据结构,用于表示对象之间的关系。本主题将深入探讨两种基本类型的图:有向图(Directed Graph)和无向图(Undirected Graph),并结合Python语言介绍它们的实现方法。 有向图...
在计算机科学领域,图是一种非常重要的数据结构,被广泛应用于各种实际问题的建模与解决。 #### 二、图的基本定义 ##### 1.1 简单图 简单图(simple graph)是由两个集合组成的有序对\( (V, E) \),其中: - \( V \...
NetworkX库提供了多种基础的图(Graphs)类型,包括无向图(Undirected graphs)、有向图(Directed graphs)、多重图(Multigraphs)以及多重有向图(Multidirected graphs)等。用户可以根据需要选择最适合的图...
它提供了数据结构来表示各种类型的图(graph)以及用于图分析的标准算法。NetworkX 的设计目的是为了促进可读性、清晰度以及模块化。 #### 二、NetworkX用户群体 1. **科研人员**:在社会学、生物学、物理学等领域的...
在IT领域,图(Graphs)是一种非常重要的数据结构,广泛应用于各种算法设计和问题解决。标题中的"graphs"指的是这个主题,而描述虽然简洁,但暗示我们将探讨与图相关的概念和技术。标签“C”表明我们将从C语言的角度...
在IT领域,图和树是数据结构与算法中极为重要的概念,它们广泛应用于网络设计、数据库索引、操作系统、人工智能等多个方面。在这个“Graphs-and-Trees”项目中,我们很可能会探讨C++语言实现图和树的相关知识。C++是...
宠物图(PetGraph)是Rust编程语言中的一个开源库,专门用于处理和操作图形数据结构。这个库为开发者提供了创建、操作...在实际项目中,这个库不仅简化了图形数据结构的实现,还为高效、可靠的图算法提供了坚实的基础。
grapho 是图论数据结构和算法的 Go 实现。 完整的文档可以在找到 图表 有向图和无向图都是使用Graph结构创建的: graph := grapho.NewGraph(false) // true for directed Graphs, false for undirected Nodes ...
加权无向图和最小生成树是图论中的核心概念,它们在计算机科学,特别是算法设计和数据结构中占有重要地位。在这个加州大学伯克利分校的CS-61B项目中,学生将深入理解并实践这两个概念。下面将详细阐述它们的原理以及...
总结,NetworkX是Python中强大的图论算法实现工具,提供了丰富的数据结构和算法,能够方便地处理有向图、多图以及进行各种图操作和可视化。通过深入理解和熟练使用这些功能,我们可以解决各种复杂的问题,如社交网络...
在计算机科学中,离散数学是分析和表达算法复杂性、数据结构、计算机程序设计语言和计算理论的基础。 4. 组合数学 组合数学主要研究离散对象的组合性质,例如,组合计数(如排列和组合)、生成函数、递归关系、图的...
- **PQ树(PQ-Trees)**:一种数据结构,用于存储和操作间隔图中的区间信息。 - **使用PQ树识别间隔图(Recognizing Interval Graphs with PQ-Trees)**:详细解释了如何利用PQ树来判断一个图是否为间隔图。 **其他...
1. **树(Tree)**:无环连通图称为树,是图论中的基础结构,有着广泛应用,如网络设计、数据结构等。书中可能会讨论生成树、最小生成树(如Prim算法和Kruskal算法)、树的遍历等。 2. **哈密顿回路(Hamiltonian ...
GraphLibrary 是一个基于 C# 编程语言的图形库,专为处理图数据结构和算法而设计。在软件工程、计算机科学以及许多其他领域,图是表示对象间关系的常见工具,例如网络、社交关系、计算机硬件架构等。GraphLibrary ...