The use of Geographic Information Systems has increased considerably since the eighties and nineties. One of their most demanding applications is shortest path search. Several studies on shortest path search show the feasibility of using graphs for this purpose. Dijkstra’s algorithm is one of the classic shortest path search algorithms, but it is not well suited to shortest path search in large graphs. For this reason, several authors have proposed modifications to Dijkstra’s algorithm that use heuristics to reduce the run time of shortest path search. One of the most widely used heuristic algorithms is the A* algorithm, whose main goal is to reduce the run time by reducing the search space. This article proposes a modification of Dijkstra’s shortest path search algorithm for reduced graphs. It shows that the cost of the path found by the proposed algorithm is equal to the cost of the path found using Dijkstra’s algorithm in the original graph. The results of finding the shortest path with the proposed algorithm, Dijkstra’s algorithm and the A* algorithm are compared. This comparison shows that, with the proposed approach, it is possible to obtain the optimal path in a time similar to, or even lower than, that of heuristic algorithms.
Introduction
From a practical point of view, a Geographic Information System (GIS) is a computer system capable of handling georeferenced data, that is, information associated with geographic coordinates (longitude, latitude). A GIS should also facilitate the relationship between socio-economic data (e.g. population density) and geographic data. This can be achieved through the generation of thematic maps (Jiang et al. 2010); a service for generating this kind of map is described by Rodríguez-Torres and Rodríguez-Puente (2010). The relevance of a GIS is closely related to its ability to build models or representations of the real world. This kind of system is very important because it facilitates the decision-making process and has a high social impact. Among the most demanded features in GIS are those related to the analysis of routes. Some examples are as follows:
What is the shortest path between places x and y?
What is the optimal path between places x and y considering a certain criterion?
What is the lowest cost path between x and y via places x1, x2, …, xn?
Shortest path search has been widely studied, and many applications can be found in various branches of science, specifically in GIS. The road networks used by GIS to respond to the above requests are usually large and may have thousands of streets, so particular attention must be paid to how such information is processed.
One of the classic and most used algorithms for calculating the shortest path from an origin to a destination is Dijkstra’s algorithm. It was first stated by Edsger Wybe Dijkstra (1959) and is one of the most used and discussed algorithms in the graph literature. Its temporal complexity is O(|E| + |V| log|V|), where |E| is the number of edges and |V| is the number of vertices of the graph. However, this algorithm is not efficient for shortest path search in large graphs (Fuhao and Jiping 2009).
Various modifications to Dijkstra’s algorithm have been proposed by several authors. Some of these algorithms use heuristics to reduce the run time of shortest path search and we can classify them as follows:
1. Without data preprocessing, e.g.:
A* (A-star) algorithm (Hart et al. 1968). Improved Lifelong Planning A* (Huang et al. 2007).
Precomputed Cluster Distances (PCD) (Maue et al. 2010).
Delling et al. (2009) show an overview of routing algorithms; all approaches show important advances in shortest path search and make possible a low response time in large graphs using heuristics.
One of the most widely used heuristic algorithms is the A* algorithm. Its main goal is to reduce the run time by reducing the search space, analyzing only the vertices that are most likely to appear in the shortest path. The results obtained by this algorithm depend on the heuristic function used to determine the order in which vertices are visited. If the selected heuristic is optimal, the computational complexity is reduced to O(n). That is why the A* algorithm is widely used for shortest path search.
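As an illustration of how a heuristic orders the visits, the following is a minimal A* sketch in plain Python. The adjacency-dictionary layout, the `coords` table and the straight-line (Euclidean) heuristic are assumptions made for this example; they are not taken from the paper, and the sketch is only valid when every edge cost is at least the straight-line distance between its endpoints.

```python
import heapq
import math


def a_star(adj, coords, source, target):
    """Return (cost, path) from source to target.

    adj:    {u: {v: cost}} adjacency dictionary (hypothetical layout).
    coords: {u: (x, y)} planar coordinates used by the heuristic.
    """
    def h(u):
        # Straight-line distance to the target (assumed heuristic).
        (x1, y1), (x2, y2) = coords[u], coords[target]
        return math.hypot(x1 - x2, y1 - y2)

    dist = {source: 0.0}
    pred = {source: None}
    queue = [(h(source), source)]   # priority = distance so far + heuristic
    closed = set()
    while queue:
        _, u = heapq.heappop(queue)
        if u in closed:
            continue
        if u == target:
            path = []
            while u is not None:
                path.append(u)
                u = pred[u]
            return dist[target], path[::-1]
        closed.add(u)
        for v, cost in adj[u].items():
            new_dist = dist[u] + cost
            if new_dist < dist.get(v, math.inf):
                dist[v] = new_dist
                pred[v] = u
                heapq.heappush(queue, (new_dist + h(v), v))
    return math.inf, []


# Small illustrative network (made-up coordinates and costs).
coords = {"a": (0, 0), "b": (1, 0), "c": (2, 0), "d": (1, 1)}
adj = {
    "a": {"b": 1.0, "d": 2.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 2.0},
    "d": {"a": 2.0, "c": 2.0},
}
result = a_star(adj, coords, "a", "c")
```

In this example vertex `d` is never expanded, because its priority (distance so far plus heuristic) is larger than that of the vertices on the direct route.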
One approach studied for shortest path search on large graphs is related to the use of some properties of the road networks, mainly to reduce the search space of the shortest path.
In the following paragraphs we refer to some relevant research:
Gutman proposes an approach (Gutman 2004) in which he defines a formal attribute of vertex called reach, in order to measure vertex relevance. The reach attribute is precalculated using the graph to reduce the run time of shortest path search.
A relevant approach that uses a property of a road network is related to the hierarchy present in this kind of network. Many strategies use this approach, for example, Sanders and Schultes propose algorithms for constructing and querying highway hierarchies achieving a small run time and show the feasibility of this approach (Sanders and Schultes 2005).
Bast et al. define an approach based on relevant nodes (transit nodes) for long-distance travel (Bast et al. 2007). It consists of making precalculations of shortest path between all pairs of transit nodes and from each potential source or destination to its access transit nodes. This approach needs an effective notion of “far away” and the optimal results are guaranteed depending on the local filter selected.
Gonzalez et al. use the hierarchy of roads for partitioning the network into areas and make precalculations of shortest path in these areas (Gonzalez et al. 2007). This approach uses the fact that some roads are more traveled than others and drivers usually use the largest roads.
Geisberger et al. propose an approach that uses only edges that are related with “important” nodes (Geisberger et al. 2008). Pfoser et al. present a shortest path algorithm that imitates human driving behavior by exploiting road network hierarchies (Pfoser et al. 2009).
An important characteristic of the approaches described above is that they are based on the idea that, for the calculation of long paths (in large networks), only the high-level roads (highways, most-traveled roads, etc.) of the hierarchical road network are needed. This consideration can reduce the run time of shortest path search algorithms, but it cannot guarantee that the optimal path is returned.
Various commercial systems use heuristic algorithms with the aim of reducing the run time (Bast et al. 2007). Various authors have defined heuristics for achieving this goal (Fei et al. 2010; Liu and Yang 2009; Nazari et al. 2008; Sun et al. 2008; Xu 2005). Fu et al. show a review of this kind of algorithms for shortest path search in transportation applications (Fu et al. 2006).
Heuristic algorithms are relevant for shortest path search in large graphs. The error they introduce is acceptable in most situations, but they do not guarantee that the optimal path is obtained in all cases.
On the other hand, there are algorithms for reducing a graph (Liu et al. 2010; Lu and Liu 2007; Sadiq and Orlowska 2000). Applying any algorithm to the reduced graph naturally gives a lower response time. However, in this case, the reduction of data brings a loss of information. Thus, it cannot be guaranteed that the path obtained is optimal in the original graph.
Rodríguez-Puente proposes a graph reduction algorithm without loss of information (Rodríguez-Puente 2010). It specifies a mechanism to recover the original graph from which the reduced graph was obtained. This algorithm can be applied naturally to a GIS because a map is usually divided into zip codes, states, regions, etc. This fragmentation of the map contributes to creating a partition that meets the algorithm requirements. The algorithm has a computational complexity of O(n⁴), which is a high cost for a response in a real-time environment. However, in the proposed approach each graph is reduced only once, and the reduction algorithm is executed only as data preprocessing, so it does not affect the run time of shortest path search.
This article presents a modification of Dijkstra’s shortest path search algorithm. It shows that it is possible to obtain the lowest cost path in all cases in a time similar to that of the A* algorithm. Thus, applying this algorithm in GIS can improve the services provided by this kind of system. Using the proposed algorithm together with the mentioned reduction algorithm ensures efficiency in shortest path search while maintaining accuracy.
The paper is organized as follows: first, a brief description of the graph reduction algorithm is provided. Second, the algorithm for finding shortest paths in reduced graphs is presented. Then, correctness of the algorithm is proved. Finally, some experimental results and conclusions are discussed.
Graph reduction
In order to achieve a better understanding of the proposal, certain definitions and notations related with graph theory must be introduced. Then, the selected graph reduction algorithm, used in the proposed approach, is presented.
Definitions and notations
Relevant definitions and notations related to the proposed approach are as follows:
Definition 1
A graph is a pair G = (V, E), where:
V is a set of vertices.
E is a set of edges. An edge is an unordered pair of vertices (vi, vj) such that vi, vj∈V.
Definition 2
A weighted graph is defined as a structure G = (V, E, fc), where:
V is a set of vertices.
E is a set of edges.
The function fc: E → R+ assigns to each edge a positive real value called cost.
Definition 3
A graph rewrite rule R = (Gi, Gj, ψin, ψout) over a graph G = (V, E, fc) consists of:
a graph Gi = ({vi}, ∅), where vi ∈ V.
a graph Gj = (Vj, Ej).
two sets of embedding information ψin, ψout of the form {(vm, c1, c2, vn)}, where vm ∈ Vj, vn ∈ V − {vi} and c1, c2 are edge costs. In the case of ψin, ∃(vn, vi) ∈ E such that fc(vn, vi) = c1. After applying the rewrite rule, a new graph H = (V1, E1, fc1) is obtained and it holds that ∃(vn, vm) ∈ E1 such that fc1(vn, vm) = c2. ψout is defined analogously to ψin, with the orientation of the edges as the only difference.
V1 = (V − {vi}) ∪ Vj and E1 = (E − Et) ∪ Ej ∪ Ek, where (vt1, vt2) ∈ Et if and only if (vt1 = vi and vt2 ∈ V) or (vt1 ∈ V and vt2 = vi), and (vm, vn) ∈ Ek if and only if (vm, c1, c2, vn) ∈ (ψin ∪ ψout).
A graph rewrite rule can also be defined over an undirected graph. In this case, the sets ψin and ψout are represented as a single set called ψ.
The edges that join vertex vi with the vertices of the graph G − Gi are called pre-embedding edges. After applying a rewrite rule, the edges that join a vertex of the graph Gj with a vertex of the graph G − Gj are called post-embedding edges. The function ψin transforms the set of pre-embedding edges that are incident to vertex vi into post-embedding edges that are incident to one or more vertices vj ∈ Vj. Similarly, the function ψout transforms pre-embedding edges outgoing from vertex vi into one or more post-embedding edges outgoing from vertices vj ∈ Vj.
Definition 4
A reduced graph is a tuple Gr = (Vr, Er, f, R), where:
Vr is a set of vertices.
Er is a set of edges.
f: Vr × Vr × Vr → R+ ∪ {0, ∞} is a function that, for each (vi, vj, vk), returns the cost of going from vi to vk through vj, with vk adjacent to vj and vj adjacent to vi. Function f is also defined for the cases where vi = vj and/or vj = vk. In the trivial case, f(v, v, v) = 0.
R is a set of rewrite rules over (Vr, Er, fc), where fc is defined as fc(v, w) = f(v, v, w).
This definition is particularly important when it is associated with another graph, i.e., when a graph is reduced from another graph. We say that a graph Gr = (Vr, Er, f, R) is reduced from a graph G = (V, E, fc) if the graph G is obtained by applying the set of rewrite rules R to the graph Gr.
In the case of function f, for every 3-tuple of vertices vi, vj, vk ∈ Vr it holds that f(vi, vj, vk) = fc(vi, vj) + fc(vj, vk). Notice that f(vi, vi, vj) = fc(vi, vj). If vi and vj are not adjacent, the value of both functions is infinite. This is how we specify that two vertices are not adjacent.
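The relation between f and fc stated above can be sketched as follows. The helper name `make_f` and the dictionary representation of fc are assumptions of this example, not part of the original formulation; the sketch only covers the case in which f is derived directly from edge costs.

```python
import math


def make_f(fc):
    """Build f(vi, vj, vk) = fc(vi, vj) + fc(vj, vk) from an edge-cost table.

    fc: {(u, v): cost} with positive costs (hypothetical representation).
    A missing pair is treated as non-adjacent and costs infinity.
    """
    def edge_cost(u, v):
        if u == v:
            return 0.0          # staying on the same vertex costs nothing
        return fc.get((u, v), math.inf)

    def f(vi, vj, vk):
        return edge_cost(vi, vj) + edge_cost(vj, vk)

    return f


fc = {("a", "b"): 2.0, ("b", "c"): 3.0}
f = make_f(fc)
```

With this table, `f("a", "b", "c")` adds the two edge costs, `f("a", "a", "b")` reduces to the cost of the edge (a, b), and any triple involving a non-adjacent pair evaluates to infinity.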
Graph reduction algorithm
The reduction algorithm enunciated in (Rodríguez-Puente 2010) has as a key characteristic that it guarantees no loss of information through the incorporation of rewrite rules. However, an improved version is presented here, since it is necessary to differentiate between what are defined as internal and external vertices below.
This algorithm takes two inputs: a reduced graph G = (V, E, f, R) and a partition P over the set of vertices of the graph. Its output is a reduced graph.
First, it is necessary to refine partition P so that the optimal path obtained has the same cost as the optimal path obtained by Dijkstra’s algorithm in the original graph. To do this, we introduce the following definition:
Definition 5
Let G = (V, E) be a graph and P a partition on V. A vertex vi ∈ V is internal if, for every vj ∈ V such that vi and vj are adjacent, vi and vj are in the same class of P, i.e. [vi] = [vj]; otherwise vi is external.
For refining P, we use the following strategy:
Two vertices vi and vj are in the same class of refined partition if, and only if:
vi and vj are in the same class in the original partition P.
vi and vj are internal vertices.
For each external vertex a new equivalence class is created as a singleton containing only this vertex.
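The refinement strategy can be sketched in a few lines of Python. The function name `refine_partition` and the input layout (an adjacency dictionary plus a vertex-to-class map) are assumptions made for this illustration.

```python
def refine_partition(adj, class_of):
    """Refine a partition using the internal/external distinction.

    adj:      {v: set of neighbours} (hypothetical layout).
    class_of: {v: label of the class of v in the original partition}.
    Returns a list of vertex sets (the refined classes).
    """
    def is_internal(v):
        # Internal: every neighbour lies in the same original class.
        return all(class_of[u] == class_of[v] for u in adj[v])

    internal_groups = {}
    refined = []
    for v in adj:
        if is_internal(v):
            # Internal vertices of the same original class stay together.
            internal_groups.setdefault(class_of[v], set()).add(v)
        else:
            # Each external vertex becomes a singleton class.
            refined.append({v})
    refined.extend(internal_groups.values())
    return refined


# Path 1-2-3-4-5-6 split into two classes A = {1, 2, 3} and B = {4, 5, 6}.
adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
class_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
refined = refine_partition(adj, class_of)
```

Here vertices 3 and 4 are external because the edge (3, 4) crosses the class boundary, so each ends up alone, while {1, 2} and {5, 6} remain grouped.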
In Figure 1, we show an example of how to refine a partition using definitions of internal and external vertex.
Next, we create a new vertex wi for each Ai ∈ P with |Ai| > 1, i = 1..s. V′ = {wi} is the set of reduced vertices and V − V′ is the set of unreduced vertices in the reduced graph.
We add a vertex in the reduced graph for each class of the partition calculated in the previous step. If the cardinality of the class is 1, the vertex is considered as an unreduced vertex; in any other case, it is considered as a reduced one (GetReducedVertices method). Next, a set of edges is calculated. One edge can be added to the reduced graph if the two vertices of the edge belong to different equivalence classes (GetEdges method). With the addition of edges to the reduced graph, the cost function fr of the reduced graph must be updated.
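A possible reading of the GetReducedVertices and GetEdges steps is sketched below. The function name, the generated vertex names (`w0`, `w1`, ...) and the edge-dictionary layout are hypothetical; the sketch does not update the cost function fr, which is handled separately.

```python
def build_reduced_graph(edges, refined):
    """Create reduced vertices and the edges that cross class boundaries.

    edges:   {(u, v): cost} edges of the original graph (assumed layout).
    refined: list of vertex sets, the classes of the refined partition.
    Returns (rep, reduced_vertices, reduced_edges), where rep maps every
    original vertex to its representative in the reduced graph.
    """
    rep = {}
    reduced_vertices = set()
    for index, cls in enumerate(refined):
        if len(cls) == 1:
            (v,) = cls
            rep[v] = v                  # singleton class: unreduced vertex
        else:
            name = f"w{index}"          # hypothetical name for the new vertex
            reduced_vertices.add(name)
            for v in cls:
                rep[v] = name
    # Keep an edge only if its endpoints belong to different classes.
    reduced_edges = {
        (rep[u], rep[v]) for (u, v) in edges if rep[u] != rep[v]
    }
    return rep, reduced_vertices, reduced_edges


edges = {(1, 2): 1.0, (2, 3): 1.0, (3, 4): 1.0, (4, 5): 1.0, (5, 6): 1.0}
refined = [{1, 2}, {3}, {4}, {5, 6}]
rep, reduced_vertices, reduced_edges = build_reduced_graph(edges, refined)
```

In this example the six-vertex path collapses to four vertices, and the edges inside {1, 2} and {5, 6} disappear from the reduced graph.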
The creation of the set of rewrite rules is an essential step in the reduction algorithm. With the rewrite rules, the original graph can be obtained from the reduced graph. Therefore, rewrite rules guarantee no loss of information, and so the reduction process is reversible.
According to Definition 3, a graph rewrite rule is a quadruple of the form (Gi, Gj, ψin, ψout). Then, we create a rewrite rule for each reduced vertex in V′, where:
Gi = ({wi}, ∅), wi ∈ V′.
Gj = (Ai, Ei, fcj) is a subgraph of G = (V, E, fc, R) in which there is an edge (u, v) ∈ Ei if and only if (u, v) ∈ E and u, v ∈ Ai; in addition, fcj(u, v) = fc(u, v).
ψin is a set of quadruples of the form (vm, c1, c2, vn) such that for vm ∈ Ai and vn ∈ (V − Ai) and (vn, vm) ∈ E and (vn, vi) ∈ Er it holds that c1 = fcj(vn, vm) and c2 = fcj(vn, vi).
ψout is a set of quadruples of the form (vm, c1, c2, vn) such that for vm ∈ Ai and vn ∈ (V − Ai) and (vm, vn) ∈ E and (vi, vn) ∈ Er it holds that c1 = fcj(vm, vn) and c2 = fcj(vi, vn).
The previous explanation corresponds to the implementation of GetRewriteRules method.
Another step that contributes to obtaining the optimal path is the calculation of function fr. Function fr stores the cost of the shortest path from one vertex to another, traversing a reduced one.
Function fr is calculated initially (Updatefr method) for each reduced vertex. This step proceeds as follows:
Create an auxiliary graph. First, this graph is equal to the graph Gj = (Vj, Ej, fcj) of the rewrite rule. Second, we add to this graph the vertices that are adjacent (in the original graph) to vertices of graph Gj (notice that these vertices are internal taking into account the original graph and the set Vj), and the edges that connect them.
We apply the MDijkstra algorithm (see next section) using all pairs of related vertices, identified in the previous step, as origin and destination vertices.
The obtained costs and path are stored in fr.
Additionally, for all 3-tuples of vertices vi, vj, vk∈V, where vj is a non-reduced vertex, fr(vi, vj, vk) = f(vi, vj, vk).
The path from vi to vk is also stored, to avoid additional run time when the shortest path found in the reduced graph is retrieved.
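The Updatefr step for a single reduced vertex can be sketched as follows. Several assumptions are made here: the function names and data layout are hypothetical, plain Dijkstra is used on the auxiliary graph in place of MDijkstra, the auxiliary graph keeps exactly the edges with at least one endpoint in the class, and the edge costs in the example are made up.

```python
import heapq
import math


def dijkstra(adj, source):
    """Plain Dijkstra returning distance and predecessor dictionaries."""
    dist, pred = {source: 0.0}, {source: None}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, cost in adj.get(u, {}).items():
            if d + cost < dist.get(v, math.inf):
                dist[v], pred[v] = d + cost, u
                heapq.heappush(queue, (d + cost, v))
    return dist, pred


def update_fr(adj, members, w, fr):
    """Store in fr the cost and path between neighbours of a reduced vertex.

    adj:     {u: {v: cost}} original graph (assumed layout).
    members: set of original vertices represented by the reduced vertex w.
    fr:      {(s, w, t): (cost, path)} table filled in place.
    """
    aux, boundary = {}, set()
    for u, nbrs in adj.items():
        for v, cost in nbrs.items():
            if u in members or v in members:
                aux.setdefault(u, {})[v] = cost
                boundary.update(x for x in (u, v) if x not in members)
    for s in boundary:
        dist, pred = dijkstra(aux, s)
        for t in boundary:
            if t == s or t not in dist:
                continue
            path, node = [], t
            while node is not None:
                path.append(node)
                node = pred[node]
            fr[(s, w, t)] = (dist[t], path[::-1])
    return fr


# Illustrative undirected graph; costs are invented for the example.
edge_list = [("v1", "v2", 3.0), ("v2", "v3", 2.0), ("v5", "v1", 1.0),
             ("v2", "v4", 2.0), ("v3", "v6", 3.0), ("v5", "v7", 5.0)]
adj = {}
for u, v, c in edge_list:
    adj.setdefault(u, {})[v] = c
    adj.setdefault(v, {})[u] = c
fr = update_fr(adj, {"v1", "v2", "v3"}, "vr1", {})
```

Storing the path next to the cost mirrors the remark above: the expanded route through the reduced vertex can be returned later without another search.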
Algorithm 1 provides the detailed pseudo-code of the graph reduction algorithm.
Algorithm 1 GraphReduction
The complexity of the reduction algorithm is determined by steps 6-8. According to the above description of Updatefr, this method calculates the shortest path from all external vertices (taking into account the original graph) of Vj to all vertices of the auxiliary graph.
In a graph obtained from a network in a map, a vertex represents the intersection of two or more lines and an edge represents the connection between two intersections. Consequently, in this kind of graph no two edges intersect. Thus, we can assume that graphs representing a network modeled through a map are planar.
Moreover, in a graph with these characteristics, the degree of a vertex is generally equal to 4, except in a few cases. Thus it is assumed, without loss of generality, that the degree of a graph that represents a network of this type is less than or equal to 10. Let Δ(G+) be the degree of G; the auxiliary graph has, at most, a · Δ(G+) vertices. In the Updatefr method, the MDijkstra algorithm is called for each vertex adjacent to any vertex of Vj (see the Shortest path search algorithm section for the temporal complexity of this algorithm), so the temporal complexity, in the worst case, is:
O(a · Δ(G+) · a · Δ(G+) · log(a · Δ(G+))) = O(Δ(G+)² · a² · (log(a) + log(Δ(G+))))
The terms involving Δ(G+) are constant, so the temporal complexity is O(a² · log(a)).
In conclusion, the temporal complexity of Algorithm 1 is of polynomial order. The reduction process is carried out only once, as data preprocessing. This preprocessing task increases the spatial complexity but, with this approach, we obtain a lower run time in every shortest path computation over the reduced graph.
Reduction example
In this section we explain a very simple example to show the reduction process.
First, we create the reduced vertices, one for each equivalence class of P. Thus, after this step, Gr = ({vr1, v4, v6, v5, v7}, {}, {}, {}). Notice that Vr = {vr1, v4, v6, v5, v7}, Er = {}, fr = {}, Rr = {}.
Then, we need to calculate the edges of Gr as specified in the description of Algorithm 1. If there is an edge between two vertices of G, and these vertices are unreduced in Gr, this edge is added to the reduced graph; for example, the edge (v5, v7) in G is added to Gr. Additionally, if there is a vertex v∈Pi in a class of P(v∈Vr), and there exists an edge from v to another vertex u of G (u is an unreduced vertex in Gr), the edge from the reduced vertex that represents the class Pi of P to the vertex u is added to Gr; for example, the edge (v2, v4) in G is added to Gr as the edge (vr1, v4), since v2 is in the class of P represented by vr1.
Therefore, the graph of Figure 2(b) is obtained. In addition, the rewrite rules are created. The graph Gi of the rewrite rule is Gi = ({vr1}, {}) (see left of Figure 3). The graph Gj is created with the vertices of the class Pi, represented by vri, and the edges among them in G, as presented on the right side of Figure 3. Once graphs Gi and Gj have been created, the embedding information (ψin and ψout) must be specified, as described in the specification of the reduction algorithm.
Figure 3
Rewrite rule example. On the left side is the graph Gi = ({vr1}, {}), on the right side is the graph Gj = ({v1, v2, v3}, {(v1, v2), (v2, v1), (v2, v3)}) and on the bottom is the embedding information ψout.
Finally, the function fr is calculated. In the example of the reduced graph of Figure 2(b), we need to store the path from v5 to v4 and the path from v5 to v6, both through vr1. In this case, fr(v5, vr1, v4) = 6 and fr(v5, vr1, v6) = 9.
Applying the rewrite rules obtained (Figure 3) to Gr (Figure 2(b)) allows us to recover the original graph G (Figure 2(a)). For this purpose, we state Algorithm 2, based on Definition 3.
This algorithm has as input a reduced graph and a rewrite rule. If a reduced graph has more than one reduced vertex, the application of this algorithm for each reduced vertex would be sufficient to obtain the original graph.
Algorithm 2 Graph Rewrite Rule Application
Next, we show an example of the application of the rewrite rule of Figure 3, using Algorithm 2:
Add to Gr (Figure 2(b)) the graph Gj of the rewrite rule (Gj is the right side graph of the rewrite rule).
The pre-embedding edge (vr1, v5) of cost 1 is transformed into the post-embedding edge (v1, v5) of cost 1.
The pre-embedding edge (vr1, v4) of cost 2 is transformed into the post-embedding edge (v2, v4) of cost 2.
The pre-embedding edge (vr1, v6) of cost 3 is transformed into the post-embedding edge (v3, v6) of cost 3.
The pre-embedding edge (vr1, v5) of cost 1 is transformed into the post-embedding edge (v3, v5) of cost 4.
The vertex vr1 is eliminated from Gr.
After applying the rewrite rule we have obtained the graph G (Figure 2(a)). Thus, there is no loss of information in the reduction process; that is, the reduction is reversible.
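The steps of this example can be reproduced with a short sketch of a rewrite rule application. The function name and the dictionary layout of the rule are assumptions made for illustration. The embedding tuples follow the example above, while the costs of the edges inside Gj and of the edge (v5, v7) are invented, since they are not given in the text.

```python
def apply_rewrite_rule(vertices, edges, rule):
    """Expand one reduced vertex using a rule (Gi, Gj, psi_in, psi_out).

    vertices: set of vertices of the reduced graph.
    edges:    {(u, v): cost} directed edges of the reduced graph.
    rule:     dict with keys 'w', 'gj_vertices', 'gj_edges',
              'psi_in' and 'psi_out' (hypothetical layout); embedding
              tuples have the form (vm, c1, c2, vn).
    """
    w = rule["w"]
    new_vertices = (vertices - {w}) | rule["gj_vertices"]
    # Remove every pre-embedding edge, i.e. every edge incident to w.
    new_edges = {e: c for e, c in edges.items() if w not in e}
    new_edges.update(rule["gj_edges"])
    for vm, c1, c2, vn in rule["psi_in"]:
        if edges.get((vn, w)) == c1:      # incoming edge (vn, w) of cost c1
            new_edges[(vn, vm)] = c2
    for vm, c1, c2, vn in rule["psi_out"]:
        if edges.get((w, vn)) == c1:      # outgoing edge (w, vn) of cost c1
            new_edges[(vm, vn)] = c2
    return new_vertices, new_edges


vertices = {"vr1", "v4", "v5", "v6", "v7"}
edges = {("vr1", "v5"): 1, ("vr1", "v4"): 2, ("vr1", "v6"): 3,
         ("v5", "v7"): 5}                 # cost of (v5, v7) is invented
rule = {
    "w": "vr1",
    "gj_vertices": {"v1", "v2", "v3"},
    # Costs inside Gj are invented for the example.
    "gj_edges": {("v1", "v2"): 3, ("v2", "v1"): 3, ("v2", "v3"): 2},
    "psi_in": [],
    "psi_out": [("v1", 1, 1, "v5"), ("v2", 2, 2, "v4"),
                ("v3", 3, 3, "v6"), ("v3", 1, 4, "v5")],
}
new_vertices, new_edges = apply_rewrite_rule(vertices, edges, rule)
```

After the call the reduced vertex is gone, the three vertices of Gj are present, and the four post-embedding edges carry the costs listed in the example.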
Shortest path search algorithm
In this section, a modification of Dijkstra’s shortest path search algorithm is shown. The goal of the proposal is to obtain an optimal path with the same cost as the path returned by Dijkstra’s algorithm, for the same origin and destination, but using a reduced graph.
Both Dijkstra’s algorithm and the proposed one are based on iterations over the set of vertices. At each iteration, the algorithm finds a vertex whose distance from the origin vertex is minimal. This vertex is called the pivot. Usually, the vertices are stored in a priority queue that uses the distance from the origin vertex as priority. This data structure facilitates the selection of the pivot. In addition, two vectors are updated during the execution of the algorithm. One of them (vector D) is updated with the lowest distance from the origin vertex to each vertex vi (we refer to this distance as D[vi]). The other one (vector Pr) is updated with the predecessor of each vertex in the shortest path from the origin vertex.
Every time that a pivot wn is selected, the distances to its adjacent vertices are updated. If the distance from the origin vertex to the pivot (D[wn]) plus the distance from the pivot to vertex vi is lower than the distance from the origin vertex to vi (D[vi]), D[vi] is updated.
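The pivot selection and the update of vectors D and Pr described above can be sketched for the classical algorithm as follows. The adjacency-dictionary layout is an assumption of this example; every vertex must appear as a key.

```python
import heapq
import math


def dijkstra(adj, origin):
    """Classical Dijkstra with a priority queue.

    adj: {u: {v: cost}} with every vertex present as a key (assumed layout).
    Returns (D, Pr): lowest distances and predecessors from the origin.
    """
    D = {v: math.inf for v in adj}
    Pr = {v: None for v in adj}
    D[origin] = 0.0
    queue = [(0.0, origin)]
    visited = set()
    while queue:
        _, pivot = heapq.heappop(queue)   # closest vertex not yet processed
        if pivot in visited:
            continue
        visited.add(pivot)
        for v, cost in adj[pivot].items():
            # Update D[v] if going through the pivot is cheaper.
            if D[pivot] + cost < D[v]:
                D[v] = D[pivot] + cost
                Pr[v] = pivot
                heapq.heappush(queue, (D[v], v))
    return D, Pr


adj = {
    "a": {"b": 4.0, "c": 1.0},
    "b": {"d": 1.0},
    "c": {"b": 2.0, "d": 5.0},
    "d": {},
}
D, Pr = dijkstra(adj, "a")
```

In the example, the distance to `b` is first set to 4 through the direct edge and later lowered to 3 when `c` becomes the pivot, which is exactly the update rule stated above.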
Additionally, there are two differences between Dijkstra’s algorithm and the proposed one.
First, a cost function f: V × V × V → (R+ ∪ {0, ∞}) is used for calculating the cost from one unreduced vertex to another one, traversing a reduced vertex. Notice that, traditionally, the cost function of a graph gives the cost of an edge.
The other difference in the proposed algorithm is related to the updating of distances through a reduced vertex. Consider an unreduced vertex wn as pivot; it is necessary to update the distances to all adjacent vertices as described above. If a reduced vertex Vr is adjacent to the pivot, we have to update the distances to all vertices that are adjacent to Vr (see lines 15-22 of Algorithm 3) using the cost function f, in order to guarantee the optimal result.
With respect to temporal complexity, the proposed algorithm differs from Dijkstra’s algorithm in two ways. The first is the use of function f; this function is calculated at preprocessing time, so it does not affect the temporal complexity.
The second is the execution of one additional cycle. However, it should be noted that this cycle is repeated Δ(G+) times (a constant, Δ(G+) < 10) for each vertex that is stored in the queue.
Since Δ(G+) < log(|V|) for large graphs, this new cycle does not affect the temporal complexity. In conclusion, the temporal complexities of the Dijkstra and MDijkstra algorithms are of the same order. Also notice that, in a planar graph, we can establish a linear relation between vertices and edges. From Euler’s formula (Diestel 2010), it follows that |E| ≤ 3|V| − 6 if |V| ≥ 3. So, in the case of Dijkstra’s algorithm in planar graphs, we can state that the temporal complexity is O(|E| + |V| log(|V|)) = O(|V| log(|V|)).
To apply the proposed approach, we need to reduce a graph only once. Then, we can perform several shortest path computations. In other words, we propose a data preprocessing step to improve the performance of shortest path search.
This approach brings the benefit of performing shortest path search in graphs with fewer vertices than those used by other algorithms, for instance, Dijkstra and A*. Therefore, the proposal can be expected to achieve a lower run time. Nevertheless, it is necessary to demonstrate that the path obtained by this proposal is optimal and equal (in terms of cost) to the one obtained by Dijkstra’s algorithm. These demonstrations are given in the following section.
The detailed pseudo-code of the proposed modification is presented in Algorithm 3.
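The following is a minimal sketch of the modification described in this section, not a transcription of Algorithm 3. It assumes a hypothetical layout in which f is a dictionary keyed by vertex triples, reduced vertices are never used as pivots, and every neighbour of a reduced vertex is unreduced. The example costs are invented.

```python
import heapq
import math


def mdijkstra(adj, reduced, f, origin):
    """Dijkstra variant on a reduced graph (illustrative sketch).

    adj:     {u: list of neighbours} in the reduced graph.
    reduced: set of reduced vertices.
    f:       {(vi, vj, vk): cost}; f[(u, u, v)] is the cost of edge (u, v)
             and f[(u, w, v)] the cost of crossing reduced vertex w.
    Returns (D, Pr) over unreduced vertices; Pr[v] = (predecessor, via).
    """
    D = {v: math.inf for v in adj if v not in reduced}
    Pr = {v: None for v in D}
    D[origin] = 0.0
    queue = [(0.0, origin)]
    done = set()
    while queue:
        _, pivot = heapq.heappop(queue)
        if pivot in done:
            continue
        done.add(pivot)
        for v in adj[pivot]:
            if v in reduced:
                # Jump over the reduced vertex to the vertices behind it.
                targets = [(u, f.get((pivot, v, u), math.inf), v)
                           for u in adj[v] if u != pivot]
            else:
                targets = [(v, f.get((pivot, pivot, v), math.inf), None)]
            for u, cost, via in targets:
                if D[pivot] + cost < D[u]:
                    D[u] = D[pivot] + cost
                    Pr[u] = (pivot, via)
                    heapq.heappush(queue, (D[u], u))
    return D, Pr


# Reduced graph with one reduced vertex "vr1"; all costs are invented.
adj = {"vr1": ["v4", "v5", "v6"], "v4": ["vr1"], "v5": ["vr1", "v7"],
       "v6": ["vr1"], "v7": ["v5"]}
f = {("v5", "v5", "v7"): 5.0, ("v7", "v7", "v5"): 5.0,
     ("v5", "vr1", "v4"): 6.0, ("v4", "vr1", "v5"): 6.0,
     ("v5", "vr1", "v6"): 9.0, ("v6", "vr1", "v5"): 9.0,
     ("v4", "vr1", "v6"): 7.0, ("v6", "vr1", "v4"): 7.0}
D, Pr = mdijkstra(adj, {"vr1"}, f, "v7")
```

Only the four unreduced vertices ever enter the queue in this example; the reduced vertex is crossed with a single lookup in f, which is where the precomputed information replaces a search inside the class.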
Table 1 shows a comparison of the temporal complexity of the Dijkstra, A* and MDijkstra algorithms. When the A* algorithm is analyzed under an optimal heuristic, its temporal complexity is O(n), where n is the number of vertices of the graph. The temporal complexity of Algorithm 3 (MDijkstra) is O(n1 log(n1)) < O(n1²), where n1 is the number of vertices of the reduced graph. Thus, if the reduction process yields a graph G = (Vr, Er) such that |Vr|² ≤ |V|, the temporal complexity of both algorithms must be similar.
Table 1
Temporal and spatial complexity of Dijkstra, A* and MDijkstra algorithms
However, as it is impractical to obtain an optimal heuristic for this purpose, we can state that the proposal obtains a response in a lower run time than the Dijkstra and A* algorithms if the above condition is satisfied.
Generally, there is a trade-off between efficiency and accuracy in algorithms that take a large amount of data as input. The main result of the present work is the improvement in the efficiency of shortest path search in large graphs without affecting accuracy.
We have the possibility to make a shortest path search in the reduced graph between any pair of vertices of the original graph. It can be achieved by applying a rewrite rule to a proper reduced vertex. However, this involves an additional cost to shortest path search.
It is hard to state that one shortest path search algorithm is better than another in all cases. In this case, our proposal needs more space than the classical Dijkstra’s and A* algorithms, because of the preprocessing stage that calculates function f (see Definition 4). Nevertheless, it should be highlighted that the preprocessing is done only once, whereas shortest path searches are performed many times, and the MDijkstra algorithm gives a response in a lower run time.
Below, we prove the correctness of the MDijkstra algorithm, with the aim of establishing that the proposed algorithm obtains an optimal path and that the cost of this path is the same as the cost of the path obtained by Dijkstra’s algorithm. Next, we state a theoretical measure to ensure that the response time is lower than that of the A* algorithm. This is the algorithm selected in the shortest path search literature to compare run times.
Correctness proof
In this paper, a new shortest path search algorithm is proposed. Therefore, it is necessary to prove that the path obtained by the proposal is optimal in all cases.
To make this section easier to follow, the proofs of several lemmas are presented in Appendix A.
By Lemma 3, DN−1(v) has the minimum distance from vertex vo to vertex v.
To prove the correctness of Algorithm 3, we shall prove that for any path Ca = (vo, v1, v2, ..., vd) with distance vector Dc and predecessors vector P, it holds that ∀v∈V, DN − 1(v) ≤ DcN−1(v), where v is an unreduced vertex.
We can prove the correctness of Dijkstra’s algorithm with a similar reasoning because the same invariants are satisfied. Thus, for the next proof we assume that Dijkstra’s algorithm is correct and satisfies invariants analogous to those defined for Algorithm 3.
As demonstrated before, Algorithm 3 returns the shortest path in the reduced graph. However, it remains to prove that the cost of the shortest path obtained by the proposed algorithm and the one obtained by Dijkstra’s algorithm (in the original graph without reducing it) are the same.
Let:
G = (V, E, fc) a graph.
Gr = (Vr, Er, f) a reduced graph obtained from the graph G.
Theorem 2
Let Ca = (v1, ..., vn) be a path of cost c obtained by applying Dijkstra’s algorithm on the graph G, where v1 and vn are unreduced vertices on the graph Gr, then ∃Ca′ = (u1, u2, …, ut) with cost c, u1 = v1, ut = vn, such that Ca′ is an optimal path on Gr.
Proof
From Ca we can build a path Ca′ of cost c on the graph Gr as follows:
Replace each sub-path vi, vi+1, …, vi+m with a path vi, vk, vi+m where:
vi+j∈ [vi], j = 1..m
vi, vi+m are external vertices. The other vertices are internal
vk is the reduced vertex (in the graph Gr) that represents the equivalence class [vi]
The cost of the path vi, vk, vi+m is equal to the cost of the path vi, vi+1, …, vi+m, by definition of function f. Thus, the paths Ca and Ca′ have the same cost.
Suppose that there exists a path Cb′ = (u1, u2, …, up) of cost c1 < c in the graph Gr, where ui∈Vr, i = 1..p. Then we can obtain a path Cb of cost c1 on the graph G as follows:
Replace each sub-path ui−1, ui, ui+1 with a path ui−1, uj, uj+1, …, uj+m, ui+1 of cost c3 where:
ui−1, ui+1 are unreduced vertices
uj+t∈ [ui], j = 1..m
c3 = f(ui−1, ui, ui+1)
Therefore, paths Cb and Cb′ have the same cost c1 < c, which contradicts the optimality of Ca in G. Thus, there is no path with a lower cost than Ca. □
Corollary 1
Let Ca = (v1, ..., vn) be a path obtained by applying Dijkstra’s algorithm on graph G. For every i∈ {1, 2, ..., n} such that Ca[i] is an unreduced vertex in Gr, it holds that the distance to Ca[i] is equal to the distance obtained by the MDijkstra algorithm on the reduced graph from v1 to Ca[i].
Theorem 2 establishes that the cost of the shortest path from a vertex vi to any vertex vj (vi and vj being unreduced vertices in Gr) obtained by applying Algorithm 3 is the same as the cost of the shortest path calculated by Dijkstra’s algorithm in the original graph (without reduction).
The fact that both source and destination must be unreduced vertices could be a limiting factor (in terms of the number of vertices to which one can calculate the shortest path) if one does not have a mechanism that allows obtaining a reduced graph Gri from Gr where vi∈V (vi is a vertex in the original graph G = (V, E)) is an unreduced vertex on Gri. This can be accomplished by one or more expansions applying rewrite rules to the reduced vertex that contains vertex vi.
Experimental results
The comparison of the results of shortest path search, applying Algorithm 3 (MDijkstra), Dijkstra’s algorithm and A* algorithm, provides elements emphasizing the advantages of the proposed approach. Besides, correctness proof of the proposed shortest path search algorithm is made.
Algorithm 3 was coded in Python, using the NetworkX library (Hagberg et al. 2008). This library provides implementations of Dijkstra’s and A* algorithms, which allows the three algorithms to be compared on the same technology and with efficient data structures. NetworkX uses a priority queue, implemented with a heap, to find the shortest path using Dijkstra’s and A* algorithms. With a binary heap, the complexity is O((|E| + |V|) log(|V|)), which for planar graphs, where |E| ≤ 3|V| − 6, is O(|V| log(|V|)).
It is well known that there are several techniques for improving the performance of shortest path search based on Dijkstra’s and A* algorithms; Zeng and Church compare some of them (Zeng and Church 2009). This performance improvement depends on several factors, for example the programming language and the data structures used in the implementation of the algorithms. Therefore, in order to make an impartial comparison, we compare the proposed algorithm only with the implementations of Dijkstra’s and A* algorithms in the NetworkX library.
The algorithms were run on a Pentium 4 (3.2 GHz) with 1.5 GB of RAM and the Kubuntu 11.10 operating system.
Two graphs were used in the experiments: one was obtained from a cartography of the State of North Carolina and the other represents the road network of San Francisco. The first graph, obtained from the North Carolina cartography, has 41810 vertices. This graph was reduced twice. First, we arbitrarily constructed two sets of polygons using zip codes. The first set has 30 polygons and the second has 5 polygons (the second set does not depend on the first one), so the polygons in the second set are larger. In both reductions we use the equivalence relation “in”: two points are related through “in” if they lie in the same polygon. We obtain a reduced graph of 1826 vertices using the first set of polygons, and a reduced graph of 250 vertices using the second set.
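The grouping induced by the relation “in” can be illustrated with a minimal sketch. It is not the authors’ reduction algorithm: it only partitions points into classes by the polygon that contains them. The function names, the ray-casting test and the sample coordinates are assumptions made for illustration, and points on a polygon boundary are not treated specially.

```python
from collections import defaultdict


def point_in_polygon(point, polygon):
    """Ray-casting test; polygon is a list of (x, y) corners in order."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classes_by_polygon(vertices, polygons):
    """Group vertex names by the index of the first polygon containing them.

    Vertices that fall in no polygon are left out (an assumption of this sketch).
    """
    classes = defaultdict(list)
    for name, coords in vertices.items():
        for index, polygon in enumerate(polygons):
            if point_in_polygon(coords, polygon):
                classes[index].append(name)
                break
    return dict(classes)


# Hypothetical data: two square "zip code" polygons and four vertices.
polygons = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(3, 0), (5, 0), (5, 2), (3, 2)],
]
vertices = {"a": (1, 1), "b": (0.5, 1.5), "c": (4, 1), "d": (10, 10)}
classes = classes_by_polygon(vertices, polygons)
print(classes)
```

Here vertices a and b are related through “in” because they lie in the same polygon, while c forms a class of its own and d lies outside both polygons.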
The second graph, obtained from San Francisco cartography, has 149756 vertices and it was also reduced twice, using the equivalence relation defined above and two new arbitrary sets of polygons. The first set has 10 polygons and the second one has 4 polygons. In the first reduction, using the first set of polygons we obtain a reduced graph of 2617 vertices. Using the second set of polygons, we obtain another reduced graph of 769 vertices.
Dijkstra’s and A* algorithms were executed on the original graphs and the proposed algorithm was applied to the reduced ones. Each algorithm was executed 10 times and the highest and lowest values were discarded. The average of the remaining 8 values is reported.
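The measurement protocol described above (10 runs, drop the highest and the lowest, average the remaining 8) can be sketched as follows. The helper names are hypothetical and the timer choice (time.perf_counter) is an assumption; the source does not state how times were recorded.

```python
import time


def trimmed_mean(samples):
    """Average of the samples after discarding one minimum and one maximum."""
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    kept = sorted(samples)[1:-1]
    return sum(kept) / len(kept)


def measure(func, runs=10):
    """Run func `runs` times and return the trimmed mean of the wall-clock times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return trimmed_mean(samples)


# Usage with a stand-in workload instead of a real shortest path search.
elapsed = measure(lambda: sum(range(10_000)))
print(f"trimmed mean over 10 runs: {elapsed:.6f} s")
```

Discarding the extremes makes the reported figure less sensitive to a single slow run, for instance one disturbed by other processes on the machine.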
Table 2 shows a comparison among the three selected algorithms based on the run time of shortest path search.
Table 2
Time of shortest path search with Dijkstra’s and A* algorithms in two original graphs (G1,G2) and time of shortest path search in four reduced graphs with the proposed approach
The results shown in Table 2 confirm the fact that, for large graphs, the run time of shortest path search with the proposed approach would be smaller than the run time obtained with classical approaches.
If in the reduction process we obtain a graph G = (Vr, Er), such that , the temporal complexity of both algorithms (Dijkstra and MDijkstra) must be similar. However, as it is impractical to obtain an optimal heuristic for this purpose, we can state that the proposal obtains a response in a lower run time than Dijkstra’s and A* algorithms if a condition is satisfied. Thus, assuming that sufficient memory is available to store the reduced graphs, the proposed approach is better than Dijkstra’s and A* algorithms, since reducing the original graph as proposed above always yields a response in a lower run time. The proposal is not useful when the available memory is too low to store the reduced graphs.
In the case of Algorithm 3 (MDijkstra) on the graph Gr 1.2, the run time obtained is higher than the one obtained by the A* algorithm. The reason for this result is that the graph Gr 1.2 has considerably more vertices than the square root of the number of vertices of G1. Recall the stated condition: the number of vertices of the reduced graph must be less than or equal to the square root of the number of vertices of the original graph. In the case of the graph Gr 2.2, a lower run time than the one obtained by the A* algorithm is achieved, although its number of vertices is higher than the square root of the number of vertices of G2.
The origin and destination of a shortest path search in a GIS are usually selected on a map, i.e. a user selects these points by clicking on the map shown by the GIS. We believe that, whenever a user selects an origin or a destination point, the GIS can expand the reduced graph, using the extent of the map being displayed and the selected point. If a system for shortest path search is implemented in this way, the time needed to expand a reduced vertex would be irrelevant for the shortest path search, considering that the temporal complexity of expanding a reduced vertex is O(a), where a=max{|Ai|, Ai∈P}.
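The square root condition can be checked directly against the vertex counts reported in the experimental setup. The sketch below uses only those counts; the dictionary labels and the function name are illustrative and do not correspond to the Gr i.j names used in Table 2.

```python
import math

# (vertices in original graph, vertices in reduced graph), as reported in the text.
REDUCTIONS = {
    "North Carolina, 30 polygons": (41810, 1826),
    "North Carolina, 5 polygons": (41810, 250),
    "San Francisco, 10 polygons": (149756, 2617),
    "San Francisco, 4 polygons": (149756, 769),
}


def satisfies_sqrt_bound(n_original, n_reduced):
    """True if n_reduced <= sqrt(n_original), using exact integer arithmetic."""
    return n_reduced <= math.isqrt(n_original)


for label, (n_original, n_reduced) in REDUCTIONS.items():
    bound = math.sqrt(n_original)
    ok = satisfies_sqrt_bound(n_original, n_reduced)
    print(f"{label}: |Vr| = {n_reduced}, sqrt(|V|) = {bound:.1f}, "
          f"ratio = {n_reduced / bound:.2f}, within bound: {ok}")
```

With these counts, the square roots are roughly 204 and 387, so all four reduced graphs exceed the bound, by different margins.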
Most recently developed algorithms for shortest path search improve efficiency by reducing the search space, which causes a loss of accuracy. The presented approach instead uses a graph reduction algorithm without loss of information to obtain a better run time. It maintains accuracy because the reduction algorithm guarantees no loss of data (see Table 1).
Generally, heuristic algorithms are developed in order to reduce the run time of a specific algorithm, which solves some problems whose optimal solution involves a high computational cost. Many heuristic algorithms are developed for shortest path search in GIS, with the assumption that a low bound of error is admissible in this area. However, with the proposed approach, it is possible to obtain the optimal path in a similar time, and even in less time, than with heuristic algorithms, as shown in Table 2.
Conclusions
In this paper, an algorithm for shortest path search on reduced graphs is developed. Experimental results show that the proposed algorithm is more efficient than Dijkstra’s algorithm on large graphs. In addition, we can conclude the following:
The proposed approach is particularly applicable to GIS, due to the way in which users perform a shortest path search in this kind of system. This allows vertices to be expanded without the time spent on this operation affecting the shortest path search.
The use of reduced graphs significantly reduces the response time of the shortest path search. Reducing the graph is one of the two main approaches used in the literature to lower the computational cost of this operation.
The shortest path search on a reduced graph ensures scalability regarding the size of the graph on which the analysis is performed.
We prove that the proposed algorithm allows us to obtain an optimal path in a reduced graph. The cost of the obtained path is equal to the cost of the path found using Dijkstra’s algorithm on the original graph.
We have developed a method capable of performing shortest path search in a run time similar to A* algorithm (with h=0 and h=Euclidean distance).
Future work
The modifications made to Dijkstra’s algorithm consist of using a new function that gives the cost of going through a reduced vertex. Therefore, other algorithms (such as the A* algorithm) can be modified to perform shortest path search in reduced graphs, provided that the cost of going through a reduced vertex is included in the cost of the path.
Appendix
A Demonstration of cycle invariants of Algorithm 3
Preconditions that must be met to prove the correctness of Algorithm 3 are expressed by the following definitions and notations:
G = (V, E, fc) is a weighted graph. Without loss of generality we assume that V = {0, 1, ..., M − 1} to make demonstrations less complex.
Gr = (Vr, Er, f, R) is a reduced graph from G and the equivalence relation RE. Without loss of generality we assume that Vr = {0, 1, ..., N − 1}. It is important to notice that in each path of a reduced graph, between two reduced vertices there are, at least, two unreduced vertices, as is shown in Figure 4.
∀n < N, in the execution of Algorithm 3 we define:
A vertex wn, the vertex selected in step n.
A set Cn⊆Vr, the set of vertices visited in step n. C0 = {vo}, .
Dn represents the minimum distance from vo to each vertex v∈Vr as far as it is known in step n. D0(vo) = 0, Dn+1(v) = Min(Dn(wn) + fc(wn, v), Dn(v)) = Min(Dn(Pn(wn)) + f(Pn(wn), wn, v), Dn(v)).
Pn stores, for each vertex, its predecessor in the shortest path from vo to vd, as far as it is known in step n. P0(vo) = vo,
Figure 4
Reduced graph example. Vertices 1,2,3 and 4 are reduced vertices, the rest are unreduced ones.
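The update rule for Dn given above writes the usual relaxation Dn(wn) + fc(wn, v) in terms of the predecessor of wn and the three-argument function f. The sketch below is not Algorithm 3 and does not handle reduced vertices. It runs on an ordinary weighted graph and assumes f(p, w, v) = fc(p, w) + fc(w, v) with fc(u, u) = 0, which is the form used in the base case of Lemma 3. Under that assumption both relaxations produce the same distances. All function names and the sample graph are hypothetical.

```python
import heapq


def dijkstra_binary(adj, source):
    """Textbook relaxation: candidate = D(w) + fc(w, v)."""
    dist = {source: 0}
    heap = [(0, source)]
    visited = set()
    while heap:
        d, w = heapq.heappop(heap)
        if w in visited:
            continue  # stale queue entry
        visited.add(w)
        for v, cost in adj[w].items():
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(heap, (dist[v], v))
    return dist


def dijkstra_ternary(adj, source):
    """Relaxation through the predecessor: candidate = D(P(w)) + f(P(w), w, v)."""

    def fc(u, v):
        return 0 if u == v else adj[u][v]

    def f(p, w, v):
        # Assumed form for unreduced vertices; a reduced graph would use its own f.
        return fc(p, w) + fc(w, v)

    dist = {source: 0}
    pred = {source: source}  # P0(vo) = vo
    heap = [(0, source)]
    visited = set()
    while heap:
        _, w = heapq.heappop(heap)
        if w in visited:
            continue  # stale queue entry
        visited.add(w)
        p = pred[w]
        for v in adj[w]:
            candidate = dist[p] + f(p, w, v)
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                pred[v] = w
                heapq.heappush(heap, (candidate, v))
    return dist


# Small directed graph with integer costs (illustrative data).
adj = {"a": {"b": 4, "c": 1}, "b": {"d": 1}, "c": {"b": 2, "d": 5}, "d": {}}
print(dijkstra_binary(adj, "a"))
print(dijkstra_ternary(adj, "a"))
```

The two functions agree because D(w) = D(P(w)) + fc(P(w), w) holds whenever w is selected. On a reduced graph, f would instead carry the cost of traversing a reduced vertex, as defined in the body of the paper.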
For the correctness proof it is necessary to demonstrate that the following cycle invariants are held:
∀n < N:
1. |Cn| = n + 1. In iteration n, there are n + 1 visited vertices.
2. Dn(vo) = 0 ∧ Pn(vo) = vo. The distance from the origin vertex to itself is 0 at every iteration. The predecessor of the origin vertex is the vertex itself.
3. Dn(v) = Dn(Pn(Pn(v))) + f(Pn(Pn(v)), Pn(v), v). The distance to a vertex depends on the distance to its predecessor in the shortest path.
4. ∀v∈Cn, Dn+1(v) = Dn(v). For all visited vertices, the distance to a vertex in step n is the same as the distance in step n + 1.
5. ∀vi, vj∈V[vj∈Cn+1 → Dn+1(vi) ≤ Dn+1(Pn(vj)) + f(Pn(vj), vj, vi)]. The distance to any vertex vi is less than or equal to the distance to a visited vertex vj plus the distance from vj to vi.
Lemma 1
∀n < N, |Cn| = n + 1
Proof
(By induction on n)
By the definition of the algorithm, one vertex w is visited at each step. In step 0 vertex vo is visited, so in the base case we have C0 = {vo} and |C0| = 1.
For n = k + 1, Ck+1 = Ck ∪ {v}, where v is the vertex visited in step k + 1; therefore |Ck+1| = |Ck| + |{v}| = k + 2. □
Lemma 2
∀n < N, Dn(vo) = 0 ∧Pn(vo) = vo
Proof
First, vertex vo is visited and Dn(vo) = 0 is set, i.e., the minimum distance from vo to itself is 0. The function Dn takes only non-negative values, so the smallest value it can attain is 0.
Let cost = Dn(Pn(wn)) + f(Pn(wn), wn, v), ∀wn, v∈V. It holds that 0 ≤ 0 + cost, because the image of the function f is non-negative and the vector D(Vr) is initialized from f.
The condition Dn(vo) > Dn(wn) + f(Pn(wn), wn, vo) is never satisfied, thus Dn[vo] and Pn[vo] never change. □
Lemma 3
∀n < N, Dn(v) = Dn(Pn(Pn(v))) + f(Pn(Pn(v)), Pn(v), v)
Proof
(By induction on n)
The base case n = 0, ∀v∈Vr, D0(v) = fc(vo, v), by preconditions.
f(vo, vo, v) = fc(vo, vo) + fc(vo, v) = fc(vo, v), by definition of f and fc, replacing f by fc:
The base case n = 0, C0 = {vj}, D0(vj) = 0, by definition, notice that vj is the only vertex in C0 (in the base case, if vj∈C0, vj is the origin vertex).
D1(vi) ≤ D0(vi), from the definition (Dn+1(v) = Min(Dn(wn) + fc(wn, v), Dn(v))) and D1(vj) = D0(vj) = 0 (notice that vj is the origin vertex). Replacing D0 by D1:
D1(vi) ≤ D1(P0(vj)) + f(P0(vj), vj, vi)
For n = k + 1:
Case 1: vj∈Ck.
Dk+1(vi) ≤ Dk(vi), by definition.
Dk+1(vi) ≤ Dk(Pk(vj)) + f(Pk(vj), vj, vi), by induction hypothesis.
Dk+1(vi) ≤ Dk+1(Pk(vj)) + f(Pk(vj), vj, vi), by Lemma 4.
Case 2: vj = wk.
Dk+1(vi) ≤ Dk(Pk(wk)) + f(Pk(wk), wk, vi), by definition.
Dk+1(vi) ≤ Dk+1(Pk(wk)) + f(Pk(wk), wk, vi), by Lemma 4. Replacing wk by vj:
Fei S, Wei D, Bing Z: Traffic information management and promulgating system based on gis. In 2010 International Conference on Optoelectronics and Image Processing (ICOIP), ICOIP ’10, vol. 2. Los Alamitos: IEEE Computer Society; 2010:676-679. 10.1109/ICOIP.2010.243
Fuhao Z, Jiping L: An algorithm of shortest path based on dijkstra for huge data. In Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009. FSKD ’09, vol. 4. Edited by: Chen Y, Deng H, Zhang D, Xiao Y. Los Alamitos: IEEE Computer Society; 2009:244-247. http://dl.acm.org/citation.cfm?id=1800875.1800929
Geisberger R, Sanders P, Schultes D, Delling D: Contraction hierarchies: faster and simpler hierarchical routing in road networks. In Proceedings of the 7th international conference on Experimental algorithms, WEA’08. Berlin, Heidelberg: Springer-Verlag; 2008:319-333. http://dl.acm.org/citation.cfm?id=1788888.1788912
Goldberg AV, Harrelson C: Computing the shortest path: A search meets graph theory. In Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms, SODA ’05. Philadelphia: Society for Industrial and Applied Mathematics; 2005:156-165. http://dl.acm.org/citation.cfm?id=1070432.1070455
Gonzalez H, Han J, Li X, Myslinska M, Sondag JP: Adaptive fastest path computation on a road network: a traffic mining approach. In Proceedings of the 33rd international conference on Very large data bases, VLDB ’07. Vienna, Austria: VLDB Endowment; 2007:794-805. http://dl.acm.org/citation.cfm?id=1325851.1325942
Gutman RJ: Reach-based routing: A new approach to shortest path algorithms optimized for road networks. In Proceedings of the Sixth Workshop on Algorithm Engineering and Experiments and the First Workshop on Analytic Algorithmics and Combinatorics. Edited by: Arge L, Italiano GF, Sedgewick R. New Orleans: SIAM; 2004:100-111.
Hart PE, Nilsson NJ, Raphael B: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans Syst Sci Cybern 1968, 4(2):100-107. 10.1109/TSSC.1968.300136
Huang B, Wu Q, Zhan FB: A shortest path algorithm with novel heuristics for dynamic transportation networks. Int J Geogr Inf Sci 2007, 21(6):625-644. 10.1080/13658810601079759
Jagadeesh G, Srikanthan T: Route computation in large road networks: a hierarchical approach. Intell Transp Syst, IET 2008, 2(3):219-227. 10.1049/iet-its:20080012
Jiang L, Qi Q, Zhang A: The thematic mapping system on internet. In 2010 18th International Conference on Geoinformatics. Edited by: Liu Y, Chen A. Piscataway: IEEE Computer Society; 2010:1-4. 10.1109/GEOINFORMATICS.2010.5567802
Liu YC, Yang DH: A spatial restricted heuristic algorithm of shortest path. In International conference on artificial intelligence and computational intelligence, 2009. AICI ’09, vol. 2. Los Alamitos: IEEE Computer Society; 2009:36-39. http://www.computer.org/csdl/proceedings/icsssm/2005/8971/02/01500172-abs.html
Liu QX, Cao BX, Zhao YW: An improved verification method for workflow model based on petri net reduction. In 2010 The 2nd IEEE International Conference on Information Management and Engineering (ICIME), vol. 2. Piscataway: IEEE Computer Society; 2010:252-256. 10.1109/ICIME.2010.5477436
Lu K, Liu Q: An algorithm combining graph-reduction and graph-search for workflow graphs verification. In 11th International conference on computer supported cooperative work in design, 2007. CSCWD 2007. Edited by: Shen W, Yang Y, Yong J, Hawryszkiewycz I, Lin Z, Barthès JPA, Maher ML, Hao Q, Tran MH. Washington: IEEE Computer Society; 2007:772-776. 10.1109/CSCWD.2007.4281534
Nazari S, Meybodi MR, Salehigh MA, Taghipour S: An advanced algorithm for finding shortest path in car navigation system. In First international conference on intelligent networks and intelligent systems, 2008. ICINIS ’08. Edited by: Zheng H, Li L, Eguchi K, Wang W. Los Alamitos: IEEE Computer Society; 2008:671-674. http://dx.doi.org/10.1109/ICINIS.2008.147. http://dl.acm.org/citation.cfm?id=1471609.1472922
Rodríguez-Puente R: Aplicación de las gramáticas de grafo en sistemas de información geográfica. Revista Cubana de Ciencias Informáticas (RCCI) 2010, 4(1/2):5-10.
Song Q, Wang X: Efficient routing on large road networks using hierarchical communities. Intell Transportation Syst, IEEE Trans 2011, 12(1):132-140. 10.1109/TITS.2010.2072503
Wagner D, Willhalm T: Speed-up techniques for shortest-path computations. In 24th Annual Symposium on Theoretical Aspects of Computer Science (STACS). Edited by: Thomas W, Weil P. Aachen: Springer; 2007:23-36.
Wang Z, Che O, Chen L, Lim A: An efficient shortest path computation system for real road networks. In Advances in applied artificial intelligence, 19th international conference on industrial, engineering and other applications of applied intelligent systems, IEA/AIE 2006, Lecture notes in computer science. Edited by: Ali M, Dapoigny R. Germany: Springer-Verlag; 2006:711-720.
Xu L: A decision support model based on gis for vehicle routing. In 2005 International conference on services systems and services management, 2005. Proceedings of ICSSSM ’05., vol. 2. Edited by: Chen J. Piscataway: IEEE Computer Society; 2005:1126-1129. http://doi.ieeecomputersociety.org/10.1109/ICSSSM.2005.1500172
Authors are very grateful to Yvonne Collada Peña and Yoan Martínez Márquez for the detailed revision of the manuscript. Authors also acknowledge the critical, thorough and detailed revision of the anonymous reviewer as well as all the valuable comments and suggestions.
Author information
Authors and Affiliations
Universidad de las Ciencias Informáticas, Habana, Cuba
Both authors, viz. RRP and MSLC, were involved in drafting the article and revising it critically, until the final approval of the version to be submitted. Programming and experiments were carried out by RRP. Both authors read and approved the final manuscript.
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Rodríguez-Puente, R., Lazo-Cortés, M.S. Algorithm for shortest path search in Geographic Information Systems by using reduced graphs. SpringerPlus 2, 291 (2013). https://doi.org/10.1186/2193-1801-2-291