'MySQL Recursive query to find the shortest path
I have an issue I just can't get my head around...
I have a table called country_neighbour looking like this.
Country_name Country_id Neighbour_name Neighbour_id
Italy 1 France 2
Italy 1 Switzerland 6
Italy 1 Austria 5
France 2 Spain 3
France 2 Italy 1
France 2 Switzerland 6
Spain 3 France 2
Spain 3 Portugal 4
Portugal 4 Spain 3
What i want to get is the shortest path to get from one country to another lets say that I want to know how many borders is needed to cross to get to Portugal From Italy.
(Italy -> France) 1
(Italy -> France -> Spain -> Portugal) 3
I have been looking for ideas and find WITH cte a good approach to my problem but its not supported by MySQL
Is there anyone who can point me in the right direction. I appreciate all help I can get. Thanks.
Solution 1:[1]
This problem can be solved by first detecting the longest path, such that no same node occurs in the same path. After you get this numeric value, you can apply a sequence of several SELF LEFT JOIN operations by imposing two conditions:
- country-neighbour match: the neighbour name of the table t1 is equal to the country name of the table t2
- already seen node: the neighbour name of table t2 shall not be one of the previous country
Calculating the longest path will make you know how many LEFT JOIN you need to carry out to accomplish to this operation. Following there's a snippet given that the longest path is 4.
SELECT
r1.country_name AS r1_start,
r1.neighbour_name AS r1_neigh,
r2.neighbour_name AS r2_neigh,
r3.neighbour_name AS r3_neigh,
r4.neighbour_name AS r4_neigh
FROM routes r1
LEFT JOIN routes r2
ON r2.country_name = r1.neighbour_name
AND r2.neighbour_name NOT IN
(r1.country_name, r1.neighbour_name)
LEFT JOIN routes r3
ON r3.country_name = r2.neighbour_name
AND r3.neighbour_name NOT IN
(r1.country_name, r1.neighbour_name, r2.neighbour_name)
LEFT JOIN routes r4
ON r4.country_name = r3.neighbour_name
AND r4.neighbour_name NOT IN
(r1.country_name, r1.neighbour_name, r2.neighbour_name, r3.neighbour_name)
Once I run the query, I would store the result set inside a table (not recommended a view as this query can be quite expensive on big datasets, even though the conditions should release some weight from these joins): CREATE TABLE country_paths AS SELECT ...
Then, if I want to compute the paths for a specific country, I would query the generated table imposing a filter: SELECT * FROM country_paths WHERE r1.start = 'Italy'.
Here you can find a fiddle too: https://www.db-fiddle.com/f/tCkSHBH82RcFp1UMdXcfMW/0.
Side note: if the longest path is a really big number, then it would be advisable to use a prepared statement in combo with a cycle, like is done here.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | lemon |
