'Scalable implementation of TSP in Python
I have (let's say) a 100 points. I want to generate a path between them, the shorter, the better. This is the Travelling salesperson problem. Since the TSP is NP-hard, I am satisfied with not finding a global solution. I method which gives a solution quickly & scales well.
Generate example points:
import numpy as np
points = np.random.RandomState(42).rand(100,2)
Generate distance matrix, where the i,j entry contains distance between point[i] and point[j]. Using this answer:
z = np.array([[complex(*c) for c in points]]) # notice the [[ ... ]]
distance_matrix = abs(z.T-z)
How do I continue, to find a shortest path, with at least locally minimal pathlength?
Some related questions:
This thread: Travelling Salesman in scipy provides code for finding a solution to the TSP. It works in an iterative way though, so if the number of points to visit is large, the script does not produce a solution in reasonable times.
This thread: How to solve the Cumulative Traveling Salesman Problem using or-tools in python? does not have a code answer, and is not focused on classical TSP.
This thread: Optimizing a Traveling Salesman Algorithm (Time Traveler Algorithm) provides iterative solutions to the problem (which means bad scaling)
Solution 1:[1]
This is a examples based on simulated annealing (pip install frigidum). It is most likely slower than other solutions. I expect the posted Google OR to be better, but since it is a different approach might be of interest (albeit for learning/education).
import numpy as np
import frigidum
from frigidum.examples import tsp
points = np.random.RandomState(42).rand(100,2)
tsp.nodes = points
tsp.nodes_count = points.shape[0]
z = np.array([[complex(*c) for c in points]]) # notice the [[ ... ]]
distance_matrix = abs(z.T-z)
tsp.dist_eu = distance_matrix
We can use the following Simulated Annealing scheme (<30 seconds); (to get a better solution, get alpha closer to .999, and/or increate repeats with cost of longer calculation)
local_opt = frigidum.sa(random_start=tsp.random_start,
objective_function=tsp.objective_function,
neighbours=[tsp.euclidian_bomb_and_fix, tsp.euclidian_nuke_and_fix, tsp.route_bomb_and_fix, tsp.route_nuke_and_fix, tsp.random_disconnect_vertices_and_fix],
copy_state=frigidum.annealing.naked,
T_start=5,
alpha=.8,
T_stop=0.001,
repeats=10**2,
post_annealing = tsp.local_search_2opt)
local_opt is a tuple with its first element being the solution, and second element the objective value (route length). The the end it will output statistics and the minimum found; (this is not the one plotted).
(Local) Minimum Objective Value Found:
7.46016765
I used zabop plot code to plot the solution;
import matplotlib.pyplot as plt
plt.scatter(points[:,0],points[:,1])
route = local_opt[0]
for a, b in zip(route[:-1],route[1:]):
x = points[[a,b]].T[0]
y = points[[a,b]].T[1]
plt.plot(x, y,c='r',zorder=-1)
plt.gca().set_aspect('equal')
Solution 2:[2]
This solution uses Google's OR-Tools. Install ortools via python -m pip install --upgrade --user ortools.
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp
Generate points:
import numpy as np
points = np.random.RandomState(42).rand(100,2)
Calcualte distance matrix as in question:
z = np.array([[complex(*c) for c in points]]) # notice the [[ ... ]]
distance_matrix_pre = abs(z.T-z)
Since the docs say:
Note: Since the routing solver does all computations with integers, the distance callback must return an integer distance for any two locations. If any of the entries of
data['distance_matrix']are not integers, you need to round either the matrix entries, or the return values of the callback, to integers. See Scaling the distance matrix for an example that shows how to avoid problems caused by rounding error.
I am going to multiply by a 10000 every entry in our distance matrix and round down to an int. If want to use more decimal places, multiply by a larger number.
distance_matrix = np.floor(distance_matrix_pre*10000).astype(int).tolist()
Keep following Google's tutorial:
# Create the routing index manager.
manager = pywrapcp.RoutingIndexManager(len(points),1,0)
# Create Routing Model.
routing = pywrapcp.RoutingModel(manager)
def distance_callback(from_index, to_index):
"""Returns the distance between the two nodes."""
# Convert from routing variable Index to distance matrix NodeIndex.
from_node = manager.IndexToNode(from_index)
to_node = manager.IndexToNode(to_index)
return distance_matrix[from_node][to_node]
transit_callback_index = routing.RegisterTransitCallback(distance_callback)
# Define cost of each arc.
routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)
The output depends on initial path-selection strategy used (similar to how a numerical solution can depend on inital guess). Here we specify initial strategy:
# Setting first solution heuristic.
search_parameters = pywrapcp.DefaultRoutingSearchParameters()
search_parameters.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
In Google's words:
The code sets the first solution strategy to
PATH_CHEAPEST_ARC, which creates an initial route for the solver by repeatedly adding edges with the least weight that don't lead to a previously visited node (other than the depot). For other options, see First solution strategy.
Solve the problem with initial strategy defined above (ie with PATH_CHEAPEST_ARC):
# Solve the problem.
solution = routing.SolveWithParameters(search_parameters)
Get a route:
if solution:
index = routing.Start(0)
route = [manager.IndexToNode(index)]
while not routing.IsEnd(index):
index = solution.Value(routing.NextVar(index))
route.append(manager.IndexToNode(index))
Plot result:
plt.scatter(points[:,0],points[:,1])
for a, b in zip(route[:-1],route[1:]):
x = points[[a,b]].T[0]
y = points[[a,b]].T[1]
plt.plot(x, y,c='r',zorder=-1)
plt.gca().set_aspect('equal')
The list of other initial strategies which can be used (source):
initial_strategy_list = \
['AUTOMATIC','PATH_CHEAPEST_ARC','PATH_MOST_CONSTRAINED_ARC','EVALUATOR_STRATEGY','SAVINGS','SWEEP','CHRISTOFIDES','ALL_UNPERFORMED','BEST_INSERTION','PARALLEL_CHEAPEST_INSERTION','LOCAL_CHEAPEST_INSERTION','GLOBAL_CHEAPEST_ARC','LOCAL_CHEAPEST_ARC','FIRST_UNBOUND_MIN_VALUE']
If want to increase confidence in the solution, could iterate over all of them and find the best one.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 |


