Rating: 8.5/10.
In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation by William J. Cook
Book about the Traveling Salesman Problem, written by a University of Waterloo professor and a leading researcher on this topic. The TSP is NP-hard, meaning there’s no efficient algorithm to solve it for all cases (unless P = NP). However, in practice, there are algorithms that perform well on very large instances, and there’s a continuous effort to enhance these algorithms for solving even larger instances. The book is aimed at a non-specialist but technical audience: it provides detailed algorithmic descriptions and mathematical concepts in optimization, linear programming and graph theory, etc, but it doesn’t assume any specialized knowledge, so readers with some undergraduate-level math background should be able to understand it.
The origins of the Traveling Salesman Problem are unclear, but there have been instances of actual salesmen describing the TSP statement. In the 18th century, the similar problems of Eulerian and Hamiltonian circuits were considered in graph theory. The TSP gained prominence in the 1930s to 40s, with some theoretical results emerging about the asymptotic behavior of tour lengths, before any efficient methods for finding an actual tour were known. The most obvious application of TSP is planning routes to visit multiple cities, whether for fun or business purposes, but the TSP also has applications in many scientific contexts, like repositioning telescopes or optimizing circuit carvings. Some problems are not clearly connected to traveling salesmen, but can be reduced to TSP using a clever distance function.
The next few chapsters delve into TSP algorithms. It starts with basic approaches like greedy nearest neighbor and insertion-based methods, but these are far from optimal. Christofides’ algorithm is based on spanning trees, guarantees a 1.5x approximation. Then there are iterative improvement algorithms that involve edge exchanges: emerging in the 1970s, they were the state-of-the-art for around two decades, they don’t have the theoretical guarantees of Christofides’ algorithm but perform well in practice. The Lin-Kernighan algorithm involves swapping multiple edges simultaneously, and when the number of swaps gets above 5, it gets quite convluted as there are hundreds of cases to handle. There are also metaheuristic algorithms like hill climbing, simulated annealing, and genetics, but don’t work as well as the iterative edge exchange family of methods.
The next set of algorithms is based in linear programming (LP), a general purpose method applicable to any problem that can be formulated as a set of linear equalities with an objective function and can be solved efficiently with the simplex algorithm, invented by Dantzig. Formulating TSP as a linear program is challenging: the straightforward formulation usually yields solutions that aren’t actual tours, instead multiple disjoint cycles, and preventing this requires adding an exponential number of “subtour constraints”. Nevertheless, solving the LP is valuable as it provides a lower bound on the optimal tour length, allowing you to see how close a solution is to the optimal one. The lower bound is derived using LP duality theory and can be graphically visualized using moats and control zones: they identify regions that the tour must traverse, and give a lower bound on the tour length.
The Cutting Planes method solves the TSP using iterative LP solves: when it generates an invalid solution, we add a constraint to eliminate that specific case, and run the solver again. A particularly effective family of inequalities is the “comb inequalities”, these are optimal in some sense as they are facets of the TSP polytope, but algorithmically finding these inequalities is challenging.
The branch and bound method, identifies edges where it is uncertain whether it should be included in the tour; it then divides the problem into two subtrees: one where it’s part of the tour and another where it isn’t. Each branch may be pruned if the best bound on that side is worse than an already found solution, and heuristics are used to decide how to traverse the tree. Over the years, researchers have tackled progressively larger TSP instances, achieving solutions quite close to optimality for millions of nodes, by combining these LP-based methods.
TSP belongs to the complexity class NP, so there is no guaranteed polynomial time solution. The best known approach remains the simple dynamic programming-based algorithm, with a complexity of O(n^2 x 2^n), no better algorithm has been discovered. Additionally, there have been attempts to explore biological computing, which promises to explore multiple options simultaneously, but this method isn’t practical for solving large instances.
The book concludes on a less technical note, discussing how humans and animals tackle the TSP. Humans are quite good at solving small to medium-sized instances through visual intuition, without explicit computation. The last chapter also showcases various artworks that are influenced by the TSP and visually represent tours and algorithms in aesthetically pleasing ways.