Machine Learning for Combinatorial Optimization: a Methodological Tour d’Horizon
We have surveyed and highlighted how machine learning can be used to build combinatorial optimization algorithms that are partially learned. We have suggested that imitation learning alone can be valuable if the policy learned is significantly faster to compute than the original one provided by an expert, in this case a combinatorial optimization algorithm. On the contrary, models trained with a reward signal have the potential to outperform current policies, given enough training and a supervised initialization. Learning a policy that generalizes to unseen problems is a challenge, this is why we believe learning should occur on a distribution small enough that the policy could fully exploit the structure of the problem and give better results. We believe end to end machine learning approaches to combinatorial optimization are not enough and advocate for using machine learning in combination withcurrent combinatorial optimization algorithms to benefit from the theoretical guarantees and state-of-the-art algorithms already available.
Other than performance incentives, there is also interest in using machinelearning as a modelling tool for operations research, as done by Lombardi and Milano (2018), or to extract intuition and knowledge about algorithms as mentioned in Bonami et al. (2018); Khalil et al. (2017a).
Although most of the approaches we discussed in this paper are still atan exploratory level of deployment, at least in terms of their use in general-purpose (commercial) solvers, we strongly believe that this is just the beginning of a new era for combinatorial optimization algorithms.