Policy Gradient MaxSAT Solver
Abstract
This paper presents a comparative study of various elements and strategies that can be incorporated into an autoregressive model to address the MaxSAT problem. Building upon a sequential architecture as our foundation, we optimize the model's parameters by maximizing the expected number of satisfied clauses. This optimization enables the model, given a SAT formula, to predict a distribution over potential solutions using the policy gradient method. Our controlled experiments pinpoint elements that guide the optimization process towards superior results.
Keywords
MaxSAT Problem; Policy Gradient; NP-Hard