[D] What is going on here? NeurIPS 2018 paper on Automated Theorem Proving
During this year, I have been talking to quite a lot of people from different research groups who said that they are working on accelerating automated theorem proving using Machine Learning. The principle idea is to guide the derivations performed by the theorem prover using some statistical model that has learned in which situation which steps lead to the completion of the proof.
Hearing this idea from multiple research groups at roughly the same time, made me interested in what progress has already been made on this topic.
I stumbled across the paper “Reinforcement Learning of Theorem Proving” (https://papers.nips.cc/paper/8098-reinforcement-learning-of-theorem-proving) presented at NeurIPS 2018. Essentially, the paper implements a simple version of the idea described above using reinforcement learning.
At first glance, it appears to present a significant improvement compared to “classical” theorem proving methods (mlCoP-vs-rlCoP). However, after more thorough research, it seems that the reported improvements are relative to the “leanCoP” algorithm, i.e., a very simple algorithm for theorem proving and by no means comparable to state-of-the-art automated theorem proofing systems.
When comparing to state-of-the-art theorem proofing algorithms, their approach significantly performs worse than these hand-tuned heuristics.
That’s when I became a bit confused.
NeurIPS is the top venue for machine learning research, so I assumed that papers presented there provide either some significant novelty in terms of learning or major advances in empirical performance.
However, it seems that this paper provides neither of those two.
As stated by the authors, the idea is a straightforward application of an AlphaGo style RL setting to the context of theorem proving. Moreover, as mentioned before, the experimental improvements are only relative to a very basic algorithm.
The reviews give some additional perspective and are accessible here (https://media.nips.cc/nipsbooks/nipspapers/paper_files/nips31/reviews/5309.html).
Reading the reviews suggests that for the initial submission, the authors did not include the comparison with state-of-the-art theorem proving systems (Vampire and E) but only the “baseline” of their basic leanCoP implementation. In other words, in their initial submission, it must have appeared to the reviewers that their method presented a significant advancement compared to existing (non-learning) approaches.
I am not saying this paper is terrible. I just had some quite high expectations before reading it, giving that it was published at NeurIPS 2018. However, I was utterly disappointed after realizing that there is virtually no contribution in terms of novel learning concepts or actual advances in automated theorem proving.
What is your perspective and opinion on this paper?
- Do you know any research groups that work on automated theorem proving using machine learning?
- What do you think of this paper?
- Do you think this paper provides a significant contribution to our research community?
- Do you think the reviewers had some mislead impressions on the paper?