Michael Beeson's Research


Tarski Formalization Project: Strategies

We found proofs of all the approximately 200 theorems that we attempted. These proofs fell into the following categories:

  • Some were found by OTTER without any special strategy, just using standard values for OTTER parameters like max_weight, max_distinct_variables, etc. Most of these were found in a few minutes, although a few took closer to an hour.
  • Some theorems were found using the diagram strategy. That refers to defining one (or a few) terms corresponding to the construction of certain points, usually indicated in a geometrical diagram in the book. Such proofs do not rely on the steps of the book proof; they are found without further use of the book proof, once the diagram is given.
  • Proofs found by the subformula strategy also do not rely on the book proof. Since we used this strategy mainly on difficult theorems, we usually used the diagram too.
  • When we could not find a proof without reference to the steps of the book proof, we put (at least some of) those steps in as intermediate goals, that is, in list(passive). We used lemma adjunction and hint injection, and often case elimination, to get a proof "by hook or by crook". (All these terms are explained below and in our paper.) When iteration of this process finally produced a proof, we kept the steps as hints, so that the proof can be quickly reproduced.
Below we describe each of these strategies briefly; lengthier discussions can be found in our paper.

The diagram strategy

The diagram strategy works because the defined point is denoted by a single letter instead of a term using, for example, 8 symbols; the "weight" of such clauses is therefore smaller by 7, making them less likely to be discarded for exceeding max_weight, and more likely to be used sooner rather than later once they are kept. This strategy was pioneered by Art Quaife in the 1990s. Please see our paper for an illustrative example, namely the derivation of the "crossbar theorem" using Pasch.
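To make the weight savings concrete, here is a schematic sketch in OTTER input syntax. The predicate T and the points are placeholders in the style of our files (T for betweenness), not clauses taken from any actual input file; the point x stands for a diagram point, such as the intersection point produced by an application of Pasch.

```
% Hypothetical sketch of the diagram strategy.
% Without the diagram, the prover must carry a Skolem term such as
% ip(a,p,b,q) for the constructed point, so any clause mentioning it
% can easily exceed max_weight.  Naming the point x once instead:
list(sos).
  T(a,x,q).    % x lies between a and q -- a short, light clause
  T(b,x,p).    % x lies between b and p
end_of_list.
```

With the single letter x in place of the Skolem term, clauses about the diagram point stay well under max_weight and are selected early in the search.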

The subformula strategy

The subformula strategy is very simple: just include as hints all subformulas of the formulas in the theorem one is trying to prove. By "include a clause as hints", we mean to place that clause in list(hints) or list(hints2). Either one results in the clause being given a low weight, making it more likely to be kept and more likely to be used sooner rather than later. The same is also true of formulas that subsume a hint. There is a technical difference between hints and hints2 that is explained in the OTTER manual, and also mentioned in our paper; researchers planning to use these techniques should be aware of the difference, though it will not matter much to readers not intending to conduct their own experiments.

We had several spectacular successes with the subformula strategy, described on other pages of this website. Most of these required more than a few minutes to find the proof. To achieve these successes, we set max_weight rather low, typically 8. Combined with the subformula strategy, this means that we are searching for a proof whose steps consist mostly of subformulas of the theorem to be proved, involving just one literal that is not a subformula of the theorem, or perhaps two short such literals. That this succeeds is amazing; but remember that the diagram is usually included as part of the theorem statement in these problems, so short formulas about the points constructed in the diagram are allowed. In other words, we are looking for a proof that involves short statements about the points constructed in the diagram and the original points, without constructing new points or mentioning other predicates not mentioned in the theorem.
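A minimal sketch of how this combination looks in an input file follows. The clauses shown are placeholders standing for the subformulas of some goal (T for betweenness, E for equidistance, as in our files), not the actual subformulas of any particular theorem:

```
% Hypothetical fragment illustrating the subformula strategy.
assign(max_weight, 8).       % keep only short derived clauses

list(hints2).                % every subformula of the goal, as a hint
  T(a,b,c).                  % placeholder sub-literals of the theorem
  E(a,x,x,b).
  -T(a,b,c) | E(a,b,b,c).
end_of_list.
```

The low max_weight prunes almost everything, while the hints rescue exactly those clauses that match (or are subsumed by) a subformula of the goal, so the search is steered toward proofs built from the theorem's own material.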

Lemma adjunction and hint injection

After giving up on finding a proof without using the book proof, our next step was to put the (important) steps of the book proof into list(passive) as intermediate goals. For example, if the first step were $T(a,b,c)$, we would place -T(a,b,c) | $ANS(1). in list(passive). When proofs of such steps were found, their steps would be put in as hints. We called this hint injection. Then the process could be iterated. For example, we might get four steps immediately, and the fifth after five minutes. Then, after hint injection, we would get all five immediately, and three more after five minutes. Eventually this iteration led either to a proof of the theorem, or to a stalemate, in which no more intermediate goals would be proved (say, in an hour).
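Concretely, an input file at one stage of this iteration might contain fragments like the following. This is schematic: the clauses are placeholders, and the hint clauses stand for whatever steps OTTER actually found in the proof of the first goal.

```
% Hypothetical fragment illustrating intermediate goals and
% hint injection.
list(passive).
  -T(a,b,c) | $ANS(1).       % book-proof step 1 as an intermediate goal
  -E(a,b,c,d) | $ANS(2).     % book-proof step 2
end_of_list.

% After OTTER reports a proof ending in $ANS(1), the steps of that
% proof are pasted back into the file as hints for the next run:
list(hints).
  T(x,b,c).                  % placeholder: steps of the proof of goal 1
  T(a,x,c).
end_of_list.
```

Each run thus makes the previously proved steps cheap to rediscover, freeing the search to spend its effort on the goals that remain.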

At that point we turned to lemma adjunction, by which we mean inserting one of the intermediate steps $Q$ that we could not prove as a new hypothesis. If we were lucky, that would result in a proof (of the theorem from the new hypothesis $Q$). If not, we could continue with further intermediate goals and hint injection. Eventually, in many cases, we would obtain a proof of the theorem from (one or more) additional hypotheses $Q$.

Then we would put in the steps of that proof as hints, and then replace $Q$ in the list of assumptions by $\neg Q$. If that resulted in a proof we would also place the steps of that proof in as hints, and try for a proof without either assumption $Q$ or $\neg Q$. If that failed we would be in a position to use the last strategy, case elimination.

Case elimination

As explained above, we often ended up with a proof of some desired theorem $A$ from a hypothesis $Q$, and another proof of $A$ from $\neg Q$, but even with both proofs inserted as hints, a proof of $A$ was not forthcoming. In that case we could sometimes succeed by including the tautology Q | -Q as a (harmless) assumption; sometimes this worked after hours of runtime. If it did not work, we resorted to a more elaborate strategy, described in detail in our paper, that involves first seeking a proof of one case that uses only binary resolution and paramodulation (not hyperresolution and unit resolution). For example, Satz 10.2b resisted us for a long time, but eventually we were able to combine the cases. There are, however, other cases (such as the last theorem, Satz 12.11, Hilbert's parallel axiom) where we could not combine the cases. Of course, one can then just formulate the two cases as separate input files and derive the theorem in two steps from the two cases, but we find this less beautiful than having the input files correspond directly to the book theorems.
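In input-file terms, the first, simple form of case elimination just adds the tautology to the set of support. The fragment below is schematic: Q stands for whatever case hypothesis arose, written here as a hypothetical unary predicate, and the hint list stands for the steps of the two case proofs.

```
% Hypothetical fragment illustrating case elimination by tautology.
list(sos).
  Q(a) | -Q(a).    % harmless case split on the unprovable hypothesis Q
end_of_list.

list(hints).
  % (steps of the proof of the theorem from Q go here)
  % (steps of the proof of the theorem from -Q go here)
end_of_list.
```

With both case proofs available as hints, the prover only has to discover how to resolve the two branches of the tautology against them, which is a much smaller search than finding either proof from scratch.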