Tarski Formalization Project: Testing The Master List
This page explains the "correctness issues" about the master list,
and how the program
TestMasterList addresses those issues.
It contains a high-level description of the program
TestMasterList. You can find the code itself
from our Tools page. The code relies on
which contains the master list itself, in the form of an array
$TarskiTheorems, as well as
Here are the correctness issues for the entire project:
- Is the positive form of each theorem in the master list really the correct
Skolemization of the negative form?
Does the diagram entry in the master list have the form of an equation with a new
constant on the left? And on the right, does it mention only previously-introduced Skolem functions?
Is each Skolem function introduced in exactly one theorem?
Does the master list actually correspond to the statements of the theorems in the book?
Are the input files being correctly generated?
Do they all produce proofs?
And of course: are the proofs produced by OTTER correct? (This is not a novel
issue and will not be discussed here.)
Of these issues, the first four concern the master list, and are discussed on this page.
Each theorem has both a
NegatedForm and a
PositiveForm, and in case the theorem requires
one or more new Skolem symbols, those are listed as an array of strings in the
The first task of
TestMasterListis to check that the
PositiveForm is indeed a correct
Skolemization of the
NegatedForm. Both entries were originally computed by hand.
When we computed (by hand) the positive form, we followed rigid rules that were easy to program.
Constants such as $a$ were changed to variables such as $xa$.
In some cases we had constants $cu$ or $cx$, where the book proof
had $u$ or $x$; in computing the positive form these become $u$
or $x$ again, rather than $xcx$ or $xcu$. Thus,
carries out Skolemization on a text-only basis, without parsing
terms or clauses. (Any parse errors will turn up when we run the resulting input files.)
The program copes correctly when the negative form has one clause that is a disjunction.
In the entire Tarski corpus, there is only one theorem whose negative form contains
two disjunctions. That one was Skolemized by hand and checked carefully.
That the resulting input files do prove is another positive indication.
An issue in Skolemization was the order of arguments to Skolem functions. We carried out
Skolemization in two stages: first, we inserted the correct Skolem function symbol with no arguments.
reflect(). If the theorem introduces more than one Skolem symbol,
TestMasterList has to contain the knowledge as to which variable in the
is to be replaced by which Skolem symbol. This is usually just
x,y in that order, but there
are some exceptions. The second stage is to supply the correct arguments to each Skolem function.
In 2015 we could
not change the conventions that we adopted in 2013, as that would have rendered the
hints obtained from our existing proofs useless. That necessitated some
not very beautiful code that specifies the order per Skolem function. The correctness
of this code is evidenced by the fact that it did work with hints obtained from
existing proofs. Luckily, in 2013 when we worked on one file at a time, we were consistent
about the order of arguments of the same Skolem function in different files.
Each theorem that introduces a Skolem function must introduce a new Skolem symbol,
i.e. on that is not introduced by any theorem occurring earlier in the
and in fact is not even mentioned in any previous theorem or the diagram of any previous theorem.
This condition is checked by
Another concern about the correctness of the master list is the correctness of the diagram.
Since the diagram clauses are included in
, they need to be of a syntactic
form that is known to be conservative, so that we do not introduce any bogus proofs via the diagram.
Adding equations defining a new constant is conservative, i.e., no new theorems can be
proved by adding such axioms. However, the meaning of "new constant" is not quite
as obvious as one might think. Consider, for example, the diagram for Satz 3.17.
There are two equations:
$$ e = ip(c,b1,a1,a,p)$$
$$ d = ip(c,b,a,b1,e)$$
The new constants are $d$ and $e$. The equation for $d$ has $e$ on the right, but
that is all right. What would be wrong is if the equation for $d$ had $d$ on the right,
or if the equation for $e$ had $d$ on the right and the equation for $d$ had $e$ on the right.
What we mean by "new constant" is, "constant not occurring in the theorem or the
right sides of the previous diagram equations."
The syntactic requirement checked by
is that each diagram clause has the form of an
equation with a new constant on the left, in the precise sense defined above.
checks this condition as well. When we first ran this check, it exposed exactly one violation, in which
a diagram contained an illegal equation. (That was a harmless case, but still it showed that our program was
doing its job.)
The question, whether the diagram in the master list corresponds to the book's
diagram, is not relevant. If you get a proof with any diagram whatsoever (meeting the
above conditions) it is fine.
Sometimes OTTER can find a proof if $a=b$, and also if $a \neq b$, but cannot find a proof with neither assumption.
Then it sometimes help to add $a=b \lor a \neq b$. After that one can usually remove the tautology, but sometimes not.
Therefore, in the master list there is a field in the
Theorem data structure called
This is used for tautologies.
TestMasterList check that the strings in the
Cases field really are tautologies.
That was necessary since whatever is in the
Cases field is added to the hypotheses of the theorem when input files are generated. A typo in a tautology could cause a bogus proof.
We have observed other errors in the master list. For example, the members of various arrays are separated by
commas at the end of lines. If one of those commas is accidentally a period intead of a comma, that is not a PHP
error, because period concatenates strings. Then an incorrect string is written into the input file. The result might
be an OTTER syntax error, but at least once, it was not. Which correctness condition
was violated? It was the condition that the form of the theorem in the master list should match the form in the book.
Suppose the error occurred in Theorem 5.1. It might be caught when Theorem 5.1 did not prove. But it could also
happen that the bogus input file does produce a proof (but not of the real Theorem 5.1, only of the bogus Theorem 5.1).
Then we would hope that the error would be caught when Theorem 5.2 or Theorem 5.3 does not prove. (That was how we
found this error when it really happened.) But in general, there is no mechanical way to check that each
theorem in the master list correctly expresses what Szmielew meant in writing the theorem in the book.
However, if under those circumstances we did still find a proof of Theorem 5.3, the proof would be correct, because the "bogus" theorem 5.1
was still correctly proved and correctly used to prove Theorem 5.3.