Typing in lambda logic
This essay assumes you have already read implicit and explicit typing, which deals with typing when the inference rules involve only first order unification. Here we take up typing in lambda logic, with lambda unification used in the inference rules of resolution, paramodulation, and demodulation.
We will start with a simple example: group theory. If we use the axioms of group theory, must we relativize them to a unary predicate G(x)? As we have seen in implicit and explicit typing, that is not necessary when doing first-order inference. We could, for example, put in some axioms about natural numbers, and not relativize them to a unary predicate N(x), and as long as our axioms are correctly typed, our proofs will be correctly typed too. There is, however, reason to worry about this when we move to lambda logic.
In lambda calculus, every term has a fixed point. That is, for every term F we can find a term q such that Ap(F,q) = q. (For those unfamiliar with this fact, see lambda calculus.) Another form of the fixed point theorem says that for each term H, we can find a term f such that Ap(f,x) = H(f,x). Applying this to the special case when H(f,x) = c*Ap(f,x), where c is a constant and * is the group multiplication, we get Ap(f,x) = c*Ap(f,x) . It follows from the axioms of group theory that c is the group identity. On the other hand, in lambda logic it is given as an axiom that there exist two distinct objects, say c and d, and since each of d and c must equal the group identity, this leads to a contradiction. Looked at model-theoretically, this means it is impossible, given a lambda model M, to define a binary operation on M and an identity element of M that make M into a group.
Nevertheless, file lagrange3.in uses lambda logic to find an obviously correct proof of the algebraic part of Lagrange's theorem, which does not proceed by first deriving a contradiction, and does not use a unary predicate to relativize the group theory axioms. It is the purpose of this essay to explain and justify this phenomenon.
First, let us consider how explicit typing could be used in this situation. Writing G for the type of group elements, 1 for the group identity, and i(G,G) for the type of maps from G to G, we would specify the following in the Otter-λ input file.
in general, of course, we want type(i(X,Y),lambda(X,Y)), but the special case shown is enough in this example. According to these type specifications, the axioms are correctly typed, and if the command set(types) is in the input file, lambda-unification will produce only correctly typed unifiers.
However, input files like lagrange3.in run fine without using explicit typing. What is the explanation of that? That is what we shall now investigate.
Definition. A type specification is an expression of the form type(R, f(U,V)), where R, U, and V are "type symbols". Any terms may be used as type symbols. A type specification that contains variables is called polymorphic. Here 'type' must occur literally, and 'f' can be any symbol (but not a variable). The number of arguments of f, here shown as two, can be any number, including zero.
The type R is called the value type of f. For examples of type specifications, see list(types) above. The symbol f is called the symbol of the type specification, and the number of arguments of f is the arity.
Definition. A list of type specifications is called coherent if
Observe that in a coherent list of type specifications, the symbol, arity, and argument types determine the value type. This is the usual condition for legal overloading in programming languages.
Definition. A list of type specifications is called lambda-coherent if it is coherent, and (1) in the definition of coherence applies also to Ap and lambda.
Definition. A typing of a term is an assignment of types to the variables occurring in the term and to each subterm of the term. A typing of a literal is similar, but the formula itself must get value type bool. A typing of a clause is a an assignment that simultaneously types all the literals of the clause. A typing of a term (or literal or clause or set of clauses) t is correct with respect to a list of type specifications S provided that
(i) each occurrence of a variable in t is assigned the same type.
(ii) each subterm r of t is typed according to a type specification in S. That is, if r is f(u,v), and f(u,v),u, and v are assigned types a,b,c respectively, then there is a type specification in S of the form type(a,f(b,c)).
(iii) each occurrence of each subterm r of t in t has the same value type.
The phrase, correctly typed term t, is short for term t and a correct typing of t with respect to some list of type specifications given by the context. So far, we have not considered typing specifications in which more than one typing for a given function symbol is allowed. However, we want to allow such specifications, and condition (iii) is needed for this purpose. For example, we might want to have lamba-terms for more than one type of mapping in the same problem; that would be expressed by having more than one pair of types X and Y for which type(i(X,Y),lambda(X,Y)). If such "polymorphic typings" are allowed, then the definition of correctly typed term t should also specify that each occurrence of a subterm r of t has the same value type. Condition (iii) holds automatically in case there is only one typing for each function symbol.)
Definition. A list of type specifications S is called beta-correct if
(1) the type specifications of lambda terms all have the form type(i(X,Y),lambda(X,Y)) for some types X and Y. That is, if type(B, lambda(X,Y)) belongs to S, then B = i(X,Y).
(2) type(i(X,Y),lambda(X,Y)) belongs to S if and only if type(Y,Ap(i(X,Y),X)) belongs to S.
This condition on the list of type specifications guarantees that beta-reduction carries correctly typed terms to correctly typed terms.
Definition. A list of type specifications S is called lambda-complete with respect to a correctly typed term t, provided that for every pair of type symbols (U,V) such that U and V are assigned to some subterms of t, there is a typing in S of the form type(i(U,V), lambda(U,V)), and a typing in S of the form type(V,Ap(i(U,V),U)).
Lambda-unification's most interesting clause concerns the unification of Ap(x,y) with a term t, where t does contain the variable x. In this case the lambda-unification algorithm chooses a "masking subterm" r of t; this is a term that contains all the occurrences of x in t and satisfies some additional conditions. The result will be x = lambda(z,t[r::=z]), where z is a newly-created variable, and t[r::=z] indicates the "substitution" of z for q. The notion of substitution involved is not simply syntactic substitution, but is defined by mutual recursion with unification. The result of lambda-unification will be the substitution
y := r.
This clause can lead to untypable proofs, for example those needed to produce fixed points in lambda calculus. As an example, if we unify Ap(x,y) with f(Ap(x,y)), we get the following result:
x := lambda(z,f(Ap(z,z)))
Here the new variable z has been unified first with x, and then with y, so that type restrictions will be violated if we have specified typings:
Variable z is created without a type, then when unified with x it gets type i(A,B), and then the subsequent unification with y violates the given type restrictions. We say that a particular lambda-unification is type-safe (with respect to some explicit or implicit typings) if the masking subterm r selected by lambda unification has the same type as the variable y. Otter-lambda does not (yet) implement a flag to enable the user to enforce type-safe lambda unification, but this is easy to implement and is planned for June 2005.
Theorem. Let S be a beta-correct, coherent list of type specifications. Let σ be a well-typed substitution produced by type-safe lambda-unification, and let t be a correctly typed term with respect to S. Suppose S is lambda-complete with respect to t. Then tσ is correctly typed with respect to S, and gets the same type as t.
Proof. We proceed by induction on the length of the computation by lambda-unification that produces σ, and for fixed σ, by induction on the complexity of the term t. The inductive argument reduces the complexity of t until t is a variable or a term Ap(X,w) with X a variable, and then it reduces the length of the computation of σ. When t is a term f(r,q) (or with more arguments to f), we have tσ = f(rσ,qσ), and by the induction hypothesis, rσ is correctly typed and gets the same type as r, and qσ is correctly typed and gets the same type as q. Then f(rσ,qσ) is correctly typed (since f(r,q) is correctly typed) and gets the same type as f(r,q). If t is a constant then tσ is t and there is nothing to prove. If t is a variable, and σ is produced by unification of t with another term s, what must be proved is that t and s have the same value type. With one exception (see the next paragraph), a variable must occur as an argument of some term (or atom) and hence the situation really is that we are unifying P(t,...) with P(s,...), where P is either a function symbol or a predicate symbol, and t and s occur in corresponding argument positions (not necessarily the first as shown). Since these terms or atoms P(t,...) and P(s,...) are correctly typed, and S is lambda-coherent, t and s do have the same types. Note that coherence is not enough, we need lambda-coherence to cover the case when P is Ap or lambda.
The exception is when t is a variable, say z, that is created in the process of lambda unification, when unifying Ap(X,w) with another term. This case will be treated along with the so-far untreated case when is Ap(X,w) and s is unified with t using untyped lambda unification. Here w and t are terms, and X is a variable. The algorithm will invent a new variable z, and produce
X := lambda(z,rr)
where rr is obtained by substituting z for some or all of the occurrences of some masking subterm q of t. The definition of lambda unification satisfies this property:
Either w is a variable, or q is w.
The notion of "substitution" here is called "fsubst3", and it is mutually recursive with unification; it is not just syntactic substitution. Assume that t is a correctly typed term. Then every occurrence of q in t has the same type, by the definition of correctly typed. We claim that fsubst3 leads from correctly typed terms to correctly typed terms, just like syntactic substitution. This follows from the definition of fsubst3, in which every step is either done by unification (in which case the induction hypothesis applies, since this unification has fewer steps that the one that calls it), or by syntactic substitution, which obviously preserves correct typedness, if terms are substituted for other terms of the same value type, as they are here. It follows that rr is correctly typed. Now let V be the type of rr in the given typing of t, and U the type of z (which is the same as the type of w). Since S is lambda-complete with respect to t, there is a type specification in S of the form type(i(U,v),lambda(U,V)). Thus the term lambda(z,rr) can be correctly typed with type i(U,V). Then, since S is beta-correct, Ap has a type specification type(V,Ap(i(U,V),U)), and the term Ap(X,w) can be correctly typed with type V. The type of w is the same as the type of q, by the italicized property above. The new variable X gets the type i(U,V).
We are not finished with the proof yet though, since the substitution σ also contains the assignment x := q, if w is a variable. This assignment could in general destroy the well-typedness of the result, as in the fixed-point example above; but the hypothesis that the lambda-unification is type-safe is applicable, and says that w and q have the same type, so that applying this additional substitution does not destroy the well-typedness of X = lambda(z,rr).
This completes the proof of the theorem.
Example: Consider the unification of Ap(X,y) and f(Ap(X,y)). The masking term q is X and rr is X[z::=X] which is z. Thus
X := lambda(z, f(Ap(z,y)))
This is indeed a unifier since Ap(X,y) = Ap(lambda(z,f(Ap(z,y))),y) = f(Ap(y,y)) and applying y:= X we have Ap(X,X) on the left and f(Ap(X,X)) on the right. But it is not a type-safe lambda-unification because the type of the masking subterm X is i(A,B) and the type of the variable y is A.
Let T be a set of type symbols. The finite-type closure of T is the least set T* containing T and closed under the operation i(X,Y). That is, if X and Y belong to T*, then i(X,Y) belongs to T*. The members of T* are called finite types over T.
If we are given any list of type specifications S, then the lambda-completion S* of S is defined to be the least set of type specifications including S and satisfying the following condition: if T is the set of all type symbols occurring in S, and T* is the finite-type closure of T, and U and V are two members of T*, then type(i(U,V), lambda(U,V)) and type(V,Ap(i(U,V),U) both occur in S*. Again, the lambda-completion of S is easily constructed by an inductive definition. To be specific, let S = S, and S[n+1] be S[n] together with all type specifications of the forms
where U and V are type symbols occurring in S[n]. Then S* is defined as the union of the S[n] over all natural numbers n.
Intuitively, S* tells us how to assign a finite type over T to each lambda-term built up from variables and constants that are assigned types by some typing of a term t (or set of terms) with respect to T.
Lemma. Let S be a beta-correct, lambda-coherent list of type specifications, and let t be a correctly typed term (or literal or clause) with respect to S. Let S* be the lambda-completion of S. Then S* is beta-correct and lambda-coherent, and t is correctly typed with respect to S*.
Proof. The second claim, that t is correctly typed with respect to S*, is immediate since S* contains S. The first claim is proved by induction on the construction of S*. What has to be proved is that
(1) the type specifications of lambda terms in S* all have the form type(i(X,Y),lambda(X,Y)) for some types X and Y. That is, if type(B, lambda(X,Y)) belongs to S*, then B = i(X,Y).
(2) type(i(X,Y),lambda(X,Y)) belongs to S* if and only if type(Y,Ap(i(X,Y),X)) belongs to S*.
(3) S* is coherent.
We prove by induction on n that (1) and (2) hold with S* replaced by S[n]. When n = 0, this is just the hypothesis that that S is beta-correct. Suppose that (1) and (2) hold for S[n]. Then let type(B,lambda(X,Y)) belong to S[n+1]. If it already belongs to S[n], then by induction hypothesis B = i(X,Y); if it does not belong to S[n], then X and Y belong to S[n] and B = i(X,Y), by construction of S[n+1]. That proves (1) for S[n+1].
Similarly, if type(i(X,Y),lambda(X,Y)) belongs to S[n+1], then if it belongs to S[n] already then type(Y,Ap(i(X,Y),X)) belongs to S[n], and hence to S[n+1]; but if type(i(X,Y),lambda(X,Y)) does not belong to S[n], then by construction of S[n+1], type(Y,Ap(i(X,Y),X))) belongs to S[n+1]. Conversely, if type(Y,Ap(i(X,Y),X)) belongs to S[n+1], then if it belongs to S[n] already then type(i(X,Y),lambda(X,Y)) belongs to S[n], and hence to S[n+1]; but if type(Y,Ap(i(X,Y),X)) does not belong to S[n], then by construction of S[n+1], type(i(X,Y),lambda(X,Y)) belongs to S[n+1]. That completes the proof of (2) for S[n+1]. By induction, (1) and (2) hold for all n. Since S* is the union of the S[n], (1) and (2) hold for S*.
To prove (3), it suffices to prove by induction on n that S[n] is coherent, since if S* is not coherent, then either there are two different type specifications in S* with the same symbol and arity [but in that case they most both occur in some S[n], which is then not coherent], or there are type specifications of the wrong form with symbol Ap or lambda [but then they must have been in S already, since only the correct form of specifications are added at each stage]. Therefore, suppose S[n] is coherent; we must prove S[n+1] is coherent. If it is not, then either there must be different type specifications in S[n+1] with the same symbol (other than Ap or lambda) and arity [but in that case they must both occur already in S[n], indeed in S, since no such specfications are added in passing from S[n] to S[n+1]], or there must be type specifications of the wrong form with symbol Ap or lambda; but only ones of the right form are added in passing from S[n] to S[n+1], ans since S[n] is coherent, such specifications do not occur in S[n]. That completes the induction step, the proof of (3), and the proof of the lemma.
The next theorem mentions paramodulation and paramodulation from a variable. Readers not already familiar with these terms will find them explained in the last paragraph of the proof below.
Theorem. [Implicit Typing for Lambda Logic] Let A be a set of clauses, and let S be a beta-correct and coherent set of type specifications such that each clause in A is correctly typable with respect to S. Then all conclusions derived from A by binary resolution, hyperresolution, paramodulation, and demodulation, using type-safe lambda unification in these rules of inference, are correctly typable with respect to S, provided demodulation and paramodulation from or into variables are not allowed.
Proof. Note that a typing assigns type symbols to variables, and the scope of a variable is the clause in which it occurs, so as usual with resolution, we assume that all the variables are renamed, or indexed with clause numbers, or otherwise made distinct, so that the same variable cannot occur in different clauses. In that case the originally separate correct typings T[i], one for each clause C[i] in A, can be combined (by union of their graphs) into a single typing T. We claim that the set of clauses A is correctly typed with respect to this typing T. To prove this correctness we need to prove:
(i) each occurrence of a variable in A is assigned the same type by T. This follows from the correctness of C[i], since because the variables have been renamed, all occurrences of any given variable are contained in a single clause C[i].
(ii) If r is f(u,v), and r occurs in A, and f(u,v),u, and v are assigned types a,b,c respectively, then there is a type specification in S of the form type(a,f(b,c)). If the term r occurs in A, then r occurs in some C[i], so by the correctness of T[i], there is a type specification in S as required.
(iii) each occurrence of each term r that occurs in A has the same value type. This follows from the coherence of S. The different typings T[i] are not allowed to assign different value types to the same symbol and arity.
Hence A is correctly typed with respect to T.
Now, let S* be the lambda-completion of S. By the lemma, S* is beta-correct with respect to T. All references to correct typing in the rest of the proof refer to the typing T.
We prove by induction on the length of proofs that all proofs from A using the specified rules of inference lead to correctly typed conclusions. The base case of the induction is just the hypothesis that A is correctly typable. For the induction step, we take the rules of inference one at a time. We begin with binary resolution. Suppose the two clauses being resolved are P | Q and -R | B, where substitution σ is produced by lambda unification and satisfies Pσ = Rσ. Here Q and B can stand for lists of more than one literal, in other words the rest of the literals in the clause, and the fact that we have shown P and -R as the first literals in the clause is for notational convenience only. By hypothesis, P | Q is correctly typed with respect to S, and so is -R | B, and by the previous theorem, Pσ | Qσ and -Rσ | Bσ are also correctly typed. The result of the inference is Qσ | Bσ. But the union of correctly typed terms, literals, or sets of literals (with respect to a coherent set of type specifications) is again correctly typed, by the same argument as in the first part of the proof. In other words, coherence implies that if some subterm r occurs in both Qσ and in Bσ, then r gets the same value type in both occurrences. That completes the induction step when the rule of inference is binary resolution.
Hyperresolution and negative hyperresolution can be "simulated" by a sequence of binary resolutions, so the case in which the rule of inference is hyperresolution or negative hyperresolution reduces to the case of binary resolution.
Now consider demodulation. In this case we have already deduced t=q and P[z:=tσ] and we conclude P[z:=qσ], where the substitution σ is produced by lambda-unification of t with some subterm r of P[z:=r]. Since P[z:=tσ] is correctly typed, all occurrences of the subterm tσ have the same type. By the first theorem above, tσ is correctly typed and has the same value type as t, and qσ is correctly typed and has the same value type as q. Since t=q is correctly typed, the value types of t and q are the same, and hence the value types of tσ and qσ are the same. It follows that P[z:=qσ] is correctly typed. That completes the induction step for demodulation.
Now consider paramodulation. In that case we have already deduced t=q and P[z:=r], and unification of t and r produces a substitution σ such that tσ=rσ. The conclusion of the rule is P[z:=qσ]. (This is the promised explanation of paramodulation.) Paramodulation from variables is the case in which t is a variable or q is a variable. Paramodulation into a variable is the case in which r is a variable. We have disallowed paramodulation from or into variables in the statement of the theorem; therefore t, q, and r are not variables. Let us write Type(t) for the value type of (any term) t. Because t=q is correctly typed, we have Type(t) = Type(q). If neither t nor q is an Ap term, then Type( t σ) = Type(q σ), since they have the same functor. If one of them is an Ap term, the by the preceding theorem, Type( tσ) = Type(t) andType(qσ)=Type(q) = Type(t) = Type( tσ). Thus in any case Type(qσ) = Type(tσ). The value type of r is the same at every occurrence, since P[z:=r] is correctly typed. To show that P[z:=qσ] is correctly typed, it suffices to show that Type(qσ) = Type(r). which is the same as the type of r σ. Since the terms t and r unify, their main symbols are the same (unless r is a variable or an Ap term); hence Type( r) = Type(rσ) = Type(tσ) = Type(qσ), which is what had to be shown. That leaves the case in which r is an Ap term. Then since t and r unify, by the preceding theorem we have Type(r) = Type(rσ) = Type(tσ) = Type(qσ), which is what had to be shown.
Example: group theory in lambda logic
We reconsider the example of group theory, as in file lagrange3.in. Here is an appropriate list of type specifications:
Here g and r are Skolem functions in the negated goal
-f(Ap(X,g(X)) | Ap(X,Ap(Y,r(Y))) != r(Y) .
X comes out to be the desired isormorphism lambda(x,x*a) between H and Ha, and Y comes out to be the inverse of this isomorphism, lambda(y,y*i(a)). Observe that the list of type specifications is beta-correct, because the specifications of Ap and lambda have the proper forms, and it is coherent, since no symbol gets two different specifications, and the set of input clauses in lagrange3.in is correctly typed with respect to this list of type specifications. Therefore, according to the theorem, it is not an accident that the proof in lagrange3.out is also correctly typed with respect to these specifications.
Example: fixed points
The fixed point argument which shows that the group axioms are contradictory in lambda logic requires a term Ap(f,Ap(x,x)). The part of this that is problematic is Ap(x,x). If the type specification for Ap is type(V,Ap(i(U,V),U)), then for Ap(x,x) to be correctly typed, we must have V = U = i(U,U). If U and V are type symbols, this can never happen, so the fixed point construction cannot be correctly typed. It follows from the theorem above that this argument cannot be found by Otter-λ from a correctly typed input file. In particular, in lagrange3.in we have correctly typed axioms, so we will not get a contradiction from a fixed point argument. On the other hand, in file lambda4.in, we show that Otter-λ can verify the fixed-point construction. The input file contains the negated goal
ap(c,ap(lambda(x,ap(c,ap(x,x))), lambda(x,ap(c,ap(x,x))))) != ap(lambda(x,ap(c,ap(x,x))), lambda(x,ap(c,ap(x,x)))).
Since this contains the term Ap(x,x), it cannot be correctly typed with respect to some list of type specifications T which is lambda-coherent, beta-correct, and lambda-complete with respect to this formula. Otter-λ does find a proof using this input file, which is consistent with our argument above that fixed-point constructions will not occur in proofs from correctly typable input files. This input file cannot be correctly typed with respect to any lambda-coherent, beta-correct type specifications that are lambda-complete with respect to the above formula. That fact can be seen directly, as above, or as a corollary of the theorem, since Otter-λ finds a proof. The fact that the theoretical result agrees with the results of running the program is a good thing.
Corollary. Otter-λ is not refutation-complete for lambda logic.
Proof. The (unrelativized) axioms of group theory are contradictory in lambda logic, but by the theorem above, Otter-λ will find only correctly typed proofs, which will be valid in the finite type structure based on any group, and hence will not be proofs of a contradiction.
The situation indicates that we do not even want refutation completeness for lambda logic.