A Computing Procedure for Quantification Theory
A seminal paper on SAT solving
A Computing Procedure for Quantification Theory Martin Davis and Hilary Putnam JACM Volume 7, Issue 3
This is the oldest paper I’ve summarized. I think SAT solvers are magic and wanted to peek inside the black box. This paper describes a precursor to the full DPLL algorithm. About half of the paper is about dealing with existential and universal quantifiers, which I’m not describing here.
Conjunctive Normal Form
The input to the algorithm described here is a logic formula in conjunctive normal form (CNF). Conjunctive normal form comprises a set of clauses. A clause is the logical OR of a set of literals. A literal is either a variable, or the negation of a variable. Logical AND is used to combine all clauses. For example:
(A | !B | C) & (!A | D)

There are straightforward algorithms to convert any logic formula into CNF (and more complicated ones like the Tseytin transformation).
The goal of this algorithm is to determine if the input logic formula is satisfiable. If it is satisfiable, then the algorithm produces a list, assigning each variable to a specific value (true or false).
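To make that goal concrete, here is a brute-force sketch using an encoding of my own (not from the paper): a formula is a list of clauses, each clause a set of signed integers, where k stands for a variable and -k for its negation.

```python
from itertools import product

# (A | !B | C) & (!A | D), with A=1, B=2, C=3, D=4 (my encoding, not the paper's):
formula = [{1, -2, 3}, {-1, 4}]

def brute_force_sat(clauses):
    """Try every assignment; return a satisfying dict or None.
    Exponential in the number of variables -- only here to illustrate the goal."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return assignment
    return None

print(brute_force_sat(formula))  # {1: False, 2: False, 3: False, 4: False}
```

This tries up to 2^n assignments; the three rules below are about pruning that search.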
An alternative normal form would be disjunctive normal form (DNF), but that isn’t practical. If you think about it, any algorithm that outputs DNF effectively outputs a list of all possible solutions to the SAT problem (each clause is a possible solution). So, either the list is small and SAT solving really is easy, or most interesting problems would be prohibitively large in DNF form. This paper even cites a textbook which says that the act of putting a formula into DNF “automatically reveals whether or not the formula is inconsistent”.
Three Rules
The search for a satisfying assignment (i.e., mapping of variable names to values) is described as the iterative application of three rules. The third rule contains the magic necessary to keep the search going.
Rule I - One-Literal Clauses
If a variable appears in a clause that contains only a single literal, then that variable must be assigned a value consistent with the polarity of the literal. The algorithm notes that value and then simplifies the formula by substituting it. For example, in the following formula, A must be true.
A & (B | C)

Rule II - Affirmative-Negative Rule
If all appearances of a variable have the same polarity, then that variable can be assigned a value consistent with that polarity. For example, in the following formula, B can safely be assigned false.

(A | !B) & (C | !B)

Again, the algorithm notes the value of B and then simplifies the formula by substituting that value for B.
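Rules I and II can be sketched in a few lines, assuming a hypothetical encoding (my notation, not the paper's) where clauses are frozensets of signed integers, k for a variable and -k for its negation:

```python
def assign(clauses, literal):
    """Record that literal is true: drop satisfied clauses, shrink the rest."""
    return [c - {-literal} for c in clauses if literal not in c]

def rule_one(clauses):
    """Rule I: a one-literal clause forces its literal to be true."""
    for clause in clauses:
        if len(clause) == 1:
            (literal,) = clause
            return assign(clauses, literal), literal
    return clauses, None

def rule_two(clauses):
    """Rule II: a variable appearing with only one polarity can be assigned."""
    literals = {lit for c in clauses for lit in c}
    for literal in literals:
        if -literal not in literals:
            return assign(clauses, literal), literal
    return clauses, None

# A & (B | C) with A=1, B=2, C=3: Rule I fixes A = true.
clauses = [frozenset({1}), frozenset({2, 3})]
clauses, literal = rule_one(clauses)
print(literal, clauses)  # 1 [frozenset({2, 3})]
```

The helper names (`assign`, `rule_one`, `rule_two`) are mine, not the paper's.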
Rule III - Eliminating Atomic Formulas
This one is interesting in that subsequent work found an equivalent rule which is more efficient. The first step in applying rule III is to pick a variable p (the choice can affect performance, but this paper doesn’t offer much guidance on how to choose), and refactor the formula such that it has the following form:
(A | p) & (B | !p) & R

where A, B, and R are expressions that do not contain p. The paper doesn't explicitly say this, but it seems this must cause the formula to temporarily leave CNF. Here is an example conversion.
Input:
(c | !d | p) & (c | e | p) & (e | !f | !p) & (g | h | !p) & (i | j)
Factor out p:
(((c | !d) & (c | e)) | p) & (((e | !f) & (g | h)) | !p) & (i | j)
Introduce A, B, and R as aliases:
A = (c | !d) & (c | e)
B = (e | !f) & (g | h)
R = (i | j)
Final result:
(A | p) & (B | !p) & R

Now that the formula has been refactored, recursively solve the simpler formula:
(A | B) & R

In the example above, the simpler formula is:
(((c | !d) & (c | e)) | ((e | !f) & (g | h))) & (i | j)

The intuition here is that p must be either true or false. If p is false, then A must be true. If p is true, then B must be true. Either way, at least one of A and B holds, which is exactly what (A | B) says; R must hold regardless.
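Rule III also has a clean clausal reading: split the clauses into those containing p (the A part), those containing !p (the B part), and the rest (R); the CNF of (A | B) & R is then R plus every pairwise union of an A-clause and a B-clause. A sketch, using a hypothetical encoding of clauses as frozensets of signed integers (k for a variable, -k for its negation; the function name is mine):

```python
def eliminate(clauses, p):
    """Rule III: remove variable p by pairing its positive and negative clauses."""
    a_part = [c - {p} for c in clauses if p in c]
    b_part = [c - {-p} for c in clauses if -p in c]
    rest = [c for c in clauses if p not in c and -p not in c]
    resolvents = [a | b for a in a_part for b in b_part]
    # Discard tautologies such as (e | !e); they are always satisfied.
    return rest + [c for c in resolvents if not any(-lit in c for lit in c)]

# The example above, with p=1, c=3, d=4, e=5, f=6, g=7, h=8, i=9, j=10:
clauses = [frozenset(s) for s in
           [{3, -4, 1}, {3, 5, 1}, {5, -6, -1}, {7, 8, -1}, {9, 10}]]
print(len(eliminate(clauses, 1)))  # 5 clauses, none mentioning p
```

Two p-clauses and two !p-clauses produce four resolvents here; that pairwise growth hints at how repeated applications can inflate the formula.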
The follow-on paper (discussed below) introduces an equivalent Rule III*, which says to simply assume p == false and check if the formula is satisfiable with that assumption. If not, then assume p == true and check if the formula is satisfiable with that assumption. This avoids having to transform the formula into [(A | p) & (B | !p) & R] form, which "can easily increase the number and the lengths of the clauses in the expression rather quickly after several applications".
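Rule III* amounts to a recursive case split. A minimal sketch, again assuming clauses encoded as frozensets of signed integers (a real solver would interleave Rules I and II and choose the branching variable far more carefully):

```python
def solve(clauses):
    """Return True iff the clause set is satisfiable, using Rule III* alone."""
    if not clauses:
        return True                  # every clause satisfied
    if frozenset() in clauses:
        return False                 # an empty clause can never be satisfied
    p = abs(next(iter(clauses[0])))  # branching choice; the heuristic matters
    for literal in (-p, p):          # assume p false first, then true
        reduced = [c - {-literal} for c in clauses if literal not in c]
        if solve(reduced):
            return True
    return False

# (A | B) & (!A | B) & !B forces B both true and false: unsatisfiable.
print(solve([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})]))  # False
```

Combined with Rules I and II, this recursive splitting is essentially the procedure now known as DPLL.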
Results
This paper has the greatest statement of an improvement in runtime:
The superiority of the present procedure over those previously available is indicated in part by the fact that a formula on which Gilmore’s routine for the IBM 704 causes the machine to compute for 21 minutes without obtaining a result was worked successfully by hand computation using the present method in 30 minutes.
Computers were slower in the 1960s. Imagine someone today coming up with an alternative way of training LLMs which can be done faster by hand than a GPU.
Dangling Pointers
This follow-on paper has an interesting observation about the choice of which variable to select in rule III:
By rearranging the clauses of a formula a different order would in general be created. In some cases, whether or not the program could actually prove the validity of a given formula (without running out of fast access storage) depended on how one shuffled the punched-data deck before reading it into the assembler! Thus, the variation in ordering of constants did affect by a factor of 10 (from 500 to 5000) …
In modern lingo, the branching heuristic matters a lot.

