Constraint programming boolean solver - constraint-programming

Huey, Dewey and Louie are being questioned by their uncle. These are the statements they make:
• Huey: “Dewey and Louie had equal share in it; if one is guilty, so is the other.”
• Dewey: “If Huey is guilty, then so am I.”
• Louie: “Dewey and I are not both guilty.”
Their uncle, knowing that they are scouts realizes that they cannot tell a lie.
My solution.
var bool :D; var bool :L; var bool :H;
constraint D <->L;
constraint H -> D;
constraint D!=L;
solve satisfy;
output[show(D), "\n", show(L),"\n", show(H)];
Minizinc can't solve it.

Here's my (old) version of this problem: http://www.hakank.org/minizinc/huey_dewey_louie.mzn
var bool: huey;
var bool: dewey;
var bool: louie;
constraint
% Huey: Dewey and Louie has equal share in it; if one is quitly, so is the other.
(dewey <-> louie)
% Dewey: If Huey is guilty, then so am I.
/\
(huey -> dewey)
% Louie: Dewey and I are not both quilty.
/\
(not (dewey /\ louie))
;

For this kind of problems I prefer to use Boolean Satisfiability (SAT) directly. Your problem can obviously be formulated as a propositional logic formula as follows (using the DIMACS format) :
Atom 1 : Dewey is guilty (i.e. will be associated to the literals -1 and 1 in the CNF)
Atom 2 : Louie is guilty (i.e. will be associated to the literals -2 and 2 in the CNF)
Atom 3 : Huey is guilty (i.e. will be associated to the literals -3 and 3 in the CNF)
The CNF file is then :
p cnf 4 3
-1 2 0
-2 1 0
-3 1 0
-1 -2 0
And here the solution using an 'online' SAT Solver : http://boolsat.com/show/5320e18a0148a30002000002

Yet another solution, using CLP(B) (constraint logic programming over Boolean variables) with SICStus Prolog or SWI:
:- use_module(library(clpb)).
guilty(H, D, L) :-
sat(D =:= L), % Huey
sat(H =< D), % Dewey
sat(~(D*L)). % Louie
Example query and its result:
?- guilty(H, D, L).
D = H, H = L, L = 0.

Another option is to ask WolframAlpha:
not (L xor D) and (H implies D) and not (L and D)
As suggested by Hakan, the following equivalent expression is also possible:
(L equivalent D) and (H implies D) and not (L and D)
Result is a truth table which has only (!D !H !L) as solution.

Related

Prolog implementation of Quine's algorithm for classical propositional logic (in Quine's "Methods of Logic")

I know only one prover that translates the algorithm that Quine gave for classical propositional logic in his book Methods of Logic (Harvard University Press, 1982, ch. 1 sec. 5, pp. 33-40), this prover is in Haskell and it is here:
Quine's algorithm in Haskell
I tried to translate Quine's algorithm in Prolog, but until now I have not succeeded. It is a pity because it is an efficient algorithm and a Prolog translation would be interesting. I am going to describe this algorithm. The only Prolog code that I give at the start is the list of operators that would be useful to test the prover:
% operator definitions (TPTP syntax)
:- op( 500, fy, ~). % negation
:- op(1000, xfy, &). % conjunction
:- op(1100, xfy, '|'). % disjunction
:- op(1110, xfy, =>). % conditional
:- op(1120, xfy, <=>). % biconditional
Truth constants are top and bot for, respectively, true and false. The algorithm starts as follows: For any propositional formula F, make two copies of it and replace the atom which has the highest occurrence in F by top in the first copy, and by bot in the second copy, and then apply the following ten reduction rules one rule at a time for as many times as possible, for each of the copies:
1. p & bot --> bot
2. p & top --> p
3. p | bot --> p
4. p | top --> top
5. p => bot --> ~p
6. p => top --> top
7. bot => p --> top
8. top => p --> p
9. p <=> bot --> ~p
10. p <=> top --> p
Of course, we have also the following rules for negation and double negation:
1. ~bot --> top
2. ~top --> bot
3. ~~p --> p
When there is neither top nor bot in the formula so none of the rules apply, split it again and pick one atom to replace it by top and by bot in yet another two sided table. The formula F is proved if and only if the algorithm ends with top in all copies, and fails to be proved, otherwise.
Example:
(p => q) <=> (~q => ~p)
(p => top) <=> (bot => ~p) (p => bot) <=> (top => ~p)
top <=> top ~p <=> ~p
top top <=> top bot <=> bot
top top
It is clear that Quine's algorithm is an optimization of the truth tables method, but starting from codes of program of truth tables generator, I did not succeed to get it in Prolog code.
A help at least to start would be welcome. In advance, many thanks.
EDIT by Guy Coder
This is double posted at SWI-Prolog forum which has a lively discussion and where provers Prolog are published but not reproduced in this page.
The Haskell code seemed complicated to me. Here's an implementation based on the description of the algorithm given in the question. (Using maplist and dif from the SWI-Prolog library, but easy to make self-contained.)
First, single simplification steps:
formula_simpler(_P & bot, bot).
formula_simpler(P & top, P).
formula_simpler(P '|' bot, P).
formula_simpler(_P '|' top, top). % not P as in the question
formula_simpler(P => bot, ~P).
formula_simpler(_P => top, top).
formula_simpler(bot => _P, top).
formula_simpler(top => P, P).
formula_simpler(P <=> bot, ~P).
formula_simpler(P <=> top, P).
formula_simpler(~bot, top).
formula_simpler(~top, bot).
formula_simpler(~(~P), P).
Then, iterated application of these steps to subterms and iteration at the root until nothing changes anymore:
formula_simple(Formula, Simple) :-
Formula =.. [Operator | Args],
maplist(formula_simple, Args, SimpleArgs),
SimplerFormula =.. [Operator | SimpleArgs],
( formula_simpler(SimplerFormula, EvenSimplerFormula)
-> formula_simple(EvenSimplerFormula, Simple)
; Simple = SimplerFormula ).
For example:
?- formula_simple(~ ~ ~ ~ ~ a, Simple).
Simple = ~a.
For the replacement of variables by other values, first a predicate for finding variables in formulas:
formula_variable(Variable, Variable) :-
atom(Variable),
dif(Variable, top),
dif(Variable, bot).
formula_variable(Formula, Variable) :-
Formula =.. [_Operator | Args],
member(Arg, Args),
formula_variable(Arg, Variable).
On backtracking this will enumerate all occurrences of variables in a formula, for example:
?- formula_variable((p => q) <=> (~q => ~p), Var).
Var = p ;
Var = q ;
Var = q ;
Var = p ;
false.
This is the only source of nondeterminism in the proof procedure below, and you can insert a cut after the formula_variable call to commit to a single choice.
Now the actual replacement of a Variable in a Formula by Replacement:
variable_replacement_formula_replaced(Variable, Replacement, Variable, Replacement).
variable_replacement_formula_replaced(Variable, _Replacement, Formula, Formula) :-
atom(Formula),
dif(Formula, Variable).
variable_replacement_formula_replaced(Variable, Replacement, Formula, Replaced) :-
Formula =.. [Operator | Args],
Args = [_ | _],
maplist(variable_replacement_formula_replaced(Variable, Replacement), Args, ReplacedArgs),
Replaced =.. [Operator | ReplacedArgs].
And finally the prover, constructing a proof term like the Haskell version:
formula_proof(Formula, trivial(Formula)) :-
formula_simple(Formula, top).
formula_proof(Formula, split(Formula, Variable, TopProof, BotProof)) :-
formula_simple(Formula, SimpleFormula),
formula_variable(SimpleFormula, Variable),
variable_replacement_formula_replaced(Variable, top, Formula, TopFormula),
variable_replacement_formula_replaced(Variable, bot, Formula, BotFormula),
formula_proof(TopFormula, TopProof),
formula_proof(BotFormula, BotProof).
A proof of the example from the question:
?- formula_proof((p => q) <=> (~q => ~p), Proof).
Proof = split((p=>q<=> ~q=> ~p),
p,
split((top=>q<=> ~q=> ~top),
q,
trivial((top=>top<=> ~top=> ~top)),
trivial((top=>bot<=> ~bot=> ~top))),
trivial((bot=>q<=> ~q=> ~bot))) .
All of its proofs:
?- formula_proof((p => q) <=> (~q => ~p), Proof).
Proof = split((p=>q<=> ~q=> ~p), p, split((top=>q<=> ~q=> ~top), q, trivial((top=>top<=> ~top=> ~top)), trivial((top=>bot<=> ~bot=> ~top))), trivial((bot=>q<=> ~q=> ~bot))) ;
Proof = split((p=>q<=> ~q=> ~p), p, split((top=>q<=> ~q=> ~top), q, trivial((top=>top<=> ~top=> ~top)), trivial((top=>bot<=> ~bot=> ~top))), trivial((bot=>q<=> ~q=> ~bot))) ;
Proof = split((p=>q<=> ~q=> ~p), q, trivial((p=>top<=> ~top=> ~p)), split((p=>bot<=> ~bot=> ~p), p, trivial((top=>bot<=> ~bot=> ~top)), trivial((bot=>bot<=> ~bot=> ~bot)))) ;
Proof = split((p=>q<=> ~q=> ~p), q, trivial((p=>top<=> ~top=> ~p)), split((p=>bot<=> ~bot=> ~p), p, trivial((top=>bot<=> ~bot=> ~top)), trivial((bot=>bot<=> ~bot=> ~bot)))) ;
Proof = split((p=>q<=> ~q=> ~p), q, trivial((p=>top<=> ~top=> ~p)), split((p=>bot<=> ~bot=> ~p), p, trivial((top=>bot<=> ~bot=> ~top)), trivial((bot=>bot<=> ~bot=> ~bot)))) ;
Proof = split((p=>q<=> ~q=> ~p), q, trivial((p=>top<=> ~top=> ~p)), split((p=>bot<=> ~bot=> ~p), p, trivial((top=>bot<=> ~bot=> ~top)), trivial((bot=>bot<=> ~bot=> ~bot)))) ;
Proof = split((p=>q<=> ~q=> ~p), p, split((top=>q<=> ~q=> ~top), q, trivial((top=>top<=> ~top=> ~top)), trivial((top=>bot<=> ~bot=> ~top))), trivial((bot=>q<=> ~q=> ~bot))) ;
Proof = split((p=>q<=> ~q=> ~p), p, split((top=>q<=> ~q=> ~top), q, trivial((top=>top<=> ~top=> ~top)), trivial((top=>bot<=> ~bot=> ~top))), trivial((bot=>q<=> ~q=> ~bot))) ;
false.
This contains lots of redundancy. Again, this is because formula_variable enumerates occurrences of variables. It can be made more deterministic in various ways depending on one's requirements.
EDIT: The above implementation of formula_simple is naive and inefficient: Every time it makes a successful simplification at the root of the formula, it revisits all of the subformulas as well. But on this problem, no new simplifications of the subformulas will become possible when the root is simplified. Here is a new version that is more careful to first fully rewrite the subformulas, and then only iterate rewrites at the root:
formula_simple2(Formula, Simple) :-
Formula =.. [Operator | Args],
maplist(formula_simple2, Args, SimpleArgs),
SimplerFormula =.. [Operator | SimpleArgs],
formula_rootsimple(SimplerFormula, Simple).
formula_rootsimple(Formula, Simple) :-
( formula_simpler(Formula, Simpler)
-> formula_rootsimple(Simpler, Simple)
; Simple = Formula ).
This is considerably faster:
?- time(formula_simple(~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~(a & b & c & d & e & f & g & h & i & j & k & l & m & n & o & p & q & r & s & t & u & v & w & x & y & z), Simple)).
% 11,388 inferences, 0.004 CPU in 0.004 seconds (100% CPU, 2676814 Lips)
Simple = ~ (a&b&c&d&e&f&g&h& ... & ...).
?- time(formula_simple2(~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~(a & b & c & d & e & f & g & h & i & j & k & l & m & n & o & p & q & r & s & t & u & v & w & x & y & z), Simple)).
% 988 inferences, 0.000 CPU in 0.000 seconds (100% CPU, 2274642 Lips)
Simple = ~ (a&b&c&d&e&f&g&h& ... & ...).
Edit: As pointed out in the comments, the prover as written above can be veeery slow on slightly bigger formulas. The problem is that I forgot that some operators are commutative! Thanks jnmonette for pointing this out. The rewrite rules must be expanded a bit:
formula_simpler(_P & bot, bot).
formula_simpler(bot & _P, bot).
formula_simpler(P & top, P).
formula_simpler(top & P, P).
formula_simpler(P '|' bot, P).
formula_simpler(bot '|' P, P).
...
And with this the prover behaves nicely.
Here is a skeleton of solution. I hope it can help you fill the holes.
is_valid(Formula) :-
\+ derive(Formula,bot).
is_satisfiable(Formula) :-
derive(Formula, top).
derive(bot,D):-
!,
D=bot.
derive(top,D):-
!,
D=top.
derive(Formula,D):-
reduce(Formula, Formula1),
(
Formula=Formula1
->
branch(Formula1,D)
;
derive(Formula1,D)
).
Now you need to implement reduce/2 that applies the reduction rules (recursively in the sub-formulas), and branch/2 that replaces non-deterministically an atom of the formula with either top or bot, then calls recursively derive/2. Something like:
branch(Formula, D):-
pickAtom(Formula, Atom),
(
Rep=top
;
Rep=bot
),
replace(Formula, Atom, Rep, Formula1),
derive(Formula1,D).
Warning! We dont use TPTP syntax, but syntax from SWI-Prolog CLP(B).
Seems that this brute force method is older (*), and
as Prolog code its so small, it even fits into the
pocket of your trousers:
Here is a full implementation. The cut is only used
to priorize the rewriting and corresponds pretty much
Haskell rules. Except that Haskell might not have a
datatype logical variable like Prolog:
:- op(300, fy, ~).
eval(A, A) :- var(A), !.
eval(A+B, R) :- !, eval(A, X), eval(B, Y), simp(X+Y, R).
eval(A*B, R) :- !, eval(A, X), eval(B, Y), simp(X*Y, R).
eval(~A, R) :- !, eval(A, X), simp(~X, R).
eval(A, A).
simp(A, A) :- var(A), !.
simp(A+B, B) :- A == 0, !.
simp(A+B, A) :- B == 0, !.
simp(A+_, 1) :- A == 1, !.
simp(_+B, 1) :- B == 1, !.
simp(A*_, 0) :- A == 0, !.
simp(_*B, 0) :- B == 0, !.
simp(A*B, B) :- A == 1, !.
simp(A*B, A) :- B == 1, !.
simp(~A, 1) :- A == 0, !.
simp(~A, 0) :- A == 1, !.
simp(A, A).
The code is not pure Prolog and uses non-logical var/1, (==)/2, etc.. meta programming. Like Boole we linearly reduce and perform a conjunction of the two substitutions, so we do the Quine split without some branching and via a single front:
judge(A, [B|R]) :- eval(A, B),
term_variables(B, L), judge(B, L, R).
judge(_, [], R) :- !, R = [].
judge(A, [B|L], R) :-
copy_term(A-[B|L], C-[0|L]),
copy_term(A-[B|L], D-[1|L]), judge(C*D, R).
In the above we use copy_term/2 to do substitution. The idea is borrowed from Ulrich Neumerkels lambda library. We need to also make available =< and =:= in eval/2 and simp/2. For full source code see here. Here are example runs in any of your favorite ISO Prolog:
?- judge(A+ ~A, L).
L = [A+ ~A, 1] /* Ends in 1, Tautology */
?- judge(A+ ~B, L).
L = [A+ ~B, ~B, 0] /* Ends in 0, Falsifiable */
?- judge(((P+Q)=<R)=:=((P=<R)*(Q=<R)), L).
L = [(P+Q =< R) =:= (P =< R)*(Q =< R),
((Q =< R) =:= (Q =< R))*(R =:= R*(Q =< R)),
(R =:= R)*((R =:= R)*(R =:= R*R)), 1].
(*) From:
U. Martin and T. Nipkow. Boolean unification—the story so far.
In Unification, pages 437–455. Academic Press, London, 1990.

How can one represent distinct non-numeric symbols in sympy?

I am experimenting with the representation of a trivial statistics problem in Sympy:
For a sample space S, there are 6 possible distinct outcomes
a,b,c,d,e,f. We can define event A as having occurred if any of
a,b,c have, and event B as having ocurred if any of b,c,d have.
Given a probability mass function pmf defined over S, what is the
probability of event A?
When attempting to implement this symbolically, I receive the following error:
~/project/.envs/dev/lib/python3.6/site-packages/sympy/stats/frv.py in _test(self, elem)
164 elif val.is_Equality:
165 return val.lhs == val.rhs
--> 166 raise ValueError("Undecidable if %s" % str(val))
167
168 def __contains__(self, other):
ValueError: Undecidable if Eq(d, a) | Eq(d, b) | Eq(d, c)
The problem is implemented as follows with comments on the failing lines of code:
from sympy import Eq, Function, symbols
from sympy.logic import Or
from sympy.sets import FiniteSet, Union
from sympy.stats import FiniteRV, P
# 1. Define a sample space S with outcomes: a,b,c,d,e,f; Define events A, B
A = FiniteSet(*symbols('a b c'))
B = FiniteSet(*symbols('b c d'))
S = Union(A, B, FiniteSet(*symbols('e f')))
display("Sample Space", S)
pmfFunc = Function("pmf")
pmfDict = {v: pmfFunc(v) for v in S}
X = FiniteRV('X', pmfDict)
a,b = symbols('a b')
# 2. P(X = a) = pmf(a)
display(P(Eq(X,a)))
# 3. A.as_relational(X) yields `(X=a) \lor (X=b) \lor (X=c)`
display(A.as_relational(X))
# 4. P(X = a \lor X = b) = pmf(a) + pmf(b)
# - Actual Output: ValueError: Undecidable if Eq(c, a) | Eq(c, b)
display(P(Or(Eq(X,a), Eq(X,b)))) # [FAILS]
# 5. P(A) = pmf(a) + pmf(b) + pmf(c)
# - Actual Output: ValueError: Undecidable if Eq(d, a) | Eq(d, b) | Eq(d, c)
display(P(A.as_relational(X))) # [FAILS]
I obtain expected output up to display(A.as_relational(X)):
Interpreting the failure message suggests that Sympy is unable to tell that the set members are distinct. Replacing the symbols with integers resolves the error and I get output similar to what I desire.
A = FiniteSet(1, 2, 3)
B = FiniteSet(2, 3, 4)
S = Union(A, B, FiniteSet(5, 6))
If I am not misunderstanding the error or the fundamental use of the library, is there a way to tell Sympy that a collection of symbols is entirely distinct? I have attempted to replace the symbols with Dummy instances without success, and I have also attempted to leverage the assumptions module without success:
facts = [Eq(a,b) if a is b else Not(Eq(a,b)) for a, b in itertools.product(S, S)]
with assuming(*facts):
I would like to avoid confusing mappings between integers and symbolic forms, as user error may not be apparent when the results are printed as latex. I am willing to adopt some burden in a workaround (e.g., as it would have been maintaining a collection of Dummy instances), but I have yet to find an acceptable workaround.
Interesting question. Maybe it can be done with with assuming(Ne(a,b), ...): context but I take a more pragmatic approach: replace symbols with cos(non-zero integer) which SymPy can easily distinguish as equal or not:
>>> reps = dict(zip(var('a:f'),(cos(i) for i in range(1,7))))
>>> ireps = {v:k for k,v in reps.items()}
>>> a,b,c,d,e,f = [reps[i] for i in var('a:f')]
Then remove your a, b = symbols... line and replace display(x) with display(x.subs(ireps) to get
('Sample Space', FiniteSet(a, b, c, d, e, f))
(pmf(a),)
(Eq(X, a) | Eq(X, b) | Eq(X, c),)
(pmf(a) + pmf(b),)
(I use cos(int) instead of int because I am not sure whether any computation would result in addition of two elements and I want to make sure they stay distinct.)
Another approach would be to define a constant class that derives from Symbol:
class con(Symbol):
def __hash__(self):
return id(self)
def __eq__(a,b):
if isinstance(b, con):
return a.name == b.name
_eval_Eq = __eq__
a,b,c,d,e,f=map(con,'abcdef')
display=lambda*x:print(x)
from sympy import Eq, Function, symbols
from sympy.logic import Or
from sympy.sets import FiniteSet, Union
from sympy.stats import FiniteRV, P
A = FiniteSet(a,b,c)
B = FiniteSet(b,c,d)
S = Union(A, B, FiniteSet(e,f))
pmfFunc = Function("pmf")
pmfDict = {v: pmfFunc(v) for v in S}
X = FiniteRV('X', pmfDict)
display("Sample Space", S)
display(P(Eq(X,a)))
display(A.as_relational(X))
display(P(Or(Eq(X,a), Eq(X,b))))
display(P(A.as_relational(X)))
gives
('Sample Space', FiniteSet(a, b, c, d, e, f))
(pmf(a),)
(Eq(X, a) | Eq(X, b) | Eq(X, c),)
(pmf(a) + pmf(b),)
(pmf(a) + pmf(b) + pmf(c),)

Dependently typed bounded range in Lean

Suppose I would like to create a bounded integer Z with bounds a b.
def zbound (x₁ x₂ : ℤ) :=
{ n : ℤ // x₁ ≤ n ∧ n ≤ x₂ }
Is this a reasonable representation of a bounded integer?
Now I would like to create a range of numbers from a to b.
def range : ∀(a b : ℤ), list (zbound a b)
| fro to := if h : fro < to
then ⟨fro, and.intro (le_refl _) (int.le_of_lt h)⟩
:: range (fro + 1) to
else []
I can get this to work with range : ℤ → ℤ → list ℤ, including the proof of termination using using_well_founded. However, I find it impractical in this form, because it doesn't carry a proof that every number within the range is zbound a b.
As such, I would like to get my dependent version. However, I run into the isue that range (fro + 1) to is, naturally, of type list (zbound (fro + 1) to). What I need is list (zbound fro to). How does one get around this issue? I tried approaching the problem by showing that if x is lower bounded by a, then it is also bounded by every number less than a, therefore keeping the bound of form zbound fro to (os this obviously bounds zbound (fro + 1) to). I have however no idea how to use this idea, or even if it makes sense to use it.
I am not sure this is an ideal solution, but it does work for me.
First we need a lemma to weaken the bounded range:
def range_weaken {a b : ℤ} : zbound (a + 1) b → zbound a b
| ⟨i, ⟨lbound, rbound⟩⟩ :=
⟨i, and.intro
(le_of_add_le_left _ 1 _ dec_trivial lbound)
rbound⟩
Then we can redefine range in terms of weakened ranges:
def range : ∀(a b : ℤ), list (zbound a b)
| fro to := if h : fro < to
then ⟨fro, and.intro (le_refl _) h⟩
:: list.map range_weaken (range (fro + 1) to)
else []
using_well_founded { ... }
Note: I couldn't find a lemma I was looking for, so I hand-proved the following:
def le_of_add_le_left (a b c : ℤ) : 0 ≤ b → a + b ≤ c → a ≤ c

Reversible predicates and Strings in SWI-Prolog

append/3 is a very powerful predicate. Suppose I want a predicate that works the same way but for SWI-Prolog's strings.
The easiest approach I see is to transform those strings into lists with string_codes/2, then apply append/3, then use string_codes/2 back. The big problem with this approach is that string_codes/2 does not work if both variables are not unified.
Here is an extremely ugly solution I came up with, which checks which strings are unified to apply string_codes/2 when needed:
append_strings(S1, S2, S3) :-
nonvar(S1),
nonvar(S2),!,
string_codes(S1, A),
string_codes(S2, B),
append(A,B,C),
string_codes(S3, C).
append_strings(S1, S2, S3) :-
nonvar(S1),
nonvar(S3),!,
string_codes(S1, A),
string_codes(S3, C),
append(A,B,C),
string_codes(S2, B).
append_strings(S1, S2, S3) :-
nonvar(S2),
nonvar(S3),!,
string_codes(S2, B),
string_codes(S3, C),
append(A,B,C),
string_codes(S1, A).
append_strings(S1, S2, S3) :-
nonvar(S3),
string_codes(S3, C),
append(A,B,C),
string_codes(S1, A),
string_codes(S2, B).
This yields the correct results for the following cases:
?- append_strings("test","auie","testauie").
true.
?- append_strings("test",A,"testauie").
A = "auie".
?- append_strings(A,"auie","testauie").
A = "test" ;
false.
?- append_strings(A,B,"testauie").
A = "",
B = "testauie" ;
A = "t",
B = "estauie" ;
A = "te",
B = "stauie" ;
A = "tes",
B = "tauie" ;
A = "test",
B = "auie" ;
A = "testa",
B = "uie" ;
A = "testau",
B = "ie" ;
A = "testaui",
B = "e" ;
A = "testauie",
B = "" ;
false.
Is there really no way to make things simpler than this? Suppose I want to make a whole bunch of predicates that work with strings just like they would with lists: I obviously don't want to have to write what I did for append/3 for all of them. But I also don't want to work with code strings because then I have no way of knowing whether I am manipulating a normal list or really a string.
Since the predicate is working on lists, it seems tempting to me to use DCGs. First let's observe that strings in Prolog are really lists of character codes:
?- X="test".
X = [116,101,115,116]
Of course this is not very readable, so let's see the characters themselves intead of their codes:
?- set_prolog_flag(double_quotes,chars).
yes
?- X="test".
X = [t,e,s,t]
That's better. Thinking about the relation the predicate should describe, I opt for a descriptive name like list_list_appended/3. This predicate has one goal: a dcg-rule, let's call it list_list//2, that uses another dcg, let's call it list//2, to actually write the lists:
list_list_appended(L1,L2,L3) :-
phrase(list_list(L1,L2),L3). % L3 is L1+L2
list([]) --> % if the list is empty ...
[]. % ... there's nothing in the list
list([X|Xs]) --> % if there's a head element ...
[X], % ... it's in the list
list(Xs). % the tail is also a list
list_list(L1,L2) --> % the list consists of ...
list(L1), % ... L1 followed by ...
list(L2). % L2
Your example queries:
?- list_list_appended("test","auie","testauie").
yes
?- list_list_appended(L1,"auie","testauie").
L1 = [t,e,s,t] ? ;
no
?- list_list_appended("test",L2,"testauie").
L2 = [a,u,i,e] ? ;
no
?- list_list_appended("test","auie",L3).
L3 = [t,e,s,t,a,u,i,e]
?- list_list_appended(L1,L2,"testauie").
L1 = [],
L2 = [t,e,s,t,a,u,i,e] ? ;
L1 = [t],
L2 = [e,s,t,a,u,i,e] ? ;
L1 = [t,e],
L2 = [s,t,a,u,i,e] ? ;
L1 = [t,e,s],
L2 = [t,a,u,i,e] ? ;
L1 = [t,e,s,t],
L2 = [a,u,i,e] ? ;
L1 = [t,e,s,t,a],
L2 = [u,i,e] ? ;
L1 = [t,e,s,t,a,u],
L2 = [i,e] ? ;
L1 = [t,e,s,t,a,u,i],
L2 = [e] ? ;
L1 = [t,e,s,t,a,u,i,e],
L2 = [] ? ;
no
As a SWI user you could also use this library in combination with set_prolog_flag(double_quotes,chars). to get the output in desired form. Refer to this answer for details.
Just use string_concat/3. Like ISO atom_concat/3, it can be used in many modes, including (-,-,+).
This is a more compact definition:
append_strings(S1, S2, S3):-
append_strings1(S1, L1, [1]-[], N1),
append_strings1(S2, L2, [1|N1]-N1, N2),
append_strings1(S3, L3, [1,1|N2]-N2, N3),
(N3\=[_,_|_] ->instantiation_error(append_strings/3); true),
append(L1, L2, L3),
(ground(S1)->true;string_codes(S1, L1)),
(ground(S2)->true;string_codes(S2, L2)),
(ground(S3)->true;string_codes(S3, L3)).
append_strings1(S, L, G-NG, N):-
(ground(S) -> (string_codes(S, L), N=G) ; N=NG).
It checks whether each argument is ground and tries to convert to codes, then checks if either the third argument is ground or the other two are, and throws an instantiation error if conditions are not met.
After the append it converts back to string arguments which where not ground.
there has been a similar question some time ago, I will show my proposal, revised
:- meta_predicate when_(0).
when_(P) :-
strip_module(P,_,Q), Q =.. [_|As],
or_list(As, Exp), % hurry debugging :-) display(Exp),
when(Exp, P).
or_list([A], ground(A)) :- !.
or_list([A|As], (ground(A);Exp)) :- or_list(As, Exp).
append_strings(S1, S2, S3) :-
maplist(when_, [string_codes(S1, A), string_codes(S2, B), append(A,B,C), string_codes(S3, C)]).
If you are interested, I can add an operator to hide the syntax details, to get something like
append_strings(S1, S2, S3) -:-
string_codes(S1, A), string_codes(S2, B), append(A,B,C), string_codes(S3, C).

How to minimize a string's length by iteratively removing all occurrences of some specified words from the string

This question appeared in a programming contest and we still have no idea how to solve it.
Question:
Given a string S and a list of strings L, we want to keep removing all occurences of substrings that may be in L. And we have to minimize the length of the final string formed. Also note that removal of a string may initiate more removals.
For example,
S=ccdedefcde, L={cde}
then answer = 1. Because we can reduce S by ccdedefcde -> cdefcde -> fcde -> f.
S=aabaab, L={aa, bb} then answer = 0 as reduction can be carried out by aabaab -> aabb -> aa -> ‘Empty String’
S=acmmcacamapapc, L={mca, pa} then answer=6 as reduction can be carried out by acmmcacamapapc-> acmcamapapc -> acmapapc -> acmapc.
The maximum length of S can be 50 and the maximum length of list L can be 50.
My approach is a basic recursive traversal in which I return the minimum length that I can get by removing different sub-strings. Unfortunately this recursive approach will time out in the worst case input as we have 50 options at each step and the recursion depth is 50.
Please suggest an efficient algorithm that may solve this problem.
Here's a polynomial-time algorithm that yields optimal results. Since it's convenient for me, I'm going to use the polynomial-time CYK algorithm as a subroutine, specifically the extension that computes a minimum-weight parse of a string according to a context-free grammar with weighted productions.
Now we just have to formalize this problem with a context-free grammar. The start symbol is A (usually S, but that's taken already), with the following productions.
A -> N (weight 0)
A -> A C N (weight 0)
I'll explain N shortly. If N and C were terminals, then A would accept the regular language N (C N)*. The nonterminal C matches a single terminal (character).
C -> a (weight 1)
C -> b (weight 1)
C -> c (weight 1)
...
The nonterminal N matches strings that are nullable, that is, strings that can be reduced to the empty string by deleting strings in L. The base case is obvious.
N -> (weight 0)
We also have a production for each element of L. When L = {mca, pa}, for example, we have the following productions.
N -> N m N c N a N (weight 0)
N -> N p N a N (weight 0)
I hope that it's clear how to construct the one-to-one correspondence between iterative removals and parses, where the parse weight is equal to the length of the residual string.
Note: this is not an optimal solution, since it doesn't work for the example S=ABAABABAA, L={ABA}
Algorithm
RECURSIVE_FUNCTION ( STRING STR, STRING PATTERN) :
1. STRING LEFT = STR.SUBSTR (0, STR.FIND(PATTERN))
2. STRING RIGHT = STR.SUBSTR(STR.FIND(PATTERN), STR.LENGTH)
3. IF (RIGHT is empty) THEN RETURN LEFT
4. STRING FIN = RECUR(LEFT) + RECUR(RIGHT)
5. RETURN RECUR(FIN)
function SUBSTR(A,B) will return substring of the string, from index A inclusive to index B exclusive
Operation A + B is concatenation of string A and B
function RECUR(A) call the same function, aka recurrence
Example: ccdedefcde
First it will branch down with RECUR(LEFT) + RECUR(RIGHT):
c[cde]defcde
/ \
c def[cde]
/
def
Then it will RECUR(FIN) on merge:
cdef*
/ \
c def
/
def
* will RECUR to do the following before that MERGE completes:
[cde]f
\
f
and finally the ROOT call returns f

Resources