Path: ...!news.roellig-ltd.de!open-news-network.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail From: Mild Shock Newsgroups: sci.logic Subject: Re: The solver does not terminate Date: Sat, 7 Dec 2024 23:02:40 +0100 Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit Injection-Date: Sat, 7 Dec 2024 22:02:39 -0000 (UTC) Injection-Info: solani.org; logging-data="1008041"; mail-complaints-to="abuse@news.solani.org" User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.19 Cancel-Lock: sha1:9YSQ7zzncJv1+Q2hZob9t7ORSKs= In-Reply-To: X-User-ID: eJwFwYEBwCAIA7CXqEBx5yjU/09Ykk6wK5iMfPkaK8pYYXQriQYr1NwzHuEtn/4SddD7XAGxZNtmXVL63g8m9BTp Bytes: 6944 Lines: 168 Corr.: Forgot to paste nasty2/2: nasty2(0, x0) :- !. nasty2(N, (Y<->X)) :- number_codes(N, L), atom_codes(X, [0'x|L]), M is N-1, nasty2(M, Y). Mild Shock schrieb: > Ok, what helps here is reordering the rules: > > prove(L) :- select(-A,L,R), member(A,R), !. > prove(L) :- select((A->B),L,R), !, prove([-A,B|R]). > prove(L) :- select((A<->B),L,R), !, prove([-A,B|R]), prove([-B,A|R]). > prove(L) :- select(-(A->B),L,R), !, prove([A|R]), prove([-B|R]). > prove(L) :- select(-(A<->B),L,R), !, prove([-A,-B|R]), prove([B,A|R]). > > The above runs fast in the previous test cases, since > Axiom does not need to work with atoms anymore, and > > syntactically equivalent formulas are found by Prolog > unification. But we can make the test case more difficult, > > by swapping sides: > > % nasty1(+Integer, -Expr) > nasty1(0, x0) :- !. > nasty1(N, (X<->Y)) :- >    number_codes(N, L), >    atom_codes(X, [0'x|L]), >    M is N-1, >    nasty1(M, Y). > > % nasty1(+Integer, -Expr) > nasty1(0, x0) :- !. > nasty1(N, (X<->Y)) :- >    number_codes(N, L), >    atom_codes(X, [0'x|L]), >    M is N-1, >    nasty1(M, Y). > > Now the picture is again not very favorable, despite > we did an early non atomic Axiom test in prove/1: > > /* success timing */ > ?- between(5,10,N), nasty1(N,X), nasty2(N,Y), >    time(prove([X<->Y])), fail; true. > % 38,207 inferences, 0.000 CPU in 0.002 seconds > % 111,919 inferences, 0.000 CPU in 0.006 seconds > % 314,127 inferences, 0.016 CPU in 0.017 seconds > % 852,143 inferences, 0.031 CPU in 0.045 seconds > % 2,248,175 inferences, 0.109 CPU in 0.117 seconds > % 5,795,311 inferences, 0.281 CPU in 0.299 seconds > true. > > /* failure timing */ > ?- between(5,10,N), nasty1(N,X), nasty2(N,Y), >    time(\+ prove([(X<->(q<->Y))])), fail; true. > % 23,651 inferences, 0.000 CPU in 0.002 seconds > % 66,537 inferences, 0.000 CPU in 0.004 seconds > % 182,047 inferences, 0.000 CPU in 0.010 seconds > % 485,421 inferences, 0.000 CPU in 0.025 seconds > % 1,264,835 inferences, 0.047 CPU in 0.065 seconds > % 3,229,393 inferences, 0.156 CPU in 0.165 seconds > true. > > The input size grows linearly but the execution > time grows exponential. > > Mild Shock schrieb: >> Hi, >> >> One of the pelettier test cases was to test >> how long it takes to terminate with "false", >> the other was to test how long it takes to >> terminate with "true". The Prolog time/1 >> >> predicate can be used to measure both, a >> "false" outcome and a "true" outcome. >> Wangs algorithm can be easily extended to >> biconditional: >> >> :- op(1100, xfx, <->). >> >> prove(L) :- select((A->B),L,R), !, prove([-A,B|R]). >> prove(L) :- select((A<->B),L,R), !, prove([-A,B|R]), prove([-B,A|R]). >> prove(L) :- select(-(A->B),L,R), !, prove([A|R]), prove([-B|R]). >> prove(L) :- select(-(A<->B),L,R), !, prove([-A,-B|R]), prove([B,A|R]). >> prove(L) :- select(-A,L,R), member(A,R), !. >> >> Usually Theorem Provers choke on these >> test cases, unless they work with some >> XOR representation inspired by Boolean Rings. >> But provers insprired by Boolean Algebra explode: >> >> /* success timing */ >> ?- between(5,10,N), nasty_bicond(N,X), >>     time(prove([X<->X])), fail; true. >> % 211,301 inferences, 0.000 CPU in 0.013 seconds >> % 971,051 inferences, 0.047 CPU in 0.060 seconds >> % 4,386,029 inferences, 0.266 CPU in 0.270 seconds >> % 19,547,363 inferences, 1.219 CPU in 1.209 seconds >> % 86,183,093 inferences, 5.141 CPU in 5.355 seconds >> % 376,634,555 inferences, 22.969 CPU in 23.355 seconds >> true. >> >> /* failure timing */ >> ?- between(5,10,N), nasty_bicond(N,X), >>     M is N+1, nasty_bicond(M,Y), >>     time(\+ prove([X<->Y])), fail; true. >> % 113,383 inferences, 0.000 CPU in 0.008 seconds >> % 514,955 inferences, 0.031 CPU in 0.032 seconds >> % 2,308,703 inferences, 0.141 CPU in 0.143 seconds >> % 10,233,763 inferences, 0.625 CPU in 0.632 seconds >> % 44,928,359 inferences, 2.781 CPU in 2.773 seconds >> % 195,659,867 inferences, 12.078 CPU in 12.142 seconds >> true. >> >> The pelletier test case generator: >> >> nasty_bicond(0, x0) :- !. >> nasty_bicond(N, (X<->Y)) :- >>     number_codes(N, L), >>     atom_codes(X, [0'x|L]), >>     M is N-1, >>     nasty_bicond(M, Y). >> >> What is a Boolean Ring: >> https://en.wikipedia.org/wiki/Boolean_ring >> >> What is a Boolean Algebra: >> https://en.wikipedia.org/wiki/Boolean_algebra_%28structure%29 >> >> Julio Di Egidio schrieb: >>> On 05/12/2024 22:26, Julio Di Egidio wrote: >>>> On 02/12/2024 12:37, Julio Di Egidio wrote: >>>>> On 02/12/2024 09:49, Mild Shock wrote: >>>>>> Could it be that your procedure doesn't >>>>>> terminate always? Why is this not finished >>>>>> after more than 1 minute? >>>>> >>>>> The solver *is* very slow at the moment, and you are trying to >>>>> prove a too complicated statement: >>>>> >>>>> ``` >>>>> ?- unfold(tnt((p<->(q<->r))<->(p<->q)), X). >>>>> ``` >>> >>> That 'pelletier' is actually false.  I guess you have miscopied it, >>> this one is true (classically): >>> >>>    `((p <-> (q <-> r)) <-> (p <-> q)) <-> r`. >>> >>> Meanwhile, I have narrowed the problem of non-termination down to the >>> imp-elim rule, and already have a tentative fix in place. >>> >>> Though I have added that 'pelletier' to my neg cases and the test is >>> now failing: and I don't know exactly why (the trace is again >>> overwhelming), but I guess either I am trading termination for >>> completeness, or this is finally a failure of TNT. >>> >>> I need a (more) principled approach... >>> >>> -Julio >>> >> >