From: Mild Shock
Newsgroups: comp.lang.prolog
Subject: Re: Higher Order Logic Programming and Autograd
Date: Tue, 11 Mar 2025 13:14:54 +0100
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:128.0) Gecko/20100101 Firefox/128.0 SeaMonkey/2.53.20

But where is Autograd, automatic derivation from some symbolic input?

In general you can objectify neural networks, which I already did with
the Prolog list, and routines such as back/3 are pure Prolog. Basically
you could symbolically derive expit (the activation), mulderiv (the
product with the derivative of the activation) and mattran (the
Jacobian without activation) from a DAG of vector functions.

In a linear neural network, the Jacobian without activation is the same
as the weights, and expit has a simple derivative that is based on the
expit result itself, which is already stored as the activation:

/* g(x) = logistic function */
expit(X, Y) :- Y is 1/(1+exp(-X)).

/* g'(x) = g(x)*(1-g(x)) */
mulderiv(X, Y, Z) :- Z is X*Y*(1-Y).

A small symbolic sketch of this idea follows at the end, after the
quoted posts.

See also:

A Gentle Introduction to torch.autograd
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

Mild Shock wrote:
> What can we do with these new toys? We
> can implement vector operations and matrix
> operations, and then apply them, for example,
> to layered neural networks by
> representing them as:
>
> /**
>  * Network is represented as [N0,M1,N1,...,Mn,Nn]
>  * - Where N0 is the input neuron vector
>  * - Where N1 .. Nn-1 are the hidden neuron vectors
>  * - Where Nn is the output neuron vector
>  * - Where M1 .. Mn are the transition weight matrices
>  */
>
> ?- mknet([3,2], X).
> X = [''(-1, 1, 1), ''(''(1, 1, -1), ''(1, 1, -1)), ''(-1, 1)].
>
> The model evaluation at a data point
> is straightforward:
>
> eval([V], [V]) :- !.
> eval([V,M,_|L], [V,M|R]) :- !,
>    matmul(M, V, H),
>    vecact(H, expit, J),
>    eval([J|L], R).
>
> The backward calculation of deltas
> is straightforward:
>
> back([V], U, [D]) :- !,
>    vecact(U, V, sub, E),
>    vecact(E, V, mulderiv, D).
> back([V,M,W|L], U, [D2,M,D|R]) :-
>    back([W|L], U, [D|R]),
>    mattran(M, M2),
>    matmul(M2, D, E),
>    vecact(E, V, mulderiv, D2).
>
> You can use this to compute weight changes
> and drive a gradient algorithm.
>
> Mild Shock wrote:
>> Somehow I shied away from implementing call/n for
>> my new Prolog system. I thought my new Prolog system
>> has only monomorphic caches, so I would never be able to
>> replicate what I did for my old Prolog system with
>> arity-polymorphic caches. This changed when I had
>> the idea to dynamically add a cache for the duration
>> of a higher-order loop such as maplist/n, foldl/n etc.
>>
>> So this is the new implementation of maplist/3:
>>
>> % maplist(+Closure, +List, -List)
>> maplist(C, L, R) :-
>>     sys_callable_cacheable(C, D),
>>     sys_maplist(L, D, R).
>>
>> % sys_maplist(+List, +Closure, -List)
>> sys_maplist([], _, []).
>> sys_maplist([X|L], C, [Y|R]) :-
>>     call(C, X, Y),
>>     sys_maplist(L, C, R).
>>
>> It is similar to the SWI-Prolog implementation in that
>> it reorders the arguments for better first-argument
>> indexing. But the new thing is sys_callable_cacheable/2,
>> which prepares the closure to be called more
>> efficiently. The invocation of the closure is already
>> quite fast since call/3 is implemented natively,
>> but the cache adds a bit more speed. Here are some
>> measurements that I did:
>>
>> /* SWI-Prolog 9.3.20 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>> % 2,003,000 inferences, 0.078 CPU in 0.094 seconds
>>
>> /* Scryer Prolog 0.9.4-350 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>> % CPU time: 0.318s, 3_007_105 inferences
>>
>> /* Dogelog Player 1.3.1 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>> % Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18
>>
>> /* Trealla Prolog 2.64.6-2 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>      maplist(succ,L,_),fail; true)), fail.
>> % Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips
>>
>> Not surprisingly, SWI-Prolog is fastest. What was
>> a little surprising is that Scryer Prolog can do it quite
>> fast, possibly because they heavily use maplist/n all
>> over the place and came up with things like '$fast_call'
>> etc. in their call/n implementation. Trealla Prolog is
>> a little bit disappointing at the moment.
>>
>
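Here is the small symbolic sketch announced above. deriv/3 and the
expression syntax are just made-up names for illustration, not part of
any of the systems mentioned; the point is only that the derivative
which mulderiv/3 hard-codes could be produced mechanically from the
expression behind expit/2:

/* deriv(+Expr, +Var, -DExpr): symbolic derivative of Expr w.r.t. Var */
deriv(X, X, 1) :- !.
deriv(C, X, 0) :- atomic(C), C \== X, !.
deriv(A+B, X, DA+DB) :- deriv(A, X, DA), deriv(B, X, DB).
deriv(A-B, X, DA-DB) :- deriv(A, X, DA), deriv(B, X, DB).
deriv(-A, X, -DA) :- deriv(A, X, DA).
deriv(A*B, X, DA*B+A*DB) :- deriv(A, X, DA), deriv(B, X, DB).
deriv(A/B, X, (DA*B-A*DB)/(B*B)) :- deriv(A, X, DA), deriv(B, X, DB).
deriv(exp(A), X, exp(A)*DA) :- deriv(A, X, DA).

/* ?- deriv(1/(1+exp(-x)), x, D).                                     */
/* D is the unsimplified quotient-rule term; algebraically it equals  */
/* expit(x)*(1-expit(x)), which is why mulderiv/3 can reuse the       */
/* stored activation instead of recomputing exp(-X).                  */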
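And to make the last remark of the first quoted post concrete: the
deltas that back/3 returns drive the usual update DeltaM = Eta * D (x) A,
the outer product of a layer's delta D with the activation A in front of
the weights M. The helpers below (outer/3, scale/3) are again only a
sketch over plain lists; the quoted code stores vectors and matrices as
''(..) compounds, so a real implementation would work on those instead:

/* scale(+Vec, +Factor, -Scaled): multiply every component by Factor */
scale([], _, []).
scale([X|Xs], F, [Y|Ys]) :- Y is F*X, scale(Xs, F, Ys).

/* outer(+Delta, +Act, -Grad): Grad[i][j] = Delta[i]*Act[j], one row  */
/* per component of Delta, matching the matmul(M, V, H) convention    */
/* where M maps the layer in front of it to the layer behind it       */
outer([], _, []).
outer([D|Ds], A, [Row|Rows]) :- scale(A, D, Row), outer(Ds, A, Rows).

The new weights are then M[i][j] + Eta*Grad[i][j], computed for each
weight matrix from the activation in front of it (taken from the eval/2
result) and the delta behind it (taken from the back/3 result).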
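Finally, for comparison with the quoted sys_maplist/3: the textbook
argument order puts the closure first, roughly like this (plain_maplist/3
is just a made-up name):

plain_maplist(_, [], []).
plain_maplist(C, [X|Xs], [Y|Ys]) :-
    call(C, X, Y),
    plain_maplist(C, Xs, Ys).

On a system that only indexes on the first argument this version cannot
discriminate the []/[_|_] cases and may leave a choice point on every
step, which is exactly what putting the list first, as in the quoted
sys_maplist/3, avoids.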