From: Mild Shock
Newsgroups: sci.math
Subject: Credits go to Rolf Pfeiffer (Re: Progress via library(linear))
Date: Sun, 16 Mar 2025 23:17:35 +0100

What made me do the little prototype? Try this one, it has
a little Java code. But it is a little ancient technology,
using the sigmoid activation function. And it seems to me
it uses some graph data structure:

> Neural Networks
> Rolf Pfeiffer et al. - 2012
> https://www.ifi.uzh.ch/dam/jcr:00000000-7f84-9c3b-ffff-fffffb34b58a/NN20120315.pdf

I guess it corresponds to this here, which is a SWI-Prolog
and C hybrid, when using FANN_SIGMOID:

> FANN - Fast Artificial Neural Network
> Package for SWI-Prolog - 2018
> https://www.swi-prolog.org/pack/list?p=plfann

Translating the Java code from the Pfeiffer paper into Prolog,
expressed as linear algebra with vectors and matrices, I now
have a little piece of pure Prolog code that also runs in the
browser and can already learn an AND. It uses the ReLU
activation function, i.e. no longer the FANN_SIGMOID
activation function. I simulated the bias by an extra input
neuron which is always 1, because I was too lazy to have a
bias in the model. Sample output:

A -- 0.99 ---\
              \
B -- 0.99 -----+-- ReLU -->
              /
1 -- -0.98 --/

It can also learn an XOR.

Libraries such as PyTorch cooperate with optimizer libraries
that provide a variety of gradient search methods. One needs
to study how these libraries are architected so that they
provide plug and play. Maybe one can bring the same
architecture to Prolog:

> A Gentle Introduction to torch.autograd
>
> Next, we load an optimizer, in this case SGD with a
> learning rate of 0.01 and momentum of 0.9. We register all
> the parameters of the model in the optimizer.
>
> optim = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

Mild Shock wrote:
>
> OK, a little progress report here. I currently have a
> library(linear) in the works which is only a few lines
> of code, but it provides vectors and matrices. One can
> use the library to define matrix exponentiation:
>
> matexp(M, 1, M) :- !.
> matexp(M, N, R) :- N mod 2 =:= 0, !,
>    I is N // 2,
>    matexp(M, I, H),
>    matmul(H, H, R).
> matexp(M, N, R) :-
>    I is N-1,
>    matexp(M, I, H),
>    matmul(H, M, R).
>
> And then do fancy stuff like answering the question
> what are the last 8 digits of fibonacci(1000000):
>
> ?- time((fib(1000000, _X), Y is _X mod 10^8)).
> % Zeit 28 ms, GC 0 ms, Lips 88857, Uhr 16.03.2025 22:48
> Y = 42546875
>
> The 28 ms execution time is not bad, since modulo was not
> integrated into matexp/3, so it computes the full
> fibonacci(1000000) before taking the modulo. Not sure whether
> JavaScript bigint is faster or slower than GMP?
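As an aside on the fibonacci example quoted above: the fib/2
predicate itself is not shown in the quote, so here is only a
minimal sketch of how it could be defined on top of matexp/3,
assuming matmul/3 from library(linear) multiplies matrices given
as lists of row lists. It relies on the identity
[[1,1],[1,0]]^N = [[fib(N+1),fib(N)],[fib(N),fib(N-1)]]:

% fib(+N, -F): N-th Fibonacci number via 2x2 matrix powers,
% N = 0 handled separately since matexp/3 bottoms out at N = 1.
fib(0, 0) :- !.
fib(N, F) :-
   matexp([[1,1],[1,0]], N, [[_,F],[F,_]]).

For example ?- fib(10, F). should give F = 55, and
?- fib(1000000, _X), Y is _X mod 10^8. reproduces the query
from the quote.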
> So what can we do with library(linear) besides implementing
> eval/3 and back/3? We can finally update a neural network
> and do this iteratively. Using a very simple random pick
> to choose some training data sample:
>
> update([V], _, [V]) :- !.
> update([V,M|L], [_,M3|R], [V,M4|S]) :-
>    maplist(maplist(compose(add,mul(0.1))), M3, M, M4),
>    update(L, R, S).
>
> iter(0, _, N, N) :- !.
> iter(I, Z, N, M) :-
>    random(R), K is floor(R*4)+1,
>    call_nth(data(Z, X, Y), K),
>    eval(N, X, U),
>    back(U, Y, V),
>    update(U, V, W),
>    J is I-1,
>    iter(J, Z, W, M).
>
> Disclaimer: This is only a proof of concept. It most likely
> doesn't have all the finesse of Python torch.autograd. Also
> it uses a very simple update of the weights via μ Δwij with
> μ = 0.1. But you can already use it to learn an AND or to
> learn an XOR.
>
> Mild Shock wrote:
>> new Prolog system. I thought my new Prolog system
>> has only monomorphic caches, I will never be able to
>> replicate what I did for my old Prolog system with
>> arity polymorphic caches. This changed when I had
>> the idea to dynamically add a cache for the duration
>> of a higher order loop such as maplist/n, foldl/n etc…
>>
>> So this is the new implementation of maplist/3:
>>
>> % maplist(+Closure, +List, -List)
>> maplist(C, L, R) :-
>>     sys_callable_cacheable(C, D),
>>     sys_maplist(L, D, R).
>>
>> % sys_maplist(+List, +Closure, -List)
>> sys_maplist([], _, []).
>> sys_maplist([X|L], C, [Y|R]) :-
>>     call(C, X, Y),
>>     sys_maplist(L, C, R).
>>
>> It is similar to the SWI-Prolog implementation in that
>> it reorders the arguments for better first argument
>> indexing. But the new thing is sys_callable_cacheable/2,
>> which prepares the closure to be more efficiently
>> called. The invocation of the closure is already
>> quite fast since call/3 is implemented natively,
>> but the cache adds a bit more speed. Here are some
>> measurements that I did:
>>
>> /* SWI-Prolog 9.3.20 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>> % 2,003,000 inferences, 0.078 CPU in 0.094 seconds
>>
>> /* Scryer Prolog 0.9.4-350 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>>     % CPU time: 0.318s, 3_007_105 inferences
>>
>> /* Dogelog Player 1.3.1 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>     maplist(succ,L,_),fail; true)), fail.
>> % Zeit 342 ms, GC 0 ms, Lips 11713646, Uhr 10.03.2025 09:18
>>
>> /* Trealla Prolog 2.64.6-2 */
>> ?- findall(X,between(1,1000,X),L), time((between(1,1000,_),
>>      maplist(succ,L,_),fail; true)), fail.
>> % Time elapsed 1.694s, 15004003 Inferences, 8.855 MLips
>>
>> Not surprisingly, SWI-Prolog is the fastest. What was
>> a little surprise is that Scryer Prolog can do it quite
>> fast, possibly because they use maplist/n heavily all
>> over the place and came up with things like '$fast_call'
>> etc. in their call/n implementation. Trealla Prolog is
>> a little bit disappointing at the moment.
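Coming back to the torch.optim.SGD point further up: a plug and
play optimizer in this style would essentially just swap the
fixed compose(add,mul(0.1)) step in update/3 for something
stateful. Here is a minimal sketch of one SGD step with momentum
on a single weight matrix, using only standard maplist/4 and
is/2. The names sgd_step/7, cell_velocity/4 and cell_weight/4
are made up for illustration and are not part of library(linear),
and the sign convention depends on what back/3 actually returns:

% sgd_step(+LR, +Mom, +W, +G, +V0, -W1, -V1):
% V1 = Mom*V0 + G and W1 = W - LR*V1, elementwise over
% matrices given as lists of row lists of numbers.
sgd_step(LR, Mom, W, G, V0, W1, V1) :-
   maplist(maplist(cell_velocity(Mom)), V0, G, V1),
   maplist(maplist(cell_weight(LR)), W, V1, W1).

cell_velocity(Mom, V, G, V1) :- V1 is Mom*V + G.
cell_weight(LR, W, V, W1) :- W1 is W - LR*V.

For example ?- sgd_step(0.01, 0.9, [[0.5,0.5]], [[0.1,-0.2]],
[[0,0]], W1, V1). should give roughly W1 = [[0.499, 0.502]]
and V1 = [[0.1, -0.2]].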