Path: ...!Xl.tags.giganews.com!local-2.nntp.ord.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Sun, 16 Feb 2025 18:54:43 +0000
Date: Sun, 16 Feb 2025 12:54:44 -0600
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: Experimental aggregate_by/4 was dismissed (Was: India & France
 had their AI Bikini Moment)
Newsgroups: comp.lang.prolog
References: <vodduk$1vor$1@solani.org> <vodmsd$25ad$1@solani.org>
 <voshsq$a340$1@solani.org> <voshvr$a340$2@solani.org>
Content-Language: en-US
From: olcott <NoOne@NoWhere.com>
In-Reply-To: <voshvr$a340$2@solani.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Antivirus: Norton (VPS 250216-4, 2/16/2025), Outbound message
X-Antivirus-Status: Clean
Message-ID: <q5ednVGUN9_pqS_6nZ2dnZfqlJ-dnZ2d@giganews.com>
Lines: 165
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-3Zqzu3i3A+1i56FguN8/b+41jXxrsWQCBGy0ynfJH/3tsL5IXTNc7pNWkuc+1qdUpiiQJy9uspFuth2!5Mt6XjxvwRmGaF/hL+VCu423FXRMCv3Yf2n+JyhS/ordhtZVbcRomDghPIUB6LO4Mhve8uJkX6UD
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
Bytes: 6224

On 2/16/2025 5:25 AM, Mild Shock wrote:
> I was exploring group_by/4 and a new
> aggregate_by/4 for some machine learning statistics.
> But the slowdown is not that severe if the extra
> parameter _H is ground; the cost is then a smaller
> factor. So I changed my mind in favor of the more
> declarative aggregate/3. Too many new predicates
> aren't healthy, so I dismissed the idea of supporting
> distinct/2, group_by/4 and a new aggregate_by/4.
> 
> But somehow I fell in love with the idea of a new
> firstof/2 predicate instead of distinct/2. It could
> be bootstrapped as follows:
> ```
> firstof(X, Q) :-
>     bagof(X, Q, L),
>     L = [X|_].
> ```
> Except that it can be implemented more eagerly, like
> distinct/2, which fits some other predicates from
> library(sequence):
> ```
> p(1,a).
> p(1,b).
> p(2,c).
> p(2,d).
> p(2,e).
> 
> ?- firstof(Y,p(X,Y)).
> Y = a, X = 1;
> Y = c, X = 2;
> fail.
> ```
> 
> Mild Shock wrote:
>> Just noticed that group_by/4 calculates variables
>> and then delegates to bagof/3. But the latter predicate
>> also calculates variables, so I suspect quite an overhead:
>>
>> /* SWI-Prolog 9.3.19 */
>> group_by(By, Template, Goal, Bag) :-
>>      ordered_term_variables(Goal, GVars),
>>      ordered_term_variables(By+Template, UVars),
>>      ord_subtract(GVars, UVars, ExVars),
>>      bagof(Template, ExVars^Goal, Bag).
>>
>> I went with another solution. First I provided a variant
>> of aggregate/3 by the name aggregate_by/4, where one can
>> offload the internal term_variables/2 calculation.
>> Then I used this bootstrapping:
>>
>> /* Dogelog Player 1.3.0 */
>> group_by(Witness, Template, Goal, List) :-
>>     aggregate_by(Witness, bag(Template), Goal, List).
>>
>> Here is some testing:
>>
>> /* SWI-Prolog 9.3.19 */
>> ?- length(_H,4000), time((between(1,2000,_),
>>          group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L),
>>          fail; true)).
>> % 1,153,998 inferences, 0.562 CPU in 0.568 seconds (99% CPU, 2051552 Lips)
>> true.
>> ?- length(_H,8000), time((between(1,2000,_),
>>          group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L),
>>          fail; true)).
>> % 1,153,998 inferences, 1.047 CPU in 1.060 seconds (99% CPU, 1102326 Lips)
>> true.
>>
>> /* Dogelog Player 1.3.0 */
>> ?- length(_H,4000), time((between(1,2000,_),
>>          group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L),
>>          fail; true)).
>> % Zeit 399 ms, GC 0 ms, Lips 16987636, Uhr 10.02.2025 10:49
>> true.
>> ?- length(_H,8000), time((between(1,2000,_),
>>          group_by(X,Y,(nonvar(_H),between(1,10,Y),between(1,10,X)),L),
>>          fail; true)).
>> % Zeit 400 ms, GC 1 ms, Lips 16945167, Uhr 10.02.2025 10:50
>> true.
>>
>> The old version suffers from a term_variables/2
>> dependency, whereas the new version is immune to
>> the size of the given goal, since any internal
>> term_variables/2 call has been offloaded.
>>
>> I couldn’t name aggregate_by/4 as aggregate/4, since
>> the latter already exists in SWI-Prolog and SICStus Prolog
>> and has different semantics; it is not the analog of
>> distinct/2, where one can specify witnesses.
>>
>> Mild Shock wrote:
>>> Hi,
>>>
>>> India & France had their AI Bikini Moment.
>>> Fascinating behavior:
>>>
>>> Macron Says He And PM Modi Will Push
>>> https://www.youtube.com/watch?v=LwCK8yAnlkA
>>>
>>> But don't be fooled, things are possibly
>>> more connected:
>>>
>>> Synthesia: France's 109-billion-euro AI investment
>>> https://www.youtube.com/watch?v=_uyo4RG0Q6I
>>>
>>> Bye
>>>
>>>
>>> Mild Shock wrote:
>>>> Hi,
>>>>
>>>> Suddenly I got an allergy to naming a predicate
>>>> distinct/2. It is not so obvious that distinct/1 and
>>>> distinct/2 are related. There is no constant C such that:
>>>>
>>>> distinct(X) :- distinct(C, X).
>>>>
>>>> Just joking, but for some consistency with the introduction
>>>> of group_by/4 and aggregate_by/4 I went for the
>>>> name first_by/2. The name is more intuitive:
>>>>
>>>> ?- [user].
>>>> p(1,a).
>>>> p(1,b).
>>>> p(2,c).
>>>> p(2,d).
>>>> p(2,e).
>>>> ^Z
>>>> true.
>>>>
>>>> Now some queries:
>>>>
>>>> ?- p(X,Y), write(X-Y), nl, fail; true.
>>>> 1-a
>>>> 1-b
>>>> 2-c
>>>> 2-d
>>>> 2-e
>>>> true.
>>>>
>>>> ?- first_by(X, p(X,Y)), write(X-Y), nl, fail; true.
>>>> 1-a
>>>> 2-c
>>>> true.
>>>>
>>>> Cool! The name is also used here with the same semantics:
>>>>
>>>> https://deephaven.io/core/docs/reference/table-operations/group-and-aggregate/firstBy/
>>>>
>>>
>>
> 
test
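
For anyone following along, the first_by/2 behavior quoted above can be
reproduced with a small portable sketch. This is not Dogelog Player's
implementation: it is a non-eager approximation that collects all
solutions with findall/3 before returning any, and it assumes the
witness is ground once the goal has succeeded. The helper names
first_per_key/3 and seen_key/2 are made up for illustration.

```prolog
% Sample facts from the quoted post.
p(1, a).
p(1, b).
p(2, c).
p(2, d).
p(2, e).

% first_by(Witness, Goal): enumerate the first solution of Goal for each
% distinct Witness binding, in the order the solutions are found.
% Assumption: Witness is ground after Goal succeeds.
first_by(Witness, Goal) :-
    findall(Witness-Goal, Goal, Pairs),
    first_per_key(Pairs, [], Firsts),
    member(Witness-Goal, Firsts).

% first_per_key(+Pairs, +SeenKeys, -Firsts): keep only the first pair
% for each key, preserving the original order.
first_per_key([], _, []).
first_per_key([K-G|Rest], Seen, Firsts) :-
    (   seen_key(K, Seen)
    ->  first_per_key(Rest, Seen, Firsts)
    ;   Firsts = [K-G|Firsts0],
        first_per_key(Rest, [K|Seen], Firsts0)
    ).

% seen_key(+Key, +Keys): true if an identical key is already in Keys.
seen_key(K, [S|Ss]) :-
    (   K == S
    ->  true
    ;   seen_key(K, Ss)
    ).
```

With the p/2 facts above, ?- first_by(X, p(X,Y)). should enumerate
X = 1, Y = a and then X = 2, Y = c, matching the quoted output. An eager
version in the style of distinct/2 would instead record seen witnesses
while the goal is still running, so it would not need to materialize
the whole solution list first.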

-- 
Copyright 2025 Olcott

"Talent hits a target no one else can hit;
  Genius hits a target no one else can see."
  Arthur Schopenhauer