From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: Stack vs stackless operation
Date: Thu, 27 Feb 2025 22:03:55 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 72
Message-ID: <2025Feb27.230355@mips.complang.tuwien.ac.at>
References: <591e7bf58ebb1f90bd34fba20c730b83@www.novabbs.com> <34df278ef0a52d0eab9d035f45795389@www.novabbs.com> <a6c9d8a0aa7e1046af7948093e07cff0@www.novabbs.com> <2025Feb26.153250@mips.complang.tuwien.ac.at> <2025Feb26.184613@mips.complang.tuwien.ac.at> <875xkwo5io.fsf@nightsong.com> <2025Feb27.082944@mips.complang.tuwien.ac.at> <871pvjnnlc.fsf@nightsong.com>
Injection-Date: Thu, 27 Feb 2025 23:30:13 +0100 (CET)
X-newsreader: xrn 10.11
Bytes: 4328
Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> Results (on Zen4):
>> gforth-fast (development): ...
>
>It's interesting how little difference there is with gforth-fast. Could
>you also do gforth-itc?

gforth-itc (development):

           :=:        exchange              ex       ex-locals       exchange2
 7_527_256_553   5_224_615_325   6_825_283_178   9_238_357_501   7_036_128_309  cycles
13_127_503_990   9_326_561_471  12_927_054_153  16_927_820_825  12_027_146_677  instructions

For comparison: gforth-fast (development):

          :=:       exchange             ex      ex-locals      exchange2
  814_881_277    879_389_133    928_825_521    875_574_895    808_543_975  cycles
3_908_874_164  3_708_891_336  4_508_966_770  4_209_778_557  3_708_865_505  instructions

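
To put the two tables side by side, the slowdown of gforth-itc relative to gforth-fast, and the instructions per cycle of each engine, can be computed directly from the counts. The following is only a sketch: the word name ratio*100 is made up for this illustration, and it assumes 64-bit cells so that each count fits in a single cell.

```forth
\ Hypothetical helper, not from the thread: n1/n2 scaled by 100.
\ */ keeps a double-cell intermediate, so n1*100 cannot overflow.
\ Assumes 64-bit cells (the counts do not fit in 32 bits).
: ratio*100 ( n1 n2 -- n )  100 swap */ ;

\ Counts for :=: taken from the tables above
7527256553   814881277 ratio*100 .  \ itc cycles per 100 fast cycles
3908874164   814881277 ratio*100 .  \ gforth-fast: instructions per 100 cycles
13127503990 7527256553 ratio*100 .  \ gforth-itc:  instructions per 100 cycles
```

This prints 923 479 174: for :=: gforth-itc needs roughly 9.2 times the cycles of gforth-fast, and it retires about 1.7 instructions per cycle where gforth-fast retires about 4.8.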
>exchange2 is a big win with VFX, suggesting its
>optimizer could do better with some of the other versions.

On VFX, exchange2 runs at the same speed and uses the same number of
instructions as :=:. EX is slower because VFX does not analyse the
return stack the way it analyses the data stack. EX-LOCALS is slow
because VFX's locals implementation is not particularly good.
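
For readers without the earlier posts at hand, here is a rough sketch of the three implementation styles being compared: data stack only, return stack, and locals. The names ex-stack, ex-rstack and ex-loc are invented for this illustration, and these are not the definitions benchmarked in this thread. The only assumption is that each word exchanges the cells at two addresses, which is what the generated code shown for lxf does.

```forth
\ Hypothetical sketches, not the benchmarked definitions.
\ Each word exchanges the contents of the cells at a1 and a2.

\ Data stack only
: ex-stack ( a1 a2 -- )
  2dup @ swap @      \ a1 a2 v2 v1
  rot !              \ store v1 at a2
  swap ! ;           \ store v2 at a1

\ One value parked on the return stack
: ex-rstack ( a1 a2 -- )
  dup @ >r           \ save v2 on the return stack
  over @ swap !      \ store v1 at a2
  r> swap ! ;        \ store v2 at a1

\ Forth-2012 locals
: ex-loc {: a1 a2 -- :}
  a1 @ a2 @          \ v1 v2
  a1 ! a2 ! ;        \ v2 to a1, v1 to a2

variable a  variable b
1 a !  2 b !
a b ex-stack  a @ . b @ .   \ prints 2 1
```

A compiler that tracks the data stack, the return stack and locals equally well can reduce all three to the same loads and stores; one that only tracks the data stack will tend to leave real return-stack or locals-frame traffic in the second and third variants.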

To see what a better analysis can do, let's look at lxf:

          :=:             ex      ex-locals      exchange2
  502_740_029    502_189_567    502_134_842    502_043_217  cycles
1_701_663_782  1_701_657_866  1_701_677_273  1_701_684_186  instructions

The cycle and instruction counts are worse (except for ex-locals) than
with VFX, but that's due to inlining (which VFX does and lxf does not).

E.g., here's lxf's code for EX-LOCALS:

 869204C  804FCE2 23  88C8000     5 normal  EX-LOCALS

 804FCE2    8B4500    mov     eax , [ebp]
 804FCE5    8B00      mov     eax , [eax]
 804FCE7    8BCB      mov     ecx , ebx
 804FCE9    8B09      mov     ecx , [ecx]
 804FCEB    8B5500    mov     edx , [ebp]
 804FCEE    890A      mov     [edx] , ecx
 804FCF0    8903      mov     [ebx] , eax
 804FCF2    8B5D04    mov     ebx , [ebp+4h]
 804FCF5    8D6D08    lea     ebp , [ebp+8h]
 804FCF8    C3        ret     near

It's the same code as lxf produces for :=:.

The code lxf produces for EX and EXCHANGE2 is:

 804FCF9    8BC3      mov     eax , ebx
 804FCFB    8B00      mov     eax , [eax]
 804FCFD    8B4D00    mov     ecx , [ebp]
 804FD00    8B09      mov     ecx , [ecx]
 804FD02    890B      mov     [ebx] , ecx
 804FD04    8B5D00    mov     ebx , [ebp]
 804FD07    8903      mov     [ebx] , eax
 804FD09    8B5D04    mov     ebx , [ebp+4h]
 804FD0C    8D6D08    lea     ebp , [ebp+8h]
 804FD0F    C3        ret     near

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/