From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.lang.forth
Subject: Re: Stack vs stackless operation
Date: Thu, 27 Feb 2025 22:03:55 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Message-ID: <2025Feb27.230355@mips.complang.tuwien.ac.at>
References: <591e7bf58ebb1f90bd34fba20c730b83@www.novabbs.com>
 <34df278ef0a52d0eab9d035f45795389@www.novabbs.com>
 <a6c9d8a0aa7e1046af7948093e07cff0@www.novabbs.com>
 <2025Feb26.153250@mips.complang.tuwien.ac.at>
 <2025Feb26.184613@mips.complang.tuwien.ac.at>
 <875xkwo5io.fsf@nightsong.com>
 <2025Feb27.082944@mips.complang.tuwien.ac.at>
 <871pvjnnlc.fsf@nightsong.com>

Paul Rubin <no.email@nospam.invalid> writes:
>anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> Results (on Zen4):
>> gforth-fast (development): ...
>
>It's interesting how little difference there is with gforth-fast.  Could
>you also do gforth-itc?

gforth-itc (development):
           :=:       exchange             ex      ex-locals      exchange2
 7_527_256_553  5_224_615_325  6_825_283_178  9_238_357_501  7_036_128_309 c.
13_127_503_990  9_326_561_471 12_927_054_153 16_927_820_825 12_027_146_677 i.

For comparison:

gforth-fast (development):
           :=:       exchange             ex      ex-locals      exchange2
   814_881_277    879_389_133    928_825_521    875_574_895    808_543_975 cyc.
 3_908_874_164  3_708_891_336  4_508_966_770  4_209_778_557  3_708_865_505 inst.

>exchange2 is a big win with VFX, suggesting its
>optimizer could do better with some of the other versions.

On VFX exchange2 runs at the same speed and uses the same number of
instructions as :=:.  EX is slower because VFX does not analyse the
return stack, unlike the data stack.  EX-LOCALS is slow because VFX's
locals implementation is not particularly good.

To see what a better analysis can do, let's look at lxf:

           :=:             ex      ex-locals      exchange2
   502_740_029    502_189_567    502_134_842    502_043_217 cycles
 1_701_663_782  1_701_657_866  1_701_677_273  1_701_684_186 instructions

The cycles and instructions are worse (except for ex-locals) than with
VFX, but that's due to inlining (which VFX does and lxf does not).
E.g., here's lxf's code for EX-LOCALS:

 869204C 804FCE2 23 88C8000 5 normal EX-LOCALS

 804FCE2 8B4500    mov   eax , [ebp]
 804FCE5 8B00      mov   eax , [eax]
 804FCE7 8BCB      mov   ecx , ebx
 804FCE9 8B09      mov   ecx , [ecx]
 804FCEB 8B5500    mov   edx , [ebp]
 804FCEE 890A      mov   [edx] , ecx
 804FCF0 8903      mov   [ebx] , eax
 804FCF2 8B5D04    mov   ebx , [ebp+4h]
 804FCF5 8D6D08    lea   ebp , [ebp+8h]
 804FCF8 C3        ret   near

It's the same code as lxf produces for :=:.  The code lxf produces for
EX and EXCHANGE2 is:

 804FCF9 8BC3      mov   eax , ebx
 804FCFB 8B00      mov   eax , [eax]
 804FCFD 8B4D00    mov   ecx , [ebp]
 804FD00 8B09      mov   ecx , [ecx]
 804FD02 890B      mov   [ebx] , ecx
 804FD04 8B5D00    mov   ebx , [ebp]
 804FD07 8903      mov   [ebx] , eax
 804FD09 8B5D04    mov   ebx , [ebp+4h]
 804FD0C 8D6D08    lea   ebp , [ebp+8h]
 804FD0F C3        ret   near

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: https://forth-standard.org/
EuroForth 2023 proceedings: http://www.euroforth.org/ef23/papers/
EuroForth 2024 proceedings: http://www.euroforth.org/ef24/papers/