From: Stephen Pelc
Newsgroups: comp.lang.forth
Subject: Re: OOS approach revisited
Date: Sat, 28 Jun 2025 09:37:33 -0000 (UTC)
Message-ID: <103od4s$pis9$1@dont-email.me>

On 27 Jun 2025 at 22:35:32 CEST, "minforth" wrote:

> On 27.06.2025 at 20:15, albert@spenarnc.xs4all.nl wrote:
>> In article ,
>> LIT wrote:
>>>> It really depends on how counted loops are implemented.
>>>> Most CPUs have operators for register-based count-down loops
>>>> that are blazingly fast.
>>>>
>>>> If they can be used within Forth-based loop constructs
>>>> I would expect a greater speed increase than what you measured.
>>>
>>> In that old fig-Forth it's rather short and simple:
>>>
>>>         sqHeader '(LOOP)'
>>> XLOOP   dw $ + 2
>>>         mov BX,1
>>> XLOO1:  add [BP],BX
>>>         mov AX,[BP]
>>>         sub AX,[BP+2]
>>>         xor AX,BX
>>>         js BRAN1
>>>         add BP,4
>>>         inc SI
>>>         inc SI
>>>         jmp NEXT
>>>
>>> It doesn't look that bad. Can it be
>>> done even shorter?
>>
>> My optimiser looks into the combination of DO and LOOP,
>> transfers the returns stack into registers after inlining
>> everything. It is near vfx performance.
>> All experimental, but yes there is much to be gained.
>
> Must be tricky to do UNLOOP in a register-based loop. ;-)

Here are the code generators for VFX x64 LOOP and UNLOOP. All the
complexity is in the DO and ?DO code.

: c_loop        \ mrk> drbid -- ; compile code for LOOP ; SFP094
  c_shuffle  reset-opt                  \ SFP097
  a[  INC  r14  ]a  use-a               \ update index
  a[  INC  r15  ]a  use-a               \ update limit-index-$8000.0000
  a[  JNO  ]a  RES                      \ resolve forward branch
;

: c_unloop      \ -- ; compile code for UNLOOP
  c_shuffle  reset-opt
  a[
    pop  r14                            \ restore old index
    pop  r15                            \ restore old index-limit-xorbit63
    pop  rax                            \ discarded
  ]a  use-a
;

Stephen
-- 
Stephen Pelc, stephen@vfxforth.com
Wodni & Pelc GmbH
Vienna, Austria
Tel: +44 (0)7803 903612, +34 649 662 974
http://www.vfxforth.com/downloads/VfxCommunity/
free VFX Forth downloads
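
For readers unfamiliar with the INC/JNO idiom in c_loop, here is a minimal Python model of the arithmetic that the code comments suggest. It is a sketch under assumptions, not VFX's actual implementation: the DO code is not shown in this post, so the setup in do_setup (index in r14, index minus limit with bit 63 flipped in r15) is inferred from the c_unloop comment, and the names inc64, to_signed, do_setup and run_loop are made up for illustration.

```python
MASK = (1 << 64) - 1   # width of a 64-bit register
SIGN = 1 << 63         # bit 63, the sign bit


def inc64(x):
    """Model x86-64 INC on a 64-bit register: return (result, overflow flag)."""
    result = (x + 1) & MASK
    # INC sets OF only when the value steps from 0x7FFF...FFFF to 0x8000...0000.
    overflow = x == SIGN - 1
    return result, overflow


def to_signed(x):
    """Interpret a 64-bit register value as a signed integer."""
    return x - (1 << 64) if x & SIGN else x


def do_setup(limit, index):
    """ASSUMED DO setup (not shown in the post): r14 = index,
    r15 = (index - limit) with bit 63 flipped."""
    r14 = index & MASK
    r15 = ((index - limit) & MASK) ^ SIGN
    return r14, r15


def run_loop(limit, index):
    """Return the index values a 'limit index DO ... LOOP' would visit."""
    r14, r15 = do_setup(limit, index)
    visited = []
    while True:
        visited.append(to_signed(r14))   # loop body sees the current index
        r14, _ = inc64(r14)              # INC r14 : update index
        r15, overflow = inc64(r15)       # INC r15 : update biased counter
        if overflow:                     # JNO not taken: leave the loop
            break
    return visited


print(run_loop(5, 0))    # [0, 1, 2, 3, 4]
print(run_loop(2, -2))   # [-2, -1, 0, 1]
```

With this bias the second INC overflows exactly when the index reaches the limit, so the loop test costs a single conditional jump on the overflow flag and needs no separate compare.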